Hi,
I ran into an issue with TCP metric collection and I don’t know how to investigate it further.
I deployed a small sample application that has two backend services, each of which uses its own dedicated PostgreSQL database. The application works fine after deploying it with Istio and activating mTLS; however, I noticed that TCP metrics are not working for the database connections.
Both backend services utilize the PostgreSQL JDBC driver to connect to their respective database. The databases are deployed using the PostgreSQL helm chart.
To test whether TCP metrics are working at all, I deployed the bookinfo sample application using MongoDB. TCP metric collection for MongoDB traffic works fine.
Kiali shows the MongoDB traffic in the service graph, while my postgres services are shown as “unused nodes” (another indicator of missing TCP metrics for the postgres deployments).
How can I further investigate/debug this issue?
Best regards,
Dennis
It sounds like you have sidecars injected in front of your postgres instances, is that correct?
A few debugging pointers:
- Make sure the PostgreSQL deployments meet the current requirements
- Look at the istio-proxy logs on the postgres pods to see if there are any REPORT errors to Mixer.
- Look at the Envoy stats to verify traffic is flowing through the proxies. This may require expanding the default set of collected proxy stats (see the Envoy Statistics operations doc; an example annotation follows below).
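For the last pointer: if your Istio version supports the sidecar.istio.io/statsInclusionPrefixes annotation (described in that doc), a pod-template annotation along these lines expands what the sidecar records. Everything below is a placeholder sketch rather than config from your cluster, and the prefix list is only illustrative:

# Sketch: ask the injected sidecar to record additional Envoy stats
# (e.g. outbound cluster and TCP stats) so traffic through the proxy
# becomes visible. Names, labels, and image are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend-postgres-client   # placeholder name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: backend-postgres-client
  template:
    metadata:
      labels:
        app: backend-postgres-client
      annotations:
        # Comma-separated stat prefixes; adjust to what you want to inspect.
        sidecar.istio.io/statsInclusionPrefixes: "cluster.outbound,cluster_manager,listener_manager,tcp"
    spec:
      containers:
      - name: app
        image: example/backend:latest   # placeholder image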
Hope that helps get the investigation started,
Doug.
Yes, I have sidecars injected in front of my postgres instances.
Thank you very much for your helpful pointers!
Indeed, the istio-proxy logs show several REPORT errors to Mixer. They also indicate that the report errors are likely caused by a wrong mTLS configuration.
After I deactivated mTLS, TCP metrics worked fine.
I still don’t know how to fix my mTLS configuration though.
I use a mesh policy with TLS mode STRICT. I use one global destination rule named default in namespace istio-system to set TLS communication to ISTIO_MUTUAL. I use two specific destination rules to set the TLS mode for Kiali and Grafana to DISABLE, as they don’t have sidecars. And I use one destination rule in namespace istio-system to disable TLS for the Kubernetes API server.
Did I miss anything?
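For illustration, the mesh-wide part of that configuration looks roughly like this (a simplified sketch rather than the exact YAML from my cluster):

# Mesh-wide mTLS in STRICT mode.
apiVersion: authentication.istio.io/v1alpha1
kind: MeshPolicy
metadata:
  name: default
spec:
  peers:
  - mtls:
      mode: STRICT
---
# Global destination rule: sidecars use Istio mutual TLS for all hosts.
# The host pattern here is a typical choice for a mesh-wide default rule.
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: default
  namespace: istio-system
spec:
  host: "*.local"
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL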
Update: It works!
The not-working mTLS configuration for my application consisted of:
- A MeshPolicy to enable mTLS STRICT mode
- A destination rule named default in namespace istio-system to enable client-side TLS communication
- A destination rule to disable client-side TLS communication to the Kiali service (Kiali does not have a sidecar)
- A destination rule to disable client-side TLS communication to the Grafana service (Grafana does not have a sidecar; a sketch of such a rule follows below)
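These disable rules look roughly like this (again a simplified sketch; the host and namespace are the usual defaults rather than copied from my cluster, and the Grafana rule is analogous):

# Disable client-side TLS towards Kiali, which has no sidecar.
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: kiali
  namespace: istio-system
spec:
  host: kiali.istio-system.svc.cluster.local
  trafficPolicy:
    tls:
      mode: DISABLE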
However, it turns out that when installing Istio using the istio-demo profile, the installer deploys two destination rules for the Mixer services (istio-policy and istio-telemetry). These destination rules do not contain the port-level configuration for client-side TLS communication to Mixer, so the communication between the sidecar proxies in the data plane and Mixer fails. After adding the TLS configuration as described in the istio-demo-auth profile, everything works as expected (see below).
# Configuration needed by Mixer.
# Mixer cluster is delivered via CDS
# Specify mixer cluster settings
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: istio-policy
  namespace: istio-system
  labels:
    app: mixer
    chart: mixer
    heritage: Tiller
    release: istio
spec:
  host: istio-policy.istio-system.svc.cluster.local
  trafficPolicy:
    portLevelSettings:
    - port:
        number: 15004
      tls:
        mode: ISTIO_MUTUAL
    connectionPool:
      http:
        http2MaxRequests: 10000
        maxRequestsPerConnection: 10000
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: istio-telemetry
  namespace: istio-system
  labels:
    app: mixer
    chart: mixer
    heritage: Tiller
    release: istio
spec:
  host: istio-telemetry.istio-system.svc.cluster.local
  trafficPolicy:
    portLevelSettings:
    - port:
        number: 15004
      tls:
        mode: ISTIO_MUTUAL
    connectionPool:
      http:
        http2MaxRequests: 10000
        maxRequestsPerConnection: 10000
---
Great to hear you were able to solve the issue! We’d love to hear ways to make that situation easier to debug.