TCP metric collection for PostgreSQL does not work

Hi,

I ran into an issue with TCP metric collection and I don’t know how to investigate it further.

I deployed a small sample application that has two backend services. Each backend service uses its own dedicated PostgreSQL database. The application works fine after deploying it with Istio and activating mTLS; however, I noticed that TCP metrics are not working for the database connections.

Both backend services utilize the PostgreSQL JDBC driver to connect to their respective database. The databases are deployed using the PostgreSQL helm chart.

To test whether or not TCP metrics are working at all, I deployed the bookinfo sample application using MongoDB. TCP metric collection for MongoDB traffic is working fine.

Kiali shows MongoDB traffic in the service graph while my postgres services are shown as “unused nodes” (another indicator for missing TCP metrics for the postgres deployments).

How can I further investigate/debug this issue?

Best regards,
Dennis

It sounds like you have sidecars injected in front of your postgres instances, is that correct?

A few debugging pointers:

  • Make sure the PostgreSQL deployments meet the current Istio requirements
  • Look at the istio-proxy logs on the postgres pods to see if there are any REPORT errors to Mixer.
  • Look at the envoy stats to verify traffic is flowing through the proxies. This may require expanding the set of default collected proxy stats (see: Envoy Statistics Operations doc)
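
For the last pointer, one way to expand the collected proxy stats is a pod annotation on the PostgreSQL workload. This is only a sketch: it assumes an Istio release that supports the sidecar.istio.io/statsInclusionPrefixes annotation, and the prefix list is an illustrative guess at what is useful for TCP traffic, not a recommended default.

```yaml
# Pod template fragment for the PostgreSQL workload; merge it into the existing
# Deployment or StatefulSet spec rather than applying it on its own.
spec:
  template:
    metadata:
      annotations:
        # Assumption: this Istio version honors the annotation.
        # Record Envoy stats whose names start with these prefixes
        # (cluster, listener and TCP proxy connection counters).
        sidecar.istio.io/statsInclusionPrefixes: "cluster.inbound,cluster.outbound,listener,tcp"
```

The pods need to be recreated for the sidecar to pick up the change. If the connection counters stay at zero while the application is talking to the database, the traffic is probably not going through the proxy.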

Hope that helps get the investigation started,
Doug.

Yes, I have sidecars injected in front of my postgres instances.

Thank you very much for your helpful pointers!
Indeed, the istio-proxy logs show several REPORT errors to Mixer. They also indicate that the report errors are likely caused by a wrong mTLS configuration.

After I deactivated mTLS, TCP metrics worked fine.
I still don’t know how to fix my mTLS configuration though.

I use a mesh policy with TLS mode STRICT. I use one global destination rule named default in namespace istio-system to set client-side TLS to ISTIO_MUTUAL. Two specific destination rules set the TLS mode for Kiali and Grafana to DISABLE, as they don’t have a sidecar. One more destination rule in namespace istio-system disables TLS for the k8s API server.

Did I miss anything?

Update: It works!

The non-working mTLS configuration for my application consisted of:

  • A MeshPolicy to enable mTLS STRICT mode
  • A destination rule with name default in namespace istio-system to enable client-side TLS communication
  • A destination rule to disable client-side TLS communication to Kiali service (Kiali does not have a sidecar)
  • A destination rule to disable client-side TLS communication to Grafana service (Grafana does not have a sidecar)
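
For readers who want to compare against their own setup, the items in this list correspond roughly to the resources sketched here. This is a reconstruction from the description, not the exact manifests: the rule name kiali-disable-mtls and the Kiali host name are assumptions, and the Grafana rule would look the same with its own host.

```yaml
# Mesh-wide policy: sidecars only accept mTLS traffic (STRICT).
apiVersion: authentication.istio.io/v1alpha1
kind: MeshPolicy
metadata:
  name: default
spec:
  peers:
    - mtls:
        mode: STRICT
---
# Global client-side rule: sidecars originate mTLS to every *.local host.
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: default
  namespace: istio-system
spec:
  host: "*.local"
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL
---
# Exception for a service without a sidecar (name and host are assumed).
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: kiali-disable-mtls
  namespace: istio-system
spec:
  host: kiali.istio-system.svc.cluster.local
  trafficPolicy:
    tls:
      mode: DISABLE
```

A destination rule with a more specific host takes precedence over the global default rule for that host, and its traffic policy is not merged with the default one. A host-specific rule that omits the tls block therefore sends plain text to a server that expects mTLS.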

However, it turns out that when Istio is installed using the istio-demo profile, the installer deploys two destination rules for Mixer, one for the istio-policy service and one for the istio-telemetry service. These destination rules do not contain the configuration for client-side TLS communication to Mixer, and because they are more specific than the global default rule, they apply instead of it. Thus, the communication between the sidecar proxies in the data plane and Mixer fails. After adding the TLS configuration as described in the istio-demo-auth profile, everything works as expected (see below).

# Configuration needed by Mixer.
# Mixer cluster is delivered via CDS
# Specify mixer cluster settings
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: istio-policy
  namespace: istio-system
  labels:
    app: mixer
    chart: mixer
    heritage: Tiller
    release: istio
spec:
  host: istio-policy.istio-system.svc.cluster.local
  trafficPolicy:
    portLevelSettings:
      - port:
          number: 15004
        tls:
          mode: ISTIO_MUTUAL
    connectionPool:
      http:
        http2MaxRequests: 10000
        maxRequestsPerConnection: 10000
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: istio-telemetry
  namespace: istio-system
  labels:
    app: mixer
    chart: mixer
    heritage: Tiller
    release: istio
spec:
  host: istio-telemetry.istio-system.svc.cluster.local
  trafficPolicy:
    portLevelSettings:
      - port:
          number: 15004
        tls:
          mode: ISTIO_MUTUAL
    connectionPool:
      http:
        http2MaxRequests: 10000
        maxRequestsPerConnection: 10000
---

Great to hear you were able to solve the issue! We’d love to hear ways to make that situation easier to debug.