Istio Ingress Gateway reports healthy despite "certificate signed by unknown authority" error

qurname2 · January 7, 2022, 1:49pm

Hello!

We’re using custom certificates in Istio, issued by our own CA.
During the last certificate rotation, one of our istio-ingress-gateway pods wasn’t restarted due to human error (a restart is required for the gateway to work correctly). This led to the following error message in the log:
{"level":"error","time":"2021-12-13T13:24:04.890986Z","scope":"xdsproxy","msg":"failed to create upstream grpc client: rpc error: code = Unavailable desc = connection error: desc = \"transport: authentication handshake failed: x509: certificate signed by unknown authority\""}

That’s fine and I know how to fix it: just restart the pod. But I noticed one interesting thing: the readiness probe of this pod still reports healthy.
To my mind, that’s not correct behaviour, because in fact the pod can’t serve customer traffic.

  readinessProbe:
    failureThreshold: 30
    httpGet:
      path: /healthz/ready
      port: 15021
      scheme: HTTP
    initialDelaySeconds: 10
    periodSeconds: 2
    successThreshold: 1
    timeoutSeconds: 1

As a workaround, I wanted to create an alert for this behaviour of the istio-ingress-gateway pod, but I couldn’t find any Istio metric that indicates the pod is unhealthy due to the “certificate signed by unknown authority” error.
If I missed something, could somebody point me to the right metric?

Version

istioctl version
client version: 1.8.2
control plane version: 1.8.2
data plane version: 1.8.2 (13 proxies)

kubectl version --short
Client Version: v1.16.1
Server Version: v1.16.9

jtrbs · January 7, 2022, 11:40pm

Currently there are no Istio metrics for that.
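
With no metric available, one possible stopgap is to alert on the log line itself. Below is a minimal sketch, not an Istio feature: it assumes the proxy logs in the JSON format quoted in the question, and the names is_cert_error, count_cert_errors and SAMPLE_LINE are made up for illustration.

```python
import json

# Substring copied from the error message quoted in the question.
X509_MARKER = "x509: certificate signed by unknown authority"

# The log line from the question, kept verbatim as a sample input.
SAMPLE_LINE = (
    r'{"level":"error","time":"2021-12-13T13:24:04.890986Z","scope":"xdsproxy",'
    r'"msg":"failed to create upstream grpc client: rpc error: code = Unavailable '
    r'desc = connection error: desc = \"transport: authentication handshake failed: '
    r'x509: certificate signed by unknown authority\""}'
)


def is_cert_error(line: str) -> bool:
    """Return True if a JSON log line is an xdsproxy error caused by an untrusted CA."""
    try:
        entry = json.loads(line)
    except json.JSONDecodeError:
        return False  # ignore lines that are not JSON
    if not isinstance(entry, dict):
        return False  # ignore JSON values that are not log objects
    return (
        entry.get("level") == "error"
        and entry.get("scope") == "xdsproxy"
        and X509_MARKER in str(entry.get("msg", ""))
    )


def count_cert_errors(lines) -> int:
    """Count matching error lines in any iterable of log lines."""
    return sum(1 for line in lines if is_cert_error(line))


# Example: one matching line among unrelated output.
hits = count_cert_errors([SAMPLE_LINE, "some plain-text line"])
print(f"xdsproxy x509 errors: {hits}")
```

In practice the lines would come from the gateway pod's logs (for example the output of kubectl logs for that pod) rather than a hard-coded sample, and a non-zero count could drive whatever alerting is already in place.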