503 errors even with ISTIO_MUTUAL DestinationRule

Hi all,

I have a very simple nginx deployment with Istio mTLS enabled, but for some reason client pods are unable to reach the service; the sidecar returns 503 (upstream connect error or disconnect/reset before headers).

When I manually access the service from the istio-proxy container and pass the client certificates, it works. For example:

Direct connection from istio-proxy:

$ kubectl exec -it nginx-deployment-6f54996ccf-prvls -n services -c istio-proxy bash
curl https://nginx:80 --key /etc/certs/key.pem --cert /etc/certs/cert-chain.pem --cacert /etc/certs/root-cert.pem -k -s -w '%{http_code}\n' -o /dev/null

200

When connecting from the app container (I’m using the same nginx pod for simplicity):

Connection from app container:

wget -S -q -O- http://nginx 2>&1 
  HTTP/1.1 503 Service Unavailable
wget: server returned error: HTTP/1.1 503 Service Unavailable
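In case it helps with debugging, the client sidecar's Envoy access log usually includes a response flag (e.g. UF for upstream connection failure, UC for upstream connection termination) that narrows down why the 503 happened. This assumes access logging is enabled (with the Helm chart, global.proxy.accessLogFile=/dev/stdout); the pod name is just the one from my example:

$ kubectl logs nginx-deployment-6f54996ccf-prvls -n services -c istio-proxy | grep ' 503 '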

Funnily enough, I was not even able to install curl in the app container, because all external outgoing HTTP requests return 404:

/ # wget -S -q -O- www.google.com 2>&1 
  HTTP/1.1 404 Not Found
wget: server returned error: HTTP/1.1 404 Not Found
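I suspect this 404 for external hosts is a side effect of the mesh's outbound traffic policy rather than something related to the 503, but that is only a guess. The setting can be checked in the mesh config (assuming the default Helm install, where the mesh config lives in the istio configmap in istio-system):

$ kubectl -n istio-system get configmap istio -o yaml | grep -A 2 outboundTrafficPolicy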

Environment Information
I have tried both Istio 1.1.2 and 1.1.7 (both installed via the Helm chart). I am running on a GKE cluster.

Below are my manifests for the deployment:

Deployment Manifests

apiVersion: apps/v1 
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1 
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.13.12-alpine
        ports:
        - containerPort: 80
          name: http
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  ports:
  - name: http
    port: 80
    targetPort: 80
  selector:
    app: nginx
---
apiVersion: "networking.istio.io/v1alpha3"
kind: "DestinationRule"
metadata:
  name: "nginx"
  labels:
    app.kubernetes.io/name: nginx
spec:
  host: "nginx"
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL

I also have the following global mesh policy:

apiVersion: authentication.istio.io/v1alpha1
kind: MeshPolicy
metadata:
  name: default
spec:
  peers:
  - mtls: {}

Notes
This seems very strange to me, because I can initiate a TLS connection directly from the sidecar container by passing the certificates, so there is no obvious reason why the sidecar should fail to connect to the other end's sidecar. I am not doing anything special when installing the Istio Helm chart, but there are other charts on the cluster that I do not manage, and they might be doing something bad. I'll be happy to grab any information or logs you need.

Lastly, when I deploy the manifests above to a namespace that does not have Istio injection enabled, it works without issues and I’m able to reach the service.
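For reference, this is roughly how I have been inspecting what the client sidecar thinks about the nginx service (the pod name comes from my deployment, and I'm assuming istioctl 1.1 here). Dumping the outbound cluster as JSON should show a tlsContext section when the DestinationRule's ISTIO_MUTUAL setting has actually been picked up:

$ istioctl proxy-config cluster nginx-deployment-6f54996ccf-prvls.services --fqdn nginx.services.svc.cluster.local -o json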

Thanks!

Did you ever figure out where the error was? I also have a 503, and every config looks ok, just like in your example…

Yes, in our case we had a CronJob that was responsible for syncing secrets from the default namespace to other namespaces (we needed that to slowly implement some sort of transition mechanism from non-Istio namespaces to Istio ones). Unfortunately, the sync logic did not exclude non-application secrets, such as ServiceAccount tokens and Istio certificates. So it was not really an issue with Istio.
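In case someone runs into the same thing: one quick way to spot a clobbered certificate is to look at the SPIFFE identity inside the cert chain that Citadel issued for the namespace. The secret name below is the Citadel default (istio.<serviceaccount>) and the services namespace comes from the original post; the Subject Alternative Name should reference the namespace and service account the secret lives in, so if it names another namespace, the secret has been overwritten:

$ kubectl get secret istio.default -n services -o jsonpath='{.data.cert-chain\.pem}' | base64 -d | openssl x509 -noout -text | grep -A 1 'Subject Alternative Name'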

I highly recommend you use istioctl authn tls-check to check for any conflicting mesh policies / destination rules.
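For example, something like this (pod and service names taken from the post above; the exact arguments and output columns may vary a bit between Istio versions) should report whether the authentication policy and the destination rule agree, with a CONFLICT status pointing at a mismatch:

$ istioctl authn tls-check nginx-deployment-6f54996ccf-prvls.services nginx.services.svc.cluster.local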

I use cert-manager in another namespace… maybe there is something similar going on. I'll check when I get home… thanks for the heads-up! :slight_smile:

Yeah, that sounds possible. I'm sure you'll get it! :slight_smile: