Issues with Mesh Expansion with mTLS enabled


I’m trying to set up mesh expansion with a Kubernetes cluster running Istio 1.1 in AWS. I got it working with the sample bookinfo app (as per the 1.1 docs), with the details service removed from Kubernetes and deployed to the mesh-expanded VM instead.

After enabling mTLS for the namespace in which the bookinfo app is running, the productpage can still reach the reviews service (running on Kubernetes) but not the details service (running on the mesh-expanded VM). I looked through all the logs and found that the details service is returning a 503. On deeper investigation I found the following:

  1. There are no listeners on the sidecar running on the VM
  2. The details service is logging encrypted access logs instead of the standard access logs
  3. istioctl proxy-status does not list the mesh-expanded proxy, which means Pilot never relayed any config to the sidecar; that is likely the root cause of the 503
  4. There are no errors in the VM logs other than a WARN: "gRPC config stream closed: 14, upstream connect error or disconnect/reset before headers"

Can anyone help me figure out this issue?



Update: I figured this out.

Turns out that mesh expansion was never working in the first place: the requests were being sent directly to the service running on the VM instead of to the Envoy proxy, and the Envoy sidecar was never listening on the configured ENVOY_PORT. After more debugging and searching, I found that the problem was in the Envoy configuration. It was missing "http2_protocol_options": {}, in envoy_bootstrap_tmpl.json (from this GitHub issue).
After I added the option, I could see Envoy listening on ENVOY_PORT, and istioctl proxy-status now lists the mesh-expanded service.
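
For anyone hitting the same thing, here is a minimal Python sketch of a check for the missing option. It assumes the bootstrap has already been rendered to plain JSON (the template itself contains placeholders) and that clusters live under static_resources.clusters, as in Envoy's bootstrap layout. The function name and the sample config are made up for illustration and are not taken from the actual template.

```python
import json


def clusters_missing_http2(bootstrap: dict) -> list:
    """Return the names of static clusters that lack http2_protocol_options."""
    clusters = bootstrap.get("static_resources", {}).get("clusters", [])
    return [
        c.get("name", "<unnamed>")
        for c in clusters
        if "http2_protocol_options" not in c
    ]


# Hypothetical, heavily trimmed bootstrap; cluster names are illustrative only.
SAMPLE = json.loads("""
{
  "static_resources": {
    "clusters": [
      {"name": "xds-grpc", "type": "STRICT_DNS"},
      {"name": "other", "http2_protocol_options": {}}
    ]
  }
}
""")

if __name__ == "__main__":
    print(clusters_missing_http2(SAMPLE))
```

Only the cluster that carries the gRPC config stream to Pilot needs HTTP/2, so a name showing up in the result is a hint to look closer, not necessarily a bug.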

I also had to set ISTIO_SVC_IP to the IP address of the VM, instead of relying on the default (hostname --ip-address), for the service to be discovered within the mesh network.
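
I did not dig into why the default was wrong. If you want to sanity-check what your VM would pick up, something like the Python sketch below may help. It is only a rough stand-in for hostname --ip-address, and the idea that the default fails because it resolves to a loopback or similar address is an assumption, not something confirmed here. Both function names are hypothetical.

```python
import ipaddress
import socket


def is_usable_service_ip(addr: str) -> bool:
    """True if addr parses as an IP that is not loopback, unspecified or link-local."""
    try:
        ip = ipaddress.ip_address(addr)
    except ValueError:
        return False
    return not (ip.is_loopback or ip.is_unspecified or ip.is_link_local)


def default_host_ip() -> str:
    """Rough stand-in for `hostname --ip-address` (returns a single address)."""
    return socket.gethostbyname(socket.gethostname())


if __name__ == "__main__":
    try:
        ip = default_host_ip()
    except OSError as exc:  # hostname did not resolve
        print("could not resolve hostname:", exc)
    else:
        verdict = "looks usable" if is_usable_service_ip(ip) else "set ISTIO_SVC_IP explicitly"
        print(ip, verdict)
```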

Hope this helps others trying to get mesh expansion working with mTLS turned on.



Yes, if you don’t get any clusters/listeners back from curl localhost:15000/clusters, that signifies a failure to connect to the control plane.
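
If you want to script that check, a minimal Python sketch could look like the following. It assumes the admin endpoint is reachable at localhost:15000 and that each line of the /clusters output starts with the cluster name followed by "::". The helper names are made up for illustration.

```python
import urllib.request


def cluster_names(clusters_text: str) -> set:
    """Extract distinct cluster names from the text returned by /clusters.

    Assumes every relevant line has the form '<cluster name>::...'.
    """
    names = set()
    for line in clusters_text.splitlines():
        name, sep, _rest = line.partition("::")
        if sep and name.strip():
            names.add(name.strip())
    return names


def fetch_clusters(admin: str = "http://localhost:15000") -> str:
    """Fetch the raw /clusters output from the local Envoy admin endpoint."""
    with urllib.request.urlopen(admin + "/clusters", timeout=5) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Running cluster_names(fetch_clusters()) on the VM gives the set of clusters the sidecar knows about. If it is empty, or only contains clusters defined in the bootstrap itself, the sidecar most likely never received config from the control plane.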

But I’m surprised that you needed to change the bootstrap. Do our instructions (although focused on GCP) not work for you out of the box? Many folks have tried them and didn’t need to tweak any Envoy bootstrap config.