I am hitting a similar issue migrating from a helm-templated/kubectl-applied Istio 1.5.4 install to istioctl in 1.6.x. Whether or not I use istioctl manifest migrate, installing a new revision of Istio results in all my EC2 instances being taken out of service at the load balancer, and my services become unreachable as a result.
The same install using istioctl with the same operator config works perfectly in a fresh cluster, but the upgrade doesn't work at all.
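For context, the migration path I tried looks roughly like this (file names are placeholders and flags are from memory, so treat it as a sketch rather than my exact invocation):

# convert the old Helm values into an IstioOperator config
istioctl manifest migrate values-istio-1.5.4.yaml > istio-operator.yaml
# install the new revision from that config
istioctl install -f istio-operator.yaml --set revision=1-6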
I believe this is related to the status port change (made at the same time as breaking a major installation option, cool choice): even though my Service object and load balancer are updated properly, no instances pass health checks after the new revision is installed.
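(The 1.5 gateway served health on 15020, while 1.6 moved the status port to 15021. To double-check what the new deployment actually probes, something like this works:)

# inspect the readiness probe on the new gateway deployment
kubectl -n istio-system get deploy istio-ingressgateway \
  -o jsonpath='{.spec.template.spec.containers[0].readinessProbe.httpGet}'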
Here's what I get when I try to curl the service from within the cluster after deploying 1.6.x:
curl istio-ingressgateway.istio-system.svc.cluster.local:15021/healthz/readyz -vL
* Trying 10.100.182.102...
* TCP_NODELAY set
* Connected to istio-ingressgateway.istio-system.svc.cluster.local (10.100.182.102) port 15021 (#0)
> GET /healthz/readyz HTTP/1.1
> Host: istio-ingressgateway.istio-system.svc.cluster.local:15021
> User-Agent: curl/7.61.1
> Accept: */*
< HTTP/1.1 503 Service Unavailable
< content-length: 19
< content-type: text/plain
< date: Wed, 08 Jul 2020 00:09:44 GMT
< server: envoy
* Connection #0 to host istio-ingressgateway.istio-system.svc.cluster.local left intact
no healthy upstream
If I redeploy my helm-templated 1.5.4 manifest, everything is restored and the service responds with a 200 on both 15020 and 15021.
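In case anyone wants to reproduce the comparison, this is the kind of check I'm running against both ports (same path as my curl above; I'm assuming the Service still exposes both ports as it did in 1.5):

# both return 200 under the 1.5.4 install
curl -s -o /dev/null -w '%{http_code}\n' istio-ingressgateway.istio-system.svc.cluster.local:15020/healthz/readyz
curl -s -o /dev/null -w '%{http_code}\n' istio-ingressgateway.istio-system.svc.cluster.local:15021/healthz/readyz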
It seems this may be part of the issue:
kubectl get ep -n istio-system
istio-system istio-ingressgateway <none> 136m
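Empty endpoints usually mean the Service selector isn't matching any Ready pods, so these are the checks I'd run next (assuming the default app=istio-ingressgateway label from the standard install):

# what the Service selects
kubectl -n istio-system get svc istio-ingressgateway -o jsonpath='{.spec.selector}'
# what labels the gateway pods actually carry, and whether they're Ready
kubectl -n istio-system get pods -l app=istio-ingressgateway --show-labels
kubectl -n istio-system get pods -l app=istio-ingressgateway \
  -o custom-columns=NAME:.metadata.name,READY:.status.containerStatuses[0].ready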
I'm pretty perplexed as to what's happening here.