The primary problem I'm having right now is that every time the operator pod restarts, it reconciles and 'ensures' the services (whatever that actually does), and in the process the nodePorts change. That means my NLB target groups get recreated and ingress goes down for about 3 minutes.
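The only mitigation I can see is pinning the nodePorts explicitly through the gateway's Service override in the IstioOperator spec, so a reconcile has nothing left to reassign. A minimal sketch of what I mean, assuming the Istio 1.6 default targetPorts (8080/8443); the nodePort values below are just examples:

```yaml
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  components:
    ingressGateways:
      - name: istio-ingressgateway
        enabled: true
        k8s:
          service:
            ports:
              # With nodePort set, the API server keeps the same port
              # even if the Service spec is re-applied.
              - name: status-port
                port: 15021
                targetPort: 15021
                nodePort: 31205   # example value
              - name: http2
                port: 80
                targetPort: 8080
                nodePort: 32465   # example value
              - name: https
                port: 443
                targetPort: 8443
                nodePort: 32114   # example value
              - name: tls
                port: 15443
                targetPort: 15443
                nodePort: 32428   # example value
```

(The same override would apply to istio-clientgateway.) But I'd rather understand why the reassignment happens at all than hard-code ports.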
I have two ingress gateways configured: istio-ingressgateway (the default) and an extra istio-clientgateway for a WebSocket workload. After the operator reconciles, you can see this in the istio-system event log:
30m Normal SuccessfulCreate replicaset/istio-clientgateway-65fb884b65 Created pod: istio-clientgateway-65fb884b65-vkw29
30m Normal ScalingReplicaSet deployment/istio-clientgateway Scaled up replica set istio-clientgateway-65fb884b65 to 1
30m Normal Scheduled pod/istio-clientgateway-65fb884b65-vkw29 Successfully assigned istio-system/istio-clientgateway-65fb884b65-vkw29 to ip-10-0-21-209.ap-southeast-2.compute.internal
30m Normal Started pod/istio-clientgateway-65fb884b65-vkw29 Started container istio-proxy
30m Normal Pulled pod/istio-clientgateway-65fb884b65-vkw29 Container image "docker.io/istio/proxyv2:1.6.8" already present on machine
30m Normal Created pod/istio-clientgateway-65fb884b65-vkw29 Created container istio-proxy
30m Normal Killing pod/istio-clientgateway-7d4954c5b5-s5vh6 Stopping container istio-proxy
30m Normal ScalingReplicaSet deployment/istio-clientgateway Scaled down replica set istio-clientgateway-7d4954c5b5 to 0
30m Normal SuccessfulDelete replicaset/istio-clientgateway-7d4954c5b5 Deleted pod: istio-clientgateway-7d4954c5b5-s5vh6
30m Normal EnsuringLoadBalancer service/istio-ingressgateway Ensuring load balancer
30m Normal EnsuredLoadBalancer service/istio-ingressgateway Ensured load balancer
30m Normal SuccessfulRescale horizontalpodautoscaler/istio-clientgateway New size: 2; reason: Current number of replicas below Spec.MinReplicas
30m Normal Scheduled pod/istio-clientgateway-65fb884b65-tnxp2 Successfully assigned istio-system/istio-clientgateway-65fb884b65-tnxp2 to ip-10-0-3-79.ap-southeast-2.compute.internal
30m Normal SuccessfulCreate replicaset/istio-clientgateway-65fb884b65 Created pod: istio-clientgateway-65fb884b65-tnxp2
30m Normal Pulled pod/istio-clientgateway-65fb884b65-tnxp2 Container image "docker.io/istio/proxyv2:1.6.8" already present on machine
30m Normal Created pod/istio-clientgateway-65fb884b65-tnxp2 Created container istio-proxy
30m Normal ScalingReplicaSet deployment/istio-clientgateway Scaled up replica set istio-clientgateway-65fb884b65 to 2
30m Normal Started pod/istio-clientgateway-65fb884b65-tnxp2 Started container istio-proxy
30m Warning Unhealthy pod/istio-clientgateway-65fb884b65-tnxp2 Readiness probe failed: Get http://10.0.14.3:15021/healthz/ready: dial tcp 10.0.14.3:15021: connect: connection refused
29m Normal SuccessfulCreate replicaset/istio-clientgateway-7d4954c5b5 Created pod: istio-clientgateway-7d4954c5b5-vrvwb
29m Normal ScalingReplicaSet deployment/istio-clientgateway Scaled up replica set istio-clientgateway-7d4954c5b5 to 1
29m Normal EnsuringLoadBalancer service/istio-clientgateway Ensuring load balancer
29m Normal Scheduled pod/istio-clientgateway-7d4954c5b5-vrvwb Successfully assigned istio-system/istio-clientgateway-7d4954c5b5-vrvwb to ip-10-0-21-209.ap-southeast-2.compute.internal
29m Normal Created pod/istio-clientgateway-7d4954c5b5-vrvwb Created container istio-proxy
29m Normal Pulled pod/istio-clientgateway-7d4954c5b5-vrvwb Container image "docker.io/istio/proxyv2:1.6.8" already present on machine
29m Normal Started pod/istio-clientgateway-7d4954c5b5-vrvwb Started container istio-proxy
29m Normal EnsuredLoadBalancer service/istio-clientgateway Ensured load balancer
29m Normal SuccessfulDelete replicaset/istio-clientgateway-65fb884b65 Deleted pod: istio-clientgateway-65fb884b65-vkw29
29m Normal SuccessfulCreate replicaset/istio-clientgateway-7d4954c5b5 Created pod: istio-clientgateway-7d4954c5b5-x7f8j
29m Normal Scheduled pod/istio-clientgateway-7d4954c5b5-x7f8j Successfully assigned istio-system/istio-clientgateway-7d4954c5b5-x7f8j to ip-10-0-3-79.ap-southeast-2.compute.internal
29m Normal ScalingReplicaSet deployment/istio-clientgateway Scaled down replica set istio-clientgateway-65fb884b65 to 1
29m Normal ScalingReplicaSet deployment/istio-clientgateway Scaled up replica set istio-clientgateway-7d4954c5b5 to 2
29m Normal Killing pod/istio-clientgateway-65fb884b65-vkw29 Stopping container istio-proxy
29m Normal Created pod/istio-clientgateway-7d4954c5b5-x7f8j Created container istio-proxy
29m Normal Pulled pod/istio-clientgateway-7d4954c5b5-x7f8j Container image "docker.io/istio/proxyv2:1.6.8" already present on machine
29m Normal Started pod/istio-clientgateway-7d4954c5b5-x7f8j Started container istio-proxy
29m Normal ScalingReplicaSet deployment/istio-clientgateway Scaled down replica set istio-clientgateway-65fb884b65 to 0
29m Normal SuccessfulDelete replicaset/istio-clientgateway-65fb884b65 Deleted pod: istio-clientgateway-65fb884b65-tnxp2
29m Normal Killing pod/istio-clientgateway-65fb884b65-tnxp2 Stopping container istio-proxy
28m Warning FailedComputeMetricsReplicas horizontalpodautoscaler/istio-clientgateway invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from resource metrics API
28m Warning FailedGetResourceMetric horizontalpodautoscaler/istio-clientgateway unable to get metrics for resource cpu: no metrics returned from resource metrics API
28m Normal SuccessfulRescale horizontalpodautoscaler/istio-clientgateway New size: 1; reason: All metrics below target
28m Normal ScalingReplicaSet deployment/istio-clientgateway Scaled down replica set istio-clientgateway-7d4954c5b5 to 1
28m Normal SuccessfulDelete replicaset/istio-clientgateway-7d4954c5b5 Deleted pod: istio-clientgateway-7d4954c5b5-x7f8j
28m Normal Killing pod/istio-clientgateway-7d4954c5b5-x7f8j Stopping container istio-proxy
You can see a lot happening there, but the main issue is the nodePorts changing, which shows up in the kube-proxy logs:
kube-proxy-sq767 kube-proxy 2020-08-25T04:38:11.048177621Z I0825 04:38:11.047897 1 service.go:381] Updating existing service port "istio-system/istio-ingressgateway:status-port" at 172.20.52.238:15021/TCP
kube-proxy-sq767 kube-proxy 2020-08-25T04:38:11.049034820Z I0825 04:38:11.047916 1 service.go:381] Updating existing service port "istio-system/istio-ingressgateway:http2" at 172.20.52.238:80/TCP
kube-proxy-sq767 kube-proxy 2020-08-25T04:38:11.049040079Z I0825 04:38:11.047927 1 service.go:381] Updating existing service port "istio-system/istio-ingressgateway:https" at 172.20.52.238:443/TCP
kube-proxy-sq767 kube-proxy 2020-08-25T04:38:11.049043219Z I0825 04:38:11.047935 1 service.go:381] Updating existing service port "istio-system/istio-ingressgateway:tls" at 172.20.52.238:15443/TCP
kube-proxy-sq767 kube-proxy 2020-08-25T04:38:11.106171239Z I0825 04:38:11.104902 1 proxier.go:1609] Opened local port "nodePort for istio-system/istio-ingressgateway:http2" (:32465/tcp)
kube-proxy-sq767 kube-proxy 2020-08-25T04:38:11.106191324Z I0825 04:38:11.104986 1 proxier.go:1609] Opened local port "nodePort for istio-system/istio-ingressgateway:status-port" (:31205/tcp)
kube-proxy-sq767 kube-proxy 2020-08-25T04:38:11.106195214Z I0825 04:38:11.105163 1 proxier.go:1609] Opened local port "nodePort for istio-system/istio-ingressgateway:tls" (:32428/tcp)
kube-proxy-sq767 kube-proxy 2020-08-25T04:38:11.106198584Z I0825 04:38:11.105239 1 proxier.go:1609] Opened local port "nodePort for istio-system/istio-ingressgateway:https" (:32114/tcp)
kube-proxy-sq767 kube-proxy 2020-08-25T04:39:07.730072897Z I0825 04:39:07.729301 1 service.go:381] Updating existing service port "istio-system/istio-clientgateway:status-port" at 172.20.134.13:15021/TCP
kube-proxy-sq767 kube-proxy 2020-08-25T04:39:07.730099512Z I0825 04:39:07.729326 1 service.go:381] Updating existing service port "istio-system/istio-clientgateway:http2" at 172.20.134.13:80/TCP
kube-proxy-sq767 kube-proxy 2020-08-25T04:39:07.730103775Z I0825 04:39:07.729336 1 service.go:381] Updating existing service port "istio-system/istio-clientgateway:https" at 172.20.134.13:443/TCP
kube-proxy-sq767 kube-proxy 2020-08-25T04:39:07.730107387Z I0825 04:39:07.729345 1 service.go:381] Updating existing service port "istio-system/istio-clientgateway:tls" at 172.20.134.13:15443/TCP
kube-proxy-sq767 kube-proxy 2020-08-25T04:39:07.754010667Z I0825 04:39:07.753406 1 proxier.go:1609] Opened local port "nodePort for istio-system/istio-clientgateway:status-port" (:31555/tcp)
kube-proxy-sq767 kube-proxy 2020-08-25T04:39:07.754052420Z I0825 04:39:07.753445 1 proxier.go:1609] Opened local port "nodePort for istio-system/istio-clientgateway:https" (:31353/tcp)
kube-proxy-sq767 kube-proxy 2020-08-25T04:39:07.754057730Z I0825 04:39:07.753701 1 proxier.go:1609] Opened local port "nodePort for istio-system/istio-clientgateway:tls" (:30243/tcp)
kube-proxy-sq767 kube-proxy 2020-08-25T04:39:07.754071920Z I0825 04:39:07.753751 1 proxier.go:1609] Opened local port "nodePort for istio-system/istio-clientgateway:http2" (:31078/tcp)
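To confirm it really is the nodePorts moving (the ClusterIPs and service ports stay the same), I dump them before and after an operator restart:

```
kubectl -n istio-system get svc istio-ingressgateway \
  -o jsonpath='{range .spec.ports[*]}{.name}{"\t"}{.nodePort}{"\n"}{end}'
```

Running that across a restart shows every nodePort re-rolled, which matches the kube-proxy output above.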
Another thing: I have ingress-nginx handling some other things for me, and neither it nor its Service ever does anything like this, even on a controller restart.
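My understanding of the plain-Kubernetes side (a guess, not a conclusion): a nodePort is only randomly allocated, from the 30000-32767 range by default, when the port entry leaves `nodePort` unset, and that allocation sticks for the life of the Service object. A stripped-down illustration; the Service name here is hypothetical:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: example-gateway   # hypothetical, for illustration only
  namespace: istio-system
spec:
  type: LoadBalancer
  selector:
    istio: ingressgateway
  ports:
    - name: https
      port: 443
      targetPort: 8443
      # nodePort omitted: the API server picks a free port from the
      # node-port range at creation. If this port entry is dropped and
      # re-added (or the Service is recreated), a new random port is
      # allocated. Setting nodePort explicitly would pin it.
```

So if the operator is replacing the port list (or the whole Service) during its 'ensure' pass rather than patching it, that would explain getting new ports every time.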
So in the end, what would cause a k8s Service to do this, and what in Istio is making that change?