Hi,
We’re happy Istio users, but we occasionally hit a recurring issue where we lose connectivity between applications running in the mesh after restarting one of the services (postgres-pooler in this case).
When this happens, the sidecar appears to be routing the main container's traffic to pods that no longer exist.
Here’s an example with two pods:
- `xgckw` is working: the endpoints from `istioctl proxy-config` match those from `kubectl get endpointslice` (slice lookup shown below).
- `dhjdj` is not working: it is trying to connect to non-existent pods, via config provided by a different `istiod`; this config doesn't match the endpoints of the relevant Kubernetes `Service`.
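
For reference, the EndpointSlice name below is generated, so we look the slice up via its service label (this assumes the standard `kubernetes.io/service-name` label set by the EndpointSlice controller):

```
$ kubectl get endpointslice -n prod -l kubernetes.io/service-name=postgres-pooler
```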
```
$ kubectl get endpointslice postgres-pooler-x9pvh -n prod
NAME                    ADDRESSTYPE   PORTS   ENDPOINTS                  AGE
postgres-pooler-x9pvh   IPv4          5432    10.240.6.85,10.240.3.250   667d
```
```
$ istioctl proxy-status
NAME                               CLUSTER      CDS      LDS      EDS      RDS      ISTIOD                    VERSION
deployment-7f78785784-xgckw.prod   Kubernetes   SYNCED   SYNCED   SYNCED   SYNCED   istiod-6ffd54b448-v9nck   1.13.8
deployment-788b6d8c6d-dhjdj.prod   Kubernetes   SYNCED   SYNCED   SYNCED   SYNCED   istiod-6ffd54b448-9fwv4   1.13.8
```
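
Since the two proxies are synced against different istiod replicas, we suspect one replica holds a stale view. Something like the following should let us compare each replica's internal endpoint state (this assumes the standard istiod debug server on port 15014; the debug paths may vary between versions):

```
# Port-forward the first istiod replica and dump its view of the endpoints.
$ kubectl -n istio-system port-forward pod/istiod-6ffd54b448-v9nck 15014 &
$ curl -s localhost:15014/debug/endpointz | grep postgres-pooler
# Repeat against istiod-6ffd54b448-9fwv4 and diff the two outputs.
```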
```
$ istioctl proxy-config endpoint -n prod deployment-7f78785784-xgckw
ENDPOINT            STATUS    OUTLIER CHECK   CLUSTER
10.0.171.251:5432   HEALTHY   OK              outbound|5432||postgres-pooler.prod.svc.cluster.local
10.240.3.250:5432   HEALTHY   OK              outbound|5432||postgres-pooler.prod.svc.cluster.local
10.240.6.85:5432    HEALTHY   OK              outbound|5432||postgres-pooler.prod.svc.cluster.local
```
```
$ istioctl proxy-config endpoint -n prod deployment-788b6d8c6d-dhjdj
ENDPOINT            STATUS    OUTLIER CHECK   CLUSTER
10.0.171.251:5432   HEALTHY   OK              outbound|5432||postgres-pooler.prod.svc.cluster.local
10.240.4.229:5432   HEALTHY   OK              outbound|5432||postgres-pooler.prod.svc.cluster.local
10.240.6.152:5432   HEALTHY   OK              outbound|5432||postgres-pooler.prod.svc.cluster.local
```
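
A compact way to see the drift between the two sidecars is to diff their views of the cluster (cluster name taken from the output above; the `--cluster` filter narrows the output to this one service):

```
$ diff \
    <(istioctl proxy-config endpoint -n prod deployment-7f78785784-xgckw \
        --cluster "outbound|5432||postgres-pooler.prod.svc.cluster.local") \
    <(istioctl proxy-config endpoint -n prod deployment-788b6d8c6d-dhjdj \
        --cluster "outbound|5432||postgres-pooler.prod.svc.cluster.local")
```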
`10.240.4.229` and `10.240.6.152` correspond to pods that no longer exist.
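
For example, a quick check along these lines turns up no running pod with either IP:

```
$ kubectl get pods -A -o wide | grep -E '10\.240\.4\.229|10\.240\.6\.152'
```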
Does anyone have any ideas about how to debug this further or correct our setup? Any help would be greatly appreciated.