Bizarre istio-ingressgateway/istiod scheduling with HPA.min > 1 and topologySpreadConstraints on AWS

We are running on AWS and, as part of migrating up from dev, bumping HPA min replica counts from 1 to 2 or 3 (depending on the cluster). We patch in topologySpreadConstraints (TSC) as follows:

spec:
  template:
    spec:
      terminationGracePeriodSeconds: 10
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: ScheduleAnyway
        labelSelector:
          matchLabels:
            app: istio-ingressgateway
      - maxSkew: 1
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: ScheduleAnyway
        labelSelector:
          matchLabels:
            app: istio-ingressgateway

The patch for istiod (ID) is similar to the one for istio-ingressgateway (IIG). The intent is to ensure, to the maximum degree possible, that IIG and ID pods are spread across AZs and do not colocate on the same host, for maximum reliability.
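For reference, the analogous istiod patch would look roughly like this. This is a sketch: the `app: istiod` label is my assumption, so check the labels your istiod Deployment actually carries before applying it.

```yaml
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 10
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: ScheduleAnyway
        labelSelector:
          matchLabels:
            app: istiod   # assumed label; verify on your istiod Deployment
      - maxSkew: 1
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: ScheduleAnyway
        labelSelector:
          matchLabels:
            app: istiod
```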

For all OTHER services this TSC works normally, producing the expected results. But for Istio's IIG and ID, adding those constraints has the OPPOSITE effect: without TSC the pods are spread evenly by node/zone, but with TSC some of the pods end up colocated in the same zone, or even on the same host, even though resources are not constrained.

K8S 1.18.10 and 1.18.15, Istio 1.7.3.

I’m trying to understand whether Istio ships a custom scheduler that somehow interferes with the normal kube-scheduler TSC handling, or something else along those lines. If this is not expected behavior, I’ll file a bug.
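One way to see the actual spread while debugging (a diagnostic sketch; it assumes the pods live in the `istio-system` namespace) is to map each pod to its node, then each node to its zone:

```shell
# List each ingressgateway pod with the node it landed on
kubectl get pods -n istio-system -l app=istio-ingressgateway \
  -o custom-columns=POD:.metadata.name,NODE:.spec.nodeName

# Show the zone label for every node, to map nodes back to AZs
kubectl get nodes -L topology.kubernetes.io/zone
```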

Thanks!

There should be nothing in istiod that does any sort of custom scheduling or anything. It’s just a standard Kubernetes pod, nothing special.


OK, thank you. I’ll keep digging then, as it makes no sense whatsoever.