Bizarre istio-ingressgateway/istiod scheduling with HPA.min > 1 and topologySpreadConstraints on AWS

We are running on AWS and, as part of migrating up from dev, bumping HPA min replica counts from 1 to 2 or 3 (depending on the cluster). We patch in topologySpreadConstraints (TSC) as follows:

spec:
  template:
    spec:
      terminationGracePeriodSeconds: 10
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: ScheduleAnyway
        labelSelector:
          matchLabels:
            app: istio-ingressgateway
      - maxSkew: 1
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: ScheduleAnyway
        labelSelector:
          matchLabels:
            app: istio-ingressgateway

The patch for istiod (ID) is similar to the one for istio-ingressgateway (IIG). The intent is to ensure, to the maximum degree possible, that IIG and ID pods are spread across AZs and do not colocate on the same host, for maximum reliability.
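For reference, the analogous istiod patch would look roughly like this. This is a sketch: the `app: istiod` label is my assumption, so check the labels your istiod Deployment actually carries before applying it.

```yaml
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 10
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: ScheduleAnyway
        labelSelector:
          matchLabels:
            app: istiod   # assumed label; verify on your istiod Deployment
      - maxSkew: 1
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: ScheduleAnyway
        labelSelector:
          matchLabels:
            app: istiod
```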

For all OTHER services this TSC works normally, producing the expected results. But for Istio's IIG and ID, adding those constraints has the OPPOSITE effect: without TSC the pods are spread evenly by node/zone, but with TSC some of the pods end up colocated in the same zone, or even on the same host, even though resources are not constrained.

K8S 1.18.10 and 1.18.15, Istio 1.7.3.

I’m trying to understand whether Istio ships a custom scheduler that somehow interferes with the normal kube-scheduler TSC handling, or something else along those lines. If this is not expected behavior, I’ll file a bug.
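One way to see the actual spread while debugging (a diagnostic sketch; it assumes the pods live in the `istio-system` namespace) is to map each pod to its node, then each node to its zone:

```shell
# List each ingressgateway pod with the node it landed on
kubectl get pods -n istio-system -l app=istio-ingressgateway \
  -o custom-columns=POD:.metadata.name,NODE:.spec.nodeName

# Show the zone label for every node, to map nodes back to AZs
kubectl get nodes -L topology.kubernetes.io/zone
```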

Thanks!

There should be nothing in istiod that does any sort of custom scheduling or anything. It’s just a standard Kubernetes pod, nothing special.


OK, thank you. I’ll keep digging then, as it makes no sense whatsoever.