We are using topologySpreadConstraints (TSC) on AWS. As we migrate from dev, we are bumping HPA min counts from 1 to 2 or 3 (depending on the cluster) and patching in TSC as follows:
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 10
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels:
              app: istio-ingressgateway
        - maxSkew: 1
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels:
              app: istio-ingressgateway
The patch for IstioD (ID) is similar to the one for Istio-IngressGateway (IIG). The goal is to spread IIG and ID across AZs as much as possible and to keep them from colocating on the same host, for maximum reliability.
For all OTHER services this TSC works normally and produces the expected results. For Istio’s IIG and ID, however, adding those constraints has the OPPOSITE effect: without TSC the pods are spread evenly across nodes and zones, but with TSC some of the pods end up colocated in the same zone or even on the same host, even though resources are not constrained.
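To make "colocated" concrete, here is a minimal sketch of how I am measuring skew for a topology key: the highest matching-pod count in any domain minus the lowest. The zone names and placements are hypothetical examples, not output from our clusters, and the `skew` helper is my own illustration rather than anything from Kubernetes or Istio.

```python
from collections import Counter


def skew(placements, domains):
    """Return the max minus min matching-pod count across topology domains.

    placements: one domain name (zone or hostname) per matching pod.
    domains: every eligible domain, including those currently holding no pods.
    """
    counts = Counter(placements)
    per_domain = [counts.get(d, 0) for d in domains]
    return max(per_domain) - min(per_domain)


# Hypothetical zones and pod placements, for illustration only.
zones = ["us-east-1a", "us-east-1b", "us-east-1c"]
spread = ["us-east-1a", "us-east-1b", "us-east-1c"]      # one pod per zone
colocated = ["us-east-1a", "us-east-1a", "us-east-1b"]   # two pods share a zone

print(skew(spread, zones))     # 0, within maxSkew: 1
print(skew(colocated, zones))  # 2, exceeds maxSkew: 1
```

Since whenUnsatisfiable is ScheduleAnyway, a skew above maxSkew does not block scheduling; the constraint only influences how nodes are scored, so a placement like the second one is still permitted.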
Versions: Kubernetes 1.18.10 and 1.18.15, Istio 1.7.3.
I’m trying to understand whether Istio has a custom scheduler that somehow interferes with normal kube TSC scheduling, or whether something else along those lines is going on. If this is not expected behavior, I’ll file a bug.
Thanks!