When there are lots of external VMs that are accessible only via a firewall, and there are multiple namespaces in the cluster, each with its own set of external VMs, you end up with a lot of ServiceEntries. These in turn cause a lot of DNS queries for:
- $host.$ns.svc.cluster.local
- $host.svc.cluster.local
- $host.cluster.local

$host is, for example, vm-1.vm-cluster.example.com. With, let's say, 10 namespaces and 100 pods in each namespace, that's quite a number of DNS queries. All of them are answered with NXDOMAIN, so the local Istio DNS proxy cache does not help. This can overload the CoreDNS Pods: their memory usage, including buffers, goes up and causes the DNS Pod to be OOM-killed.
My question is: how can I prevent those queries? The solutions I came up with are:
- Replacing $host with an FQDN that includes the trailing dot. This does not work because the X.509 certs contain the hostname without the trailing dot.
- Changing the DNS config of each Pod to set ndots to something like 3. But this sounds like a maintenance nightmare.
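
For reference, the ndots option would look roughly like the sketch below for each workload. The Pod name and image are placeholders, not taken from my actual setup. With ndots set to 3, a name with three or more dots, such as vm-1.vm-cluster.example.com, should be tried as an absolute name before the search list is applied.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-app                  # hypothetical name, for illustration only
spec:
  dnsConfig:
    options:
      - name: ndots
        value: "3"                   # Kubernetes defaults to 5 for ClusterFirst pods
  containers:
    - name: app
      image: example.com/app:latest  # placeholder image
```

This would have to be repeated in every Pod template in every namespace, which is where the maintenance burden comes from.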
I would like to solve this via a clever combination of ServiceEntry, VirtualService and Gateway definitions.