Demystify Mesh Federation Multi-Cluster communication without automatic endpoint discovery

Hi,

Here at Norwegian Refugee Council, we have three AKS clusters running Istio 1.9.1. All of the clusters share a common root CA, so cross-cluster communication with mTLS is technically possible. The clusters are:

  1. Operations
  2. Dev/Staging
  3. Production

We basically have a one-cluster-per-mesh deployment model, so in our case 3 clusters = 3 meshes. We want to enable cross-cluster, cross-mesh communication with fine-grained control over which services are exposed to other meshes. From the documentation, it’s unclear how to do that.

Use case:

We have HashiCorp Vault running in the Operations cluster. Vault needs to access, for example, postgres-app running in Production to rotate its secrets. We would like this connection to be as secure as possible, so enabling cross-cluster Istio mTLS would be great. (operations.HashicorpVault → production.postgres)

However, we don’t want automatic endpoint discovery, as only a handful of services need cross-cluster, cross-mesh connectivity. We also want the most secure setup possible, following the principle of least privilege: only expose what needs to be exposed. Finally, we don’t want to share all the kubeconfigs with every cluster.

I’ve pulled together all the information I could find, but I’m still a bit puzzled about how to set this up correctly.

Do we still need an eastwestgateway?

From what I’ve gathered, I think we need an eastwestgateway on each cluster, generated as shown in the samples/multicluster/gen-eastwest-gateway.sh file. Each cluster would have a different meshID, network, and clusterName.

(install on cluster1)
samples/multicluster/gen-eastwest-gateway.sh --mesh mesh1 --cluster cluster1 --network network1

(install on cluster2)
samples/multicluster/gen-eastwest-gateway.sh --mesh mesh2 --cluster cluster2 --network network2

(install on cluster3)
samples/multicluster/gen-eastwest-gateway.sh --mesh mesh3 --cluster cluster3 --network network3

Must the meshID, clusterName, and network be specified if automatic discovery is disabled?

Without automatic endpoint discovery, it’s unclear whether the meshID, clusterName, and network must be specified in the IstioOperator.
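
For reference, these are the IstioOperator fields I mean. The values mesh1, cluster1, and network1 are just the placeholders from the commands above. Whether any of them is still required when endpoint discovery is off is exactly what I’m unsure about.

```yaml
# Sketch only: the three identity fields in question, as set for cluster1.
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  values:
    global:
      meshID: mesh1            # placeholder mesh name
      multiCluster:
        clusterName: cluster1  # placeholder cluster name
      network: network1        # placeholder network name
```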

If we don’t enable automatic endpoint discovery, how can we expose services to other meshes running in other clusters?

Would they be exposed through manually created ServiceEntries? If so, how would the ServiceEntries be declared? I found a 2020 istio.io blog post here (outdated?), and a GitHub repo here (also outdated?).

  1. Do we need to declare a ServiceEntry to expose a service in another cluster?
  2. Would declaring ServiceEntry.spec.location: MESH_INTERNAL enable Istio mTLS by default?
  3. Do we need to put the “eastwestgateway” IP address/port in the ServiceEntry.spec.endpoints?
  4. Should the ServiceEntry.spec.resolution be DNS?
  5. Must there be a Gateway/VirtualService/DestinationRule combination to route traffic to the eastwestgateway, and from the eastwestgateway to the other cluster’s eastwestgateway? (Similar to the egressgateway documentation.)
  6. In the target cluster, if the target Gateway has AUTO_PASSTHROUGH enabled, how can routing happen to the right service if endpoint discovery is disabled? Must AUTO_PASSTHROUGH be changed to PASSTHROUGH, and then a Gateway + VirtualService + DestinationRule be created in the target cluster to provide routing?
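
To make the questions above concrete, here is a rough sketch of the ServiceEntry I imagine creating in the Operations cluster. This is a guess, not a tested configuration: the resource name, the vault namespace, the host name, the Postgres port, and the gateway address eastwest.production.example.com are all hypothetical, and I’m assuming the remote eastwestgateway listens on port 15443.

```yaml
# Hypothetical ServiceEntry in the Operations cluster (untested sketch).
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: postgres-production        # hypothetical name
  namespace: vault                 # hypothetical namespace where Vault runs
spec:
  hosts:
  - postgres-app.production.svc.cluster.local  # hypothetical; is this the right host form?
  location: MESH_INTERNAL          # question 2: does this imply mTLS?
  ports:
  - number: 5432
    name: tcp-postgres
    protocol: TCP
  resolution: DNS                  # question 4
  endpoints:
  - address: eastwest.production.example.com   # question 3: remote eastwestgateway (hypothetical address)
    ports:
      tcp-postgres: 15443          # assumed eastwestgateway TLS port
```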

How should the DestinationRules & Gateways be declared on the target cluster to enable mTLS?

It’s unclear how the DestinationRules should be configured, as well as the target Gateways, to enable istio mTLS between meshes / clusters.
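
As a starting point for discussion, this is what I would try. The Gateway is modeled on samples/multicluster/expose-services.yaml, but with the hosts narrowed from "*.local" to a single service. I’m assuming, without having verified it, that narrowing the hosts limits what is exposed. The DestinationRule would live in the source (Operations) cluster. The resource names, namespaces, and host are hypothetical.

```yaml
# Target (Production) cluster: expose only one service through the eastwestgateway.
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: cross-network-gateway
  namespace: istio-system
spec:
  selector:
    istio: eastwestgateway
  servers:
  - port:
      number: 15443
      name: tls
      protocol: TLS
    tls:
      mode: AUTO_PASSTHROUGH       # or PASSTHROUGH? See question 6 above
    hosts:
    - "postgres-app.production.svc.cluster.local"  # hypothetical; unverified whether this restricts exposure
---
# Source (Operations) cluster: ask the sidecar to originate Istio mTLS.
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: postgres-production-mtls   # hypothetical name
  namespace: vault                 # hypothetical namespace
spec:
  host: postgres-app.production.svc.cluster.local  # must match the ServiceEntry host
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL
```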


These are some reference links I’m including as part of my research on this.

Multiple Meshes

You can enable inter-mesh communication with mesh federation. When federating, each mesh can expose a set of services and identities, which all participating meshes can recognize.

Mesh Federation

Mesh federation is the act of exposing services between meshes and enabling communication across mesh boundaries. Each mesh may expose a subset of its services to enable one or more other meshes to consume the exposed services. You can use mesh federation to enable communication between meshes in a multi-mesh deployment.

Multiple Control Planes

In some advanced scenarios, load balancing across clusters may not be desired. For example, in a blue/green deployment, you may deploy different versions of the system to different clusters. In this case, each cluster is effectively operating as an independent mesh. This behavior can be achieved in a couple of ways:

  • Do not exchange remote secrets between the clusters. This offers the strongest isolation between the clusters.
  • Use VirtualService and DestinationRule to disallow routing between two versions of the services.