Mesh expansion in AWS

First, the mesh expansion docs note that Mesh Expansion is broken in 1.0. Second, however, an open issue indicates that mesh expansion is functional in 1.0.x. Third, at least one person has gotten this to work outside GCP, in IBM Cloud.

I’ve seen mesh expansion work on RHEL/CentOS, and I see people have made some progress running it on other distros. Has anyone attempted mesh expansion in a bare-metal deployment or on AWS yet?

There is an open issue (#10210) on GitHub related to Mesh Expansion on AWS, but it contains no helpful information.

As I see it, setupMeshEx.sh is written to be GCP-specific. For example:

function istioClusterEnv() {
  # ...
  # TODO: parse it all from $(kubectl config current-context)
  CIDR=$(gcloud container clusters describe ${K8S_CLUSTER} ${GCP_OPTS:-} --format "value(servicesIpv4Cidr)")
  # ...
}

It looks like that is just the CIDR block of the cluster’s services, which is not too bad. The other gcloud invocations are, in total:

# Copy files to the VM.
# - VM name - required, destination where files will be copied
# - list of files and directories to be copied
function istioCopy() {
  # TODO: based on some env variable, use different commands for other clusters or for testing with
  # bare-metal machines.
  local NAME=$1
  shift
  local FILES=$*

  ${ISTIO_CP:-gcloud compute scp --recurse ${GCP_OPTS:-}} $FILES ${NAME}:
}

and

# Run a command in a VM.
# - VM name
# - command to run, as one parameter.
function istioRun() {
  local NAME=$1
  local CMD=$2

  ${ISTIO_RUN:-gcloud compute ssh ${GCP_OPTS:-}} $NAME --command "$CMD"
}

So the script is trying to do three things:

  1. Get a CIDR block,
  2. SCP files to a remote machine,
  3. SSH and run a command on the machine.

Steps 2 and 3 reduce to scp and ssh invocations, and are already overridable via the ISTIO_CP and ISTIO_RUN environment variables, respectively. As far as I can tell, the CIDR is only used for /var/lib/istio/envoy/cluster.env, which tells Envoy about the Kubernetes cluster.
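
For AWS, then, something like the following might work as the overrides (a rough sketch on my part, untested; the host alias, user, and key path are all made up). The one wrinkle is that istioRun passes a gcloud-style --command flag that plain ssh does not understand, so a small wrapper strips it:

# Hypothetical ~/.ssh/config entry mapping the VM name to its AWS user/key:
#
#   Host my-mesh-vm
#     HostName ec2-203-0-113-10.compute-1.amazonaws.com
#     User ec2-user
#     IdentityFile ~/.ssh/aws-key.pem

# istioCopy runs "$ISTIO_CP $FILES $NAME:", so plain recursive scp works:
export ISTIO_CP="scp -r"

# istioRun runs "$ISTIO_RUN $NAME --command $CMD"; drop the gcloud-style
# --command flag before handing the command to ssh:
cat > ./istio-ssh <<'EOF'
#!/bin/sh
NAME=$1; shift
[ "$1" = "--command" ] && shift
exec ssh "$NAME" "$@"
EOF
chmod +x ./istio-ssh
export ISTIO_RUN=$PWD/istio-ssh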

The docs for servicesIpv4Cidr say:

[Output only] The IP address range of the Kubernetes services in this cluster, in CIDR notation (e.g. 1.2.3.4/29). Service addresses are typically put in the last /16 from the container CIDR.

These are not external IPs, so as long as we are allowed to use internal IPs, we should be able to derive the CIDR ourselves:

$ kubectl get services
NAME             TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
details          ClusterIP   100.65.17.253    <none>        9080/TCP   29d
httpbin          ClusterIP   100.64.223.142   <none>        8000/TCP   1d
kubernetes       ClusterIP   100.64.0.1       <none>        443/TCP    134d
productpage      ClusterIP   100.65.22.94     <none>        9080/TCP   29d
ratings          ClusterIP   100.68.228.166   <none>        9080/TCP   29d
reviews          ClusterIP   100.68.122.211   <none>        9080/TCP   29d
test-apiserver   ClusterIP   100.64.2.18      <none>        80/TCP     69d

So, for something like the above, can I use 100.64.0.0/10 as my CIDR? The IBM Cloud example from earlier just defaults to 10.0.0.0/16, which (per the blog post) covers their cluster IPs. That makes me think 100.64.0.0/10 should work for the cluster above.
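
Rather than guessing from a handful of ClusterIPs, one way to confirm the actual service CIDR (hedged: this only works where kubectl cluster-info dump includes the apiserver’s pod spec) is to grep for the apiserver’s --service-cluster-ip-range flag:

kubectl cluster-info dump | grep -m 1 service-cluster-ip-range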

The other bit is mesh-expansion.yaml, which installed successfully on my demo cluster:

$ kubectl apply -f mesh-expansion.yaml
service "istio-pilot-ilb" configured
service "dns-ilb" configured
service "mixer-ilb" configured
service "citadel-ilb" configured

Has anyone else tried this? I haven’t tested setupMeshEx.sh on my AWS cluster yet, but it seems like it should work as long as I can use Kubernetes-internal IPs (I assume dnsmasq handles this, based on the script).

This is as far as I’ve gotten outside of the official documentation, and it would be super useful to have this working outside of GCP.

@vngzs We would be glad to have an updated script and instructions for AWS.
If you finally get it working, could you please share the steps you had to adapt from GCP to AWS?

The CIDR tells the mesh which IP ranges to include. 100.64.0.0/10 seems wide enough to include all of your endpoints’ IPs, so they will all be part of the mesh.
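
Concretely, that CIDR ends up in /var/lib/istio/envoy/cluster.env on the VM. A rough sketch of the relevant entries (others omitted; the inbound port value is just an example):

# /var/lib/istio/envoy/cluster.env (sketch, not a full file)
ISTIO_SERVICE_CIDR=100.64.0.0/10   # traffic to these IPs goes through the sidecar
ISTIO_INBOUND_PORTS=8080           # VM ports whose inbound traffic the sidecar intercepts (example)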

The current mesh expansion doc does indeed use GCP as an example. In the upcoming 1.1 doc, we use a gateway to do SNI-based routing to expose Citadel and Pilot.

Please feel free to contribute docs and scripts for mesh expansion on AWS. I’m interested to see what that would look like.

I’m not sure about the status of setupMeshEx.sh, but definitely agree that we should provide better tooling.

You can try the command below to get the Istio service CIDR on AWS instances:

ISTIO_SERVICE_CIDR=$(kubectl -n kube-system get cm kubeadm-config -o yaml | grep serviceSubnet | awk '{print $2}')
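
Note that the kubeadm-config ConfigMap only exists on kubeadm-provisioned clusters; on kops or EKS it will be absent, and you would fall back to one of the approaches above, such as reading the apiserver’s --service-cluster-ip-range flag or inspecting the ClusterIPs directly.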