Dear all,
I’m trying to set up TCP communication from the Istio proxy sidecar to AWS MSK via Istio’s egress gateway. We successfully set up this basic flow for HTTP/HTTPS traffic to www.istio.io as a test, but numerous attempts to do the same for connections to the ZooKeeper endpoints of AWS MSK are failing, and I am hoping to get some assistance from the community.
- Istio version: 1.4.3
- Kubernetes version: 1.15.0
- We have a private AWS EKS cluster (e.g. 10.106.20.0/24)
- We have a private AWS MSK cluster (e.g. 10.106.11.0/24)
- We have a network policy that only allows DNS lookups and communication with Istio’s components (and, temporarily, access to 10.106.0.0/16)
- We have set global.outboundTrafficPolicy to REGISTRY_ONLY
- We configure a ServiceEntry, VirtualService and Gateway resource (examples below). We don’t use DestinationRules or subsets, as I assume these are not required when you are not routing to multiple versions or splitting traffic by weight
- Some of our attempts involved adding port 2181 (ZooKeeper) as a listener port to the istio-egressgateway component (pod/service), because by default it only listens on 80, 443 and 15443. This would allow us to keep the same port throughout every component
- We have 3 AWS MSK ZooKeeper endpoints, but I will only list 1 as an example
Test case 1
We’ve tried to use port 2181 throughout the full chain (app --> sidecar --> egressgateway) by adding port 2181 to the egress gateway. We see traffic flowing from the sidecar to the egress gateway, where it fails with "No healthy upstream" (response flag UH), as if the final destination route in the virtual service doesn’t work:
istio-egressgateway-6cb6c46b9-64p96 istio-proxy [2020-04-14T17:55:01.625Z] "- - -" 0 UH "-" "-" 0 0 0 - "-" "-" "-" "-" "-" - - 10.106.20.209:2181 10.106.21.56:44626 - -
We got to this point using the following resources, and I feel this is the closest I have come to a working solution:
---
apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: msk-se-z1
  namespace: msk
spec:
  hosts:
  - z-1.msk-dev-cluster.9xhpez.c2.kafka.eu-west-1.amazonaws.com
  addresses:
  - 10.106.11.0/24
  ports:
  - name: tcp
    number: 2181
    protocol: TCP
  exportTo:
  - "*"
  location: MESH_EXTERNAL
  resolution: NONE
---
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: msk-egressgateway
  namespace: msk
spec:
  selector:
    app: istio-egressgateway
  servers:
  - port:
      number: 2181
      name: tcp-2181
      protocol: TCP
    hosts:
    - z-1.msk-dev-cluster.9xhpez.c2.kafka.eu-west-1.amazonaws.com
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: msk-vs-z1
  namespace: msk
spec:
  exportTo:
  - "*"
  hosts:
  - z-1.msk-dev-cluster.9xhpez.c2.kafka.eu-west-1.amazonaws.com
  gateways:
  - mesh
  - msk-egressgateway
  tcp:
  - match:
    - gateways:
      - mesh
      destinationSubnets:
      - 10.106.11.0/24
      port: 2181
    route:
    - destination:
        host: istio-egressgateway.istio-system.svc.cluster.local
        port:
          number: 2181
  - match:
    - gateways:
      - msk-egressgateway
      port: 2181
    route:
    - destination:
        host: z-1.msk-dev-cluster.9xhpez.c2.kafka.eu-west-1.amazonaws.com
        port:
          number: 2181
      weight: 100
---
Test case 2
We’ve tried the same as above, but instead of using the newly added port 2181 on the egress gateway, we tried to use a default port, namely port 80. This time our request is never processed by the egress gateway; instead we see the following error in the application: Packet len1213486160 is out of range!
The resources are almost identical to the ones above, only this time we used port 80 on the egress gateway:
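A side observation on that number, offered as an interpretation rather than a confirmed diagnosis: ZooKeeper frames begin with a 4-byte big-endian length, and 1213486160 read as four bytes spells the ASCII text `HTTP`. That would be consistent with the client receiving an HTTP response on this path instead of ZooKeeper framing. A minimal Python sketch of the decoding (the helper name `length_prefix_as_ascii` is hypothetical, used only for illustration):

```python
import struct


def length_prefix_as_ascii(length: int) -> str:
    """Render a 32-bit big-endian length prefix as the four ASCII bytes it came from."""
    # Pack the reported length back into the 4 bytes that were read off the wire.
    raw = struct.pack(">i", length)
    return raw.decode("ascii", errors="replace")


if __name__ == "__main__":
    # Length reported in the "Packet len... is out of range!" error.
    print(length_prefix_as_ascii(1213486160))  # prints "HTTP"
```

If that reading is right, something between the application and the broker is answering in HTTP on port 80, which may be worth checking before anything else.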
---
apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: msk-se-z1
  namespace: msk
spec:
  hosts:
  - z-1.msk-dev-cluster.9xhpez.c2.kafka.eu-west-1.amazonaws.com
  addresses:
  - 10.106.11.0/24
  ports:
  - name: tcp
    number: 2181
    protocol: TCP
  exportTo:
  - "*"
  location: MESH_EXTERNAL
  resolution: NONE
---
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: msk-egressgateway
  namespace: msk
spec:
  selector:
    app: istio-egressgateway
  servers:
  - port:
      number: 80
      name: tcp-80
      protocol: TCP
    hosts:
    - z-1.msk-dev-cluster.9xhpez.c2.kafka.eu-west-1.amazonaws.com
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: msk-vs-z1
  namespace: msk
spec:
  exportTo:
  - "*"
  hosts:
  - z-1.msk-dev-cluster.9xhpez.c2.kafka.eu-west-1.amazonaws.com
  gateways:
  - mesh
  - msk-egressgateway
  tcp:
  - match:
    - gateways:
      - mesh
      destinationSubnets:
      - 10.106.11.0/24
      port: 2181
    route:
    - destination:
        host: istio-egressgateway.istio-system.svc.cluster.local
        port:
          number: 80
  - match:
    - gateways:
      - msk-egressgateway
      port: 80
    route:
    - destination:
        host: z-1.msk-dev-cluster.9xhpez.c2.kafka.eu-west-1.amazonaws.com
        port:
          number: 2181
      weight: 100
---
We’ve tried so many things, and it feels like there is something fundamental I’m not seeing or am misunderstanding. For example: is it allowed to create all of our resources (ServiceEntry, VirtualService and Gateway) in the namespace of the application, which is different from the namespace of the Istio egress gateway?
PS: I can also add that if we delete our network policy and delete the virtual services and gateways created above, we are able to access our Kafka cluster, only this time the sidecar sends the request directly to the destination, bypassing the egress gateway, which is the expected behaviour.
Thanks in advance for pointing me in the right direction, or for sharing your own config if you are connecting to external ZooKeeper services over TCP.