OK, I’m desperate. I can’t get it to work. I have a Kafka cluster running in a namespace
without sidecar injection, in the same Kubernetes cluster. It is managed by an operator, so I have
little control over what the pod spec and service look like. Here is how Istio (1.6.5) is installed:
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  name: cps-istio
  namespace: istio-system
spec:
  components:
    cni:
      enabled: true
      namespace: kube-system
  values:
    cni:
      excludeNamespaces:
      - istio-system
      - kube-system
      logLevel: info
    meshConfig:
      outboundTrafficPolicy:
        # mode: REGISTRY_ONLY
        mode: ALLOW_ANY
  profile: demo
This is what the spec of one of the Kafka cluster's pods looks like:
apiVersion: v1
kind: Pod
metadata:
  annotations:
    cni.projectcalico.org/podIP: 100.121.134.33/32
    prometheus.io/port: "9020"
    prometheus.io/scrape: "true"
  name: kafka-0-pjt47
spec:
  ...
    name: kafka
    ports:
    - containerPort: 9094
      name: tcp-external
      protocol: TCP
    - containerPort: 29092
      name: tcp-ssl
      protocol: TCP
    - containerPort: 9020
      name: metrics
      protocol: TCP
status:
  phase: Running
  podIP: 100.121.134.33
  qosClass: Burstable
And the troublesome headless service:
apiVersion: v1
kind: Service
metadata:
  labels:
    app: kafka
  name: kafka-headless
  namespace: cloud-platform-workload
spec:
  clusterIP: None
  ports:
  - name: tcp-ssl
    port: 29092
    protocol: TCP
    targetPort: 29092
  - name: metrics
    port: 9020
    protocol: TCP
    targetPort: 9020
  selector:
    app: kafka
    kafka_cr: kafka
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}
Running istioctl analyze gives the expected message, but I can’t change this because the service
is not managed by me:
istioctl analyze
Info [IST0118] (Service kafka-headless.cloud-platform-workload) Port name metrics (port: 9020, targetPort: 9020) doesn't follow the naming convention of Istio port.
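For reference, the check behind that IST0118 message can be sketched roughly as below. This is only a minimal Python sketch, assuming the `<protocol>[-<suffix>]` port naming convention and the protocol prefixes documented for Istio 1.6; the function name `follows_istio_port_naming` is made up for illustration and is not part of istioctl.

```python
# Protocol prefixes Istio recognises in Service port names
# (assumption: the list documented for Istio 1.6).
ISTIO_PROTOCOLS = {
    "grpc", "grpc-web", "http", "http2", "https",
    "mongo", "mysql", "redis", "tcp", "tls", "udp",
}


def follows_istio_port_naming(port_name: str) -> bool:
    """Return True if port_name is <protocol> or <protocol>-<suffix>."""
    return any(
        port_name == proto or port_name.startswith(proto + "-")
        for proto in ISTIO_PROTOCOLS
    )


# The port names from the pod spec and headless service above.
for name in ("tcp-ssl", "tcp-external", "metrics"):
    print(name, follows_istio_port_naming(name))
```

Only the metrics port fails the check, which matches the single message from istioctl analyze; the tcp-ssl port that Kafka clients use is named according to the convention.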
Without the proxy, my service (in another namespace) happily connects to Kafka over TLS,
but as soon as I inject the proxy I see this:
istioctl proxy-config cluster axsh
kafka-headless.cloud-platform-workload.svc.cluster.local 9020 - outbound ORIGINAL_DST
kafka-headless.cloud-platform-workload.svc.cluster.local 29092 - outbound ORIGINAL_DST
Shelling into the container and resolving the Kafka headless service returns the pod IPs:
nslookup kafka-headless.cloud-platform-workload.svc.cluster.local
Server: 100.64.0.10
Address: 100.64.0.10#53
Name: kafka-headless.cloud-platform-workload.svc.cluster.local
Address: 100.121.134.33
Name: kafka-headless.cloud-platform-workload.svc.cluster.local
Address: 100.108.9.188
Name: kafka-headless.cloud-platform-workload.svc.cluster.local
Address: 100.119.65.217
Looking at the listeners, those IPs are listed, and in the cluster info above
the service is listed as type ORIGINAL_DST:
istioctl proxy-config listener axsh
100.121.134.33 29092 TCP
100.108.9.188 29092 TCP
100.119.65.217 29092 TCP
In the logs of the service I see TLS trouble:
[kafka-admin-client-thread | adminclient-1] WARN org.apache.kafka.clients.NetworkClient - [AdminClient clientId=adminclient-1] Connection to node -1 (kafka-headless.cloud-platform-workload.svc/100.121.134.33:29092) terminated during authentication. This may happen due to any of the following reasons: (1) Authentication failed due to invalid credentials with brokers older than 1.0.0, (2) Firewall blocking Kafka TLS traffic (eg it may only allow HTTPS traffic), (3) Transient network issue.
And the proxy gives this:
[2020-07-11T13:48:29.045Z] "- - -" 0 UF,URX "-" "-" 0 0 5 - "-" "-" "-" "-" "100.121.134.33:29092" outbound|29092||kafka-headless.cloud-platform-workload.svc.cluster.local - 100.121.134.33:29092 100.108.9.147:55432 - -
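To make the proxy line easier to read, here is a small Python sketch that pulls out the response flags and prints what they mean. It assumes the default Istio access log layout, where the response flags are the fourth whitespace-separated field once quoted fields are kept together; the flag descriptions are paraphrased from Envoy's access log documentation, and `explain_response_flags` is a hypothetical helper name.

```python
import shlex

# Only the two flags seen in the log line above are listed here.
RESPONSE_FLAGS = {
    "UF": "upstream connection failure",
    "URX": "upstream retry limit (HTTP) or maximum connect attempts (TCP) reached",
}


def explain_response_flags(log_line: str) -> dict:
    """Map each response flag in an access log line to a short description."""
    # shlex keeps quoted fields such as "- - -" together as a single token.
    tokens = shlex.split(log_line)
    # Assumed field order: [start_time] "request" response_code response_flags ...
    flags = tokens[3].split(",")
    return {flag: RESPONSE_FLAGS.get(flag, "unknown flag") for flag in flags}


line = (
    '[2020-07-11T13:48:29.045Z] "- - -" 0 UF,URX "-" "-" 0 0 5 - "-" "-" "-" "-" '
    '"100.121.134.33:29092" '
    'outbound|29092||kafka-headless.cloud-platform-workload.svc.cluster.local '
    '- 100.121.134.33:29092 100.108.9.147:55432 - -'
)
for flag, meaning in explain_response_flags(line).items():
    print(f"{flag}: {meaning}")
```

Both flags describe the connection from the sidecar to the broker at 100.121.134.33:29092, and the two zeros before the duration show that no bytes were received or sent on it.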
I have tried a lot of things: adding ServiceEntries, VirtualServices, DestinationRules, etc. I have also read a lot of articles, but nothing works. I think
my main problem is that the headless service resolves directly to pod IPs (and that pods don’t get service entries).
With that in mind (no per-pod DNS entries), what can I try next? As a first step, I just want to make it work; after that, I want to move the TLS wrapping to Envoy.