Does Envoy route service calls directly to the service pod, or does it go through the Kubernetes service ClusterIP?


This is a simple question: when handling an in-mesh service call, does Envoy route the call directly to one of the pod endpoints, or does it route to the Kubernetes service ClusterIP and let it (that is, the iptables redirect rules on the host set up by kube-proxy) forward the call to one of the destination pod IPs?

I think Envoy chooses the destination pod directly based on the routing rules we set up with VirtualServices and DestinationRules, without going through the kube service ClusterIP; otherwise the service ClusterIP would round-robin the traffic and you would lose the fine-grained traffic control promised by Istio.
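For example, the kind of rule that only works with endpoint-level routing is a weighted split between subsets. A minimal sketch (assuming the BookInfo reviews service with pods labeled version: v1 and version: v2; the exact weights are made up):

```yaml
# kube-proxy round-robins over ALL endpoints behind the ClusterIP and
# cannot honor these weights. Envoy can, because it picks the destination
# pod itself from the per-subset endpoint list it got via EDS.
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: reviews
spec:
  host: reviews
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews
  http:
  - route:
    - destination:
        host: reviews
        subset: v1
      weight: 90
    - destination:
        host: reviews
        subset: v2
      weight: 10
```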

But I just want to make sure my understanding is correct. The reason I have a little doubt is that when I dumped the Pilot EDS and config JSON files for a single service (say the "details" service that comes with the BookInfo sample app), I saw the actual "details" pod IPs in the endpoints section, but I also saw the service ClusterIP in the dynamic_route_configs section.

What is the point of including the ClusterIP in the xDS data if Envoy doesn't use it for routing? Is it there just in case some client uses the ClusterIP (rather than the service name, say details:9080, in the URL), so that Istio still knows how to route the request to a pod endpoint without actually going through the ClusterIP?
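For illustration, this is roughly how the ClusterIP tends to show up in an RDS dump: as one of the match domains of a virtual host, alongside the DNS name variants of the service, all pointing at the same outbound cluster (the structure below follows the usual Istio config-dump shape, but the IP and values are made up):

```json
{
  "name": "details.default.svc.cluster.local:9080",
  "domains": [
    "details.default.svc.cluster.local",
    "details.default.svc.cluster.local:9080",
    "details",
    "details:9080",
    "details.default",
    "10.96.45.12",
    "10.96.45.12:9080"
  ],
  "routes": [
    {
      "match": { "prefix": "/" },
      "route": { "cluster": "outbound|9080||details.default.svc.cluster.local" }
    }
  ]
}
```

Under this reading, the ClusterIP is only a match key for incoming requests, not a forwarding target; whatever domain matches, the request lands on the same Envoy cluster, whose members are the pod IPs from EDS.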

Thank you!



I did some digging and found the answer: according to this Istio document (step 5), from the matching service (and version, if applicable), Envoy looks up the endpoints it obtained from Pilot via ADS, using the matching service name as the key, and sends the request directly to one of those endpoints, without going through the Kubernetes service ClusterIP.

To be sure, I got into the productpage Envoy sidecar container and ran "lsof -i" to list all connections; the service ClusterIP was not there, only the direct pod IPs.
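For anyone who wants to repeat the check, a sketch of the commands I would use (pod names are placeholders for your own cluster; I believe the istioctl proxy-config filters shown here exist, but verify against your istioctl version):

```shell
# 1. Dump the endpoints Envoy received over xDS for the details service;
#    these should be pod IPs, not the ClusterIP.
istioctl proxy-config endpoint productpage-v1-<pod-suffix>.default \
  --cluster "outbound|9080||details.default.svc.cluster.local"

# 2. Compare against the Kubernetes view: the ClusterIP vs. the endpoint IPs.
kubectl get svc details
kubectl get endpoints details

# 3. From inside the sidecar, list live connections; the remote addresses
#    should match the pod IPs from step 2, not the ClusterIP.
kubectl exec -it productpage-v1-<pod-suffix> -c istio-proxy -- lsof -i
```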

This confirms my belief, and it does make sense. If Envoy indeed routed to the service ClusterIP, it would lose its fine-grained L7 traffic control.

I guess I was fooled by this Kiali picture of the call flow of the Istio sample app BookInfo:

The triangle represents the service and the square represents the pod. It gives the impression that any web service call goes through the service first, and the service then routes the request to the backend pod. That is certainly true for the normal Kubernetes flow (without an Envoy sidecar), but it is a little misleading for the Istio service mesh flow.



It gives the impression that any web service call goes through the service first, and the service then routes the request to the backend pod.

The flow is a bit different. The request for a service is first processed at the source proxy and routed to a destination workload based on any applicable route rules. The triangle represents the resulting destination service. The square in this case doesn’t represent a pod but rather a versioned app based on app and version labels applied to the destination workload. You can change the graph type in Kiali to “Workload” to see workload nodes (circles). A workload is not necessarily equivalent to a pod, but it does represent the entity physically servicing the request. Also, if you prefer not to see the service nodes you can optionally remove them from the graph by unchecking the “Service Nodes” option in the Display dropdown.

You are right that the graph can make it look like a request “goes through” the service but we wanted to be able to represent that the source requests were routed to the service, and then serviced by a specific app/workload. For example, it allows us to visualize that productpage is making requests for reviews, and that those requests are in turn routed to three different versions of the reviews service.