We use the Istio Ingress Gateway to load balance our gRPC services. While requests to the gRPC backends are evenly distributed across the backend pods, they are not evenly distributed across the Istio Ingress Gateway pods, because gRPC connections are persistent and the ingress gateway is exposed through a Kubernetes Service, an L4 load balancer that balances connections rather than requests.
Normally this isn't an issue, but under extremely high load we observe some impact on end-to-end latency.
What would be the best approach to mitigate this issue?
Client-side reconnecting is how we approached this at high load, to keep gRPC's long-lived connections from pinning traffic to a single gateway pod. It also worked well when auto-scaling up. A rough sketch of the idea follows.
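A minimal sketch in Go, assuming a plain grpc-go client with an insecure channel; the package, type, and function names and the rotation interval are placeholders I made up, not anything Istio or gRPC provides. The client re-dials on a timer and swaps in the fresh connection, so each new connection can be routed to a different ingress gateway pod by the L4 load balancer.

package grpcreconnect

import (
	"log"
	"sync"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
)

// rotatingConn keeps a gRPC client connection and replaces it periodically,
// so new connections can be balanced onto other ingress gateway pods.
type rotatingConn struct {
	mu     sync.RWMutex
	target string
	conn   *grpc.ClientConn
}

func dial(target string) (*grpc.ClientConn, error) {
	// Placeholder dial options; use your real transport credentials in practice.
	return grpc.Dial(target, grpc.WithTransportCredentials(insecure.NewCredentials()))
}

// NewRotatingConn dials the target and starts a goroutine that rotates the
// connection every maxAge.
func NewRotatingConn(target string, maxAge time.Duration) (*rotatingConn, error) {
	conn, err := dial(target)
	if err != nil {
		return nil, err
	}
	r := &rotatingConn{target: target, conn: conn}
	go r.rotate(maxAge)
	return r, nil
}

// rotate dials a new connection every maxAge and closes the old one.
// A production version should also drain in-flight RPCs before closing.
func (r *rotatingConn) rotate(maxAge time.Duration) {
	for range time.Tick(maxAge) {
		newConn, err := dial(r.target)
		if err != nil {
			log.Printf("reconnect failed, keeping current connection: %v", err)
			continue
		}
		r.mu.Lock()
		old := r.conn
		r.conn = newConn
		r.mu.Unlock()
		old.Close()
	}
}

// Conn returns the current connection; pass it to your generated service stubs.
func (r *rotatingConn) Conn() *grpc.ClientConn {
	r.mu.RLock()
	defer r.mu.RUnlock()
	return r.conn
}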
I came across this question with the same ask, but I had found the answer elsewhere, so I'd like to share.
We are on Istio 1.8.
The solution I’ve adopted is from here: Set reasonable "max_connection_duration" and "drain_timeout" default in istio ingress · Issue #27280 · istio/istio · GitHub
We use an EnvoyFilter to have the ingress gateway's Envoy containers set a max_connection_duration for downstream connections.
More options you can select are here:
https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/core/v3/protocol.proto.html#envoy-api-msg-core-httpprotocoloptions
And here is the EnvoyFilter we currently have, which caps each downstream connection at 10 minutes. When the duration is reached, Envoy drains the connection (for HTTP/2 it sends a GOAWAY), so the client reconnects and the new connection can be balanced onto a different gateway pod.
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: ingress-grpc-disconnect-filter
  namespace: istio-system
spec:
  workloadSelector:
    labels:
      app: istio-ingressgateway
      istio: istio-ingressgateway
  configPatches:
    - applyTo: NETWORK_FILTER
      match:
        listener:
          filterChain:
            filter:
              name: "envoy.filters.network.http_connection_manager"
      patch:
        operation: MERGE
        value:
          typed_config:
            "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
            common_http_protocol_options:
              max_connection_duration: 600s
Hope this helps whoever needs it!