How to make sure that gRPC requests are distributed evenly among Ingress Gateway pods

We use Istio Ingress Gateway to load balance our gRPC services. While requests to the gRPC backends are evenly distributed across their pods, they are not evenly distributed across the Istio Ingress Gateway pods: gRPC connections are persistent, and the ingress gateway is fronted by a Kubernetes Service, an L4 load balancer that balances connections rather than individual requests.

Normally this isn’t an issue, but under extremely high load we observe some impact on end-to-end latency.

What would be the best approach to mitigate this issue?

Client-side reconnecting is how we approached this at high load, to avoid gRPC’s long-running connections. It also worked well when auto-scaling up, since new connections can land on newly added gateway pods.
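To illustrate the client-side approach, here is a minimal Go sketch (not from this thread, just one way it could look): a small wrapper that re-dials the gRPC connection once it has been open longer than a chosen maximum age, so new RPCs can land on other gateway pods. The reconnectingClient type, the ingress.example.com:80 target, and the 10-minute cap are all placeholders.

package main

import (
	"log"
	"sync"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
)

// reconnectingClient hands out a *grpc.ClientConn and replaces it once it has
// been open longer than maxAge, so a long-lived HTTP/2 connection does not
// stay pinned to a single ingress gateway pod.
type reconnectingClient struct {
	mu       sync.Mutex
	target   string
	maxAge   time.Duration
	conn     *grpc.ClientConn
	dialedAt time.Time
}

// Conn returns the current connection, dialing a fresh one if the existing
// connection is missing or older than maxAge.
func (c *reconnectingClient) Conn() (*grpc.ClientConn, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.conn != nil && time.Since(c.dialedAt) < c.maxAge {
		return c.conn, nil
	}
	if c.conn != nil {
		// Closing aborts in-flight RPCs on the old connection; a production
		// version would drain it gracefully instead.
		c.conn.Close()
	}
	conn, err := grpc.Dial(c.target,
		grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		return nil, err
	}
	c.conn, c.dialedAt = conn, time.Now()
	return conn, nil
}

func main() {
	// Placeholder address and interval; tune maxAge to your traffic pattern.
	client := &reconnectingClient{target: "ingress.example.com:80", maxAge: 10 * time.Minute}
	conn, err := client.Conn()
	if err != nil {
		log.Fatalf("dial: %v", err)
	}
	defer conn.Close()
	// Pass conn to your generated stub here, e.g. pb.NewGreeterClient(conn).
	_ = conn
}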

I came across this question with the same ask, but I found the answer elsewhere, so I’d like to share it.

We are on Istio 1.8.

The solution I’ve adopted is from here: Set reasonable "max_connection_duration" and "drain_timeout" default in istio ingress · Issue #27280 · istio/istio · GitHub

There, an EnvoyFilter is used to have the ingressgateway Envoy containers set a max_connection_duration for downstream connections.
More options you can set are documented here:
https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/core/v3/protocol.proto.html#envoy-api-msg-core-httpprotocoloptions

Here is the EnvoyFilter we currently have, which sets a maximum duration of 10 minutes for each connection:

apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: ingress-grpc-disconnect-filter
  namespace: istio-system
spec:
  # Apply only to the ingress gateway; the labels must match your gateway
  # deployment's pod labels.
  workloadSelector:
    labels:
      app: istio-ingressgateway
      istio: istio-ingressgateway
  configPatches:
  - applyTo: NETWORK_FILTER
    match:
      listener:
        filterChain:
          filter:
            # Patch the HTTP connection manager on the gateway listeners.
            name: "envoy.filters.network.http_connection_manager"
    patch:
      operation: MERGE
      value:
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          common_http_protocol_options:
            # Close each downstream connection after 10 minutes, forcing
            # clients to reconnect and be re-balanced across gateway pods.
            max_connection_duration: 600s

Hope this helps whoever needs it!