Istio ingress request rate/concurrency limit (with queueing)

I would like to be able to throttle/queue HTTP/HTTPS requests in the Istio ingress gateway to avoid overwhelming pods with too many concurrent requests.
The way I see it: each service would have a per-pod limit (similar to the per-server `maxconn` in HAProxy) that defines how many concurrent requests a single pod can handle.
Based on that limit, when the Istio ingress receives requests it forwards them only to pods that have not yet reached their limit; if every pod is at its limit, it queues the request and waits for one of them to become available.
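The closest construct I'm aware of is the connection-pool settings in a `DestinationRule`, which cap concurrent requests and pending (queued) requests for an upstream service. A sketch, assuming a hypothetical service named `reviews` (note that these limits are enforced by each client-side Envoy per cluster, not strictly per pod, so this may not match the per-pod semantics I'm after):

```yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: reviews-conn-limit
spec:
  host: reviews.default.svc.cluster.local  # example service, adjust to yours
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 10            # cap on concurrent TCP connections
      http:
        http1MaxPendingRequests: 100  # requests allowed to queue while waiting
        http2MaxRequests: 10          # cap on concurrent HTTP/2 requests
```

Requests beyond these caps are rejected (503) rather than held in a queue, which is part of what I'm unsure about.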
This could go even further in shared-mesh setups where the same service runs in multiple clusters: if all the pods in one cluster are saturated, requests could in theory spill over across the mesh to other clusters.
So my question is: is this possible with Istio? If so, how does one define such a policy? How do you control the queue length? etc.
