Managing load concurrency and queueing across multiple pods with Istio CircuitBreaker

Hi, I have an HTTP/2 service that is called from outside the Kubernetes cluster.

  1. How do I ensure each instance of my service serves at most 'x' requests in parallel and queues up to 'y' requests? As requests come in, they should be distributed to the instance with the lowest load (taking queue size into account too). I am looking at using the Istio circuit breaker HTTP connection pool settings.


Is this the correct approach to achieve what I want?
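
For reference, this is roughly the kind of DestinationRule I had in mind (the service name and the numbers are placeholders; my understanding of which field maps to 'x' and which to 'y' may be off):

```yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: my-http2-service            # placeholder name
spec:
  host: my-http2-service.default.svc.cluster.local
  trafficPolicy:
    loadBalancer:
      simple: LEAST_REQUEST         # prefer the endpoint with the fewest outstanding requests
    connectionPool:
      http:
        http2MaxRequests: 10        # 'x': max concurrent requests allowed to the service
        http1MaxPendingRequests: 20 # 'y': max queued (pending) requests; despite the
                                    # name, Envoy applies this to HTTP/2 traffic as well
```

One thing I'm unsure about: as far as I can tell, these connection pool limits are enforced by each client-side Envoy proxy for the destination service as a whole, not as a strict per-pod cap, so I'd like to confirm whether this actually gives per-instance limits.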