The LEAST_CONN load balancing of gRPC doesn't work

Hi,
We are using Istio 1.5 on AWS EKS to check the load balancing of gRPC.
However, the LEAST_CONN option doesn’t seem to be working properly.

How we verified it is described below.

I tried to check the LEAST_CONN option with 1 grpc-client pod and 2 grpc-server pods, as shown below.

$ kubectl get po -n au-service
NAME                                READY   STATUS    RESTARTS   AGE
client-pod-5fbdf7f84b-l9vpn   3/3     Running   109        12d
server-pod-67fb898f7c-7nlx9   3/3     Running   0          139m
server-pod-67fb898f7c-pb887   3/3     Running   0          139m

Then I applied a DestinationRule with the LEAST_CONN load balancing option and an idleTimeout of 0.01s to server-pod.

apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: server-pod
  namespace: test-namespace
spec:
  host: server-pod.au-service.svc.cluster.local
  trafficPolicy: # Apply to all ports
    loadBalancer:
      simple: LEAST_CONN # ★
    connectionPool:
      http:
        idleTimeout: 0.01s # ★

Next, I sent a “sleep request” from client-pod to server-pod to keep one connection open between client-pod and one of the server pods. (The sleep request sleeps for 120 seconds on the server pod.)

$ kubectl exec -it curl-pod -n test-namespace -- curl client-pod.test-namespace.svc.cluster.local:6565/echoPath/sleep
{"message":"response from server-pod-67fb898f7c-pb887"}  ★ This response comes back after 120 seconds.

While the “sleep request” was still in progress, I sent some simple requests (each returns in about 10 ms).
I expected all of the simple requests to be routed to the other server pod, but that wasn’t the case.

$ ./test.sh ※ Loops simple requests in a single thread.
{"message":"Response from server-pod-67fb898f7c-7nlx9"}
{"message":"Response from server-pod-67fb898f7c-pb887"} ★
{"message":"Response from server-pod-67fb898f7c-7nlx9"}
{"message":"Response from server-pod-67fb898f7c-7nlx9"}
{"message":"Response from server-pod-67fb898f7c-7nlx9"}
{"message":"Response from server-pod-67fb898f7c-7nlx9"}
{"message":"Response from server-pod-67fb898f7c-7nlx9"}
{"message":"Response from server-pod-67fb898f7c-7nlx9"}
{"message":"Response from server-pod-67fb898f7c-pb887"} ★
{"message":"Response from server-pod-67fb898f7c-7nlx9"}
{"message":"Response from server-pod-67fb898f7c-7nlx9"}
{"message":"Response from server-pod-67fb898f7c-7nlx9"}
{"message":"Response from server-pod-67fb898f7c-pb887"} ★
{"message":"Response from server-pod-67fb898f7c-7nlx9"}
{"message":"Response from server-pod-67fb898f7c-7nlx9"}
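
To put a number on the split, the responses above can be tallied per pod. This is only a minimal sketch: the `tally` helper and the way the 15 lines are rebuilt in `order` are my own illustration (test.sh does not do this), and it assumes the pod name is the last word of each "message" value.

```python
import json
from collections import Counter

def tally(lines):
    """Count how many responses each server pod produced."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        message = json.loads(line)["message"]
        # Assumption: the pod name is the last whitespace-separated token.
        counts[message.split()[-1]] += 1
    return counts

# The 15 responses from the run above, in order (★ markers removed).
busy = "server-pod-67fb898f7c-pb887"   # pod holding the sleep request
idle = "server-pod-67fb898f7c-7nlx9"
order = [idle, busy] + [idle] * 6 + [busy] + [idle] * 3 + [busy] + [idle] * 2
lines = [json.dumps({"message": f"Response from {pod}"}) for pod in order]

counts = tally(lines)
total = sum(counts.values())
for pod, n in counts.most_common():
    print(f"{pod}: {n}/{total} ({n / total:.0%})")
```

In this run, 3 of the 15 simple requests went to the pod that was still handling the sleep request, and 12 went to the other pod.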

This behavior is the same for HTTP 1.1.

Does anyone know why?
How can I check the LEAST_CONN option with gRPC?
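
One thing that may be worth ruling out first: as far as I understand the Envoy documentation, Istio's LEAST_CONN maps to Envoy's least-request balancer, which for equally weighted endpoints does not scan every endpoint but samples two at random and takes the one with fewer active requests. The sketch below simulates that idea for the two-pod setup in this thread. It is not Envoy's code; `pick_least_request` and `simulate` are hypothetical names, and sampling with replacement (so both samples can be the same pod) is an assumption on my part.

```python
import random

def pick_least_request(active, choices=2, rng=random):
    """Sample `choices` endpoints at random (with replacement) and
    return the sampled endpoint with the fewest active requests."""
    names = list(active)
    sampled = [rng.choice(names) for _ in range(choices)]
    return min(sampled, key=lambda name: active[name])

def simulate(requests=100_000, seed=0):
    """Send short requests one at a time while one pod holds a long request."""
    rng = random.Random(seed)
    # "busy" is handling the 120 s sleep request; "idle" has nothing in flight.
    # Each short request finishes before the next starts (single-threaded loop),
    # so the active counts stay the same for every pick.
    active = {"busy": 1, "idle": 0}
    hits = {"busy": 0, "idle": 0}
    for _ in range(requests):
        hits[pick_least_request(active, rng=rng)] += 1
    return hits

hits = simulate()
share = hits["busy"] / sum(hits.values())
print(f"share of short requests sent to the busy pod: {share:.1%}")
```

Under these assumptions the busy pod is chosen only when both random samples land on it, which happens about a quarter of the time with two pods. If that model is right, a minority of simple requests reaching the busy pod would be expected rather than a sign that the option is ignored, and the effect should become clearer with more server pods or more in-flight requests.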

Hi, I have a similar problem with LEAST_CONN. Do you have any ideas on how to verify LEAST_CONN?