Rate Limiting kicks in even before the quota limits are reached

The handler configuration (shown below) has maxAmount set to 50 and validDuration set to 1s for my service.

With this setting, I would expect rate limiting to kick in only when there are more than 50 requests/s. However, I am seeing it kick in at just 15 requests/s. I am running in a minikube configuration.

Any idea why the behaviour is not as expected?

apiVersion: config.istio.io/v1alpha2
kind: handler
metadata:
  name: quotahandler
  namespace: istio-system
spec:
  compiledAdapter: memquota
  params:
    quotas:
    - name: requestcountquota.instance.istio-system
      maxAmount: 500
      validDuration: 1s
      # The first matching override is applied.
      # A requestcount instance is checked against override dimensions.
      overrides:
      # The following override applies to 'reviews' regardless
      # of the source.
      - dimensions:
          destination: helidon-quickstart-mp
        maxAmount: 50
        validDuration: 1s
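
To make the expectation concrete, here is a minimal sketch of an idealised sliding-window limiter using the override's values (50 per 1s). It is not memquota's implementation; the class and function names are made up for illustration, and requests are assumed to be perfectly evenly spaced. Under those assumptions, 15 requests/s should never be rejected.

```python
from collections import deque


class SlidingWindowLimiter:
    """Idealised limiter: at most max_amount admissions in any window.

    Hypothetical helper for illustration only; not memquota's algorithm.
    """

    def __init__(self, max_amount, valid_duration_ms):
        self.max_amount = max_amount
        self.valid_duration_ms = valid_duration_ms
        self.admitted = deque()  # timestamps (ms) of admissions still in the window

    def allow(self, now_ms):
        # Drop admissions that have aged out of the window.
        while self.admitted and self.admitted[0] <= now_ms - self.valid_duration_ms:
            self.admitted.popleft()
        if len(self.admitted) < self.max_amount:
            self.admitted.append(now_ms)
            return True
        return False  # this is where a 429 would be returned


def count_rejections(rate_per_s, seconds, max_amount=50, valid_duration_ms=1000):
    """Send evenly spaced requests at rate_per_s and count the rejections."""
    limiter = SlidingWindowLimiter(max_amount, valid_duration_ms)
    rejected = 0
    for i in range(rate_per_s * seconds):
        now_ms = i * 1000 // rate_per_s  # integer milliseconds, evenly spaced
        if not limiter.allow(now_ms):
            rejected += 1
    return rejected


print(count_rejections(15, 10))   # prints 0: 15 req/s is well under 50 per 1s
print(count_rejections(100, 10))  # non-zero: 100 req/s exceeds 50 per 1s
```

Any 429s seen at 15 requests/s therefore point at something outside this simple model, such as how the quota is allocated or how the traffic actually arrives.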

@Ram: are you still seeing this? Also, how many Istio policy pods do you have when you see this?
It could be that these pods are prefetching tokens, which would explain what you are seeing…
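
As a rough illustration of the prefetching idea, here is a toy model and not a description of how Istio actually works: several hypothetical token holders each reserve a batch of tokens from a shared quota of 50, and tokens reserved by a mostly idle holder are unavailable to the busy one. The batch size of 10, the number of holders (4), and all class names are assumptions chosen for the example.

```python
class SharedQuota:
    """Hypothetical central quota for a single window."""

    def __init__(self, max_amount):
        self.remaining = max_amount

    def grant(self, wanted):
        # Hand out up to `wanted` tokens, limited by what is left.
        granted = min(wanted, self.remaining)
        self.remaining -= granted
        return granted


class PrefetchingClient:
    """Hypothetical holder that reserves tokens in batches (assumed behaviour)."""

    def __init__(self, quota, batch):
        self.quota = quota
        self.batch = batch
        self.local_tokens = 0

    def allow(self):
        if self.local_tokens == 0:
            self.local_tokens = self.quota.grant(self.batch)
        if self.local_tokens == 0:
            return False  # nothing left to reserve: request is rejected
        self.local_tokens -= 1
        return True


def admitted_before_first_rejection(max_amount=50, n_clients=4, batch=10):
    quota = SharedQuota(max_amount)
    clients = [PrefetchingClient(quota, batch) for _ in range(n_clients)]
    admitted = 0
    # One request reaches each of the other holders: each reserves a full
    # batch but spends only one token.
    for client in clients[1:]:
        if client.allow():
            admitted += 1
    # All remaining traffic lands on the first holder.
    while clients[0].allow():
        admitted += 1
    return admitted


print(admitted_before_first_rejection())  # 23
```

In this toy setup the first rejection arrives after 23 admitted requests rather than 50, because 27 reserved tokens sit unused. Whether anything like this applies to a default minikube install depends on how many components are actually reserving tokens.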

@gargnupur: I am running with the defaults on minikube. I have not autoscaled or manually added any policy pods. I did notice that when I set a larger sliding window, Istio does a better job of rate limiting.

Ex: Rate limiting set to 550/10s results in a lower percentage of 429s than 55/s (55/s results in twice as many 429s).

The part I am still not clear on is why 550/10s is better than 55/s.
I am throttling the load at 58/s so that I avoid spikes in requests.
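
One way to reason about this is to compare what a simple limiter would reject under the two settings. The sketch below uses a fixed (non-sliding) window, which is a simplification and not necessarily what memquota does, and the "uneven" load pattern is invented for illustration. With perfectly even traffic at 58/s both settings reject the same share; if the traffic is uneven from second to second, the 1s window rejects more because unused capacity in a quiet second cannot be carried over.

```python
def rejected_fraction(per_second_counts, max_amount, window_s):
    """Fraction of requests rejected by a simplified fixed-window limiter.

    per_second_counts[i] is the number of requests arriving in second i.
    """
    total = sum(per_second_counts)
    rejected = 0
    for start in range(0, len(per_second_counts), window_s):
        in_window = sum(per_second_counts[start:start + window_s])
        rejected += max(0, in_window - max_amount)
    return rejected / total


steady = [58] * 20      # perfectly even 58 req/s for 20 seconds
uneven = [40, 76] * 10  # same 58 req/s average, alternating quiet and busy seconds

for name, load in (("steady", steady), ("uneven", uneven)):
    print(name,
          round(rejected_fraction(load, 55, 1), 4),     # 55 per 1s
          round(rejected_fraction(load, 550, 10), 4))   # 550 per 10s
```

For the steady load both settings reject 60 of 1160 requests (about 5.2%). For the uneven load, 55/s rejects 210 of 1160 (about 18.1%) while 550/10s still rejects about 5.2%. So even a load that averages 58/s can see more 429s with the shorter window if it is not perfectly even at one-second granularity.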