Istio-ingressgateway pod downsizing causes 502 responses from load balancer (ALB)

These are the versions of the tools we are currently using:
Istio v1.4.3 (set up through the official Helm chart)
Kubernetes v1.15.7 (set up through kops)

We have set up a Kubernetes cluster on AWS using kops.
We are using the aws-alb-ingress-controller Helm chart to provision an ALB load balancer as our ingress into the cluster.

We terminate our SSL connections on the ALB using ACM

The istio-ingressgateway service is of type NodePort and exposes the traffic port (80) and the status port (15020).
It has externalTrafficPolicy set to Cluster so that all (5) nodes report as healthy to the ALB.
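
For reference, a trimmed-down sketch of that Service (port names and NodePort numbers are illustrative rather than our exact manifest):

    apiVersion: v1
    kind: Service
    metadata:
      name: istio-ingressgateway
      namespace: istio-system
    spec:
      type: NodePort
      # Cluster (the default) makes every node proxy traffic, so all 5 nodes
      # pass the ALB health check
      externalTrafficPolicy: Cluster
      selector:
        app: istio-ingressgateway
      ports:
      - name: http2
        port: 80
        targetPort: 80
        nodePort: 31380      # illustrative
      - name: status-port
        port: 15020
        targetPort: 15020
        nodePort: 31520      # illustrative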

Our ALB is configured to forward traffic to the istio-ingressgateway traffic port and to perform health checks on the status port (HTTP /healthz/ready).
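
For illustration, the Ingress we hand to the aws-alb-ingress-controller looks roughly like this (names, the certificate ARN and the health-check NodePort are placeholders, not our exact values):

    apiVersion: extensions/v1beta1
    kind: Ingress
    metadata:
      name: istio-ingress
      namespace: istio-system
      annotations:
        kubernetes.io/ingress.class: alb
        alb.ingress.kubernetes.io/scheme: internet-facing
        alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS": 443}]'
        # placeholder ARN; the real certificate comes from ACM
        alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:eu-west-1:123456789012:certificate/placeholder
        alb.ingress.kubernetes.io/healthcheck-path: /healthz/ready
        # with instance targets the ALB health check hits the NodePort that
        # exposes the status port (15020); 31520 matches the sketch above
        alb.ingress.kubernetes.io/healthcheck-port: "31520"
    spec:
      rules:
      - http:
          paths:
          - path: /*
            backend:
              serviceName: istio-ingressgateway
              servicePort: 80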

The setup seems to be working fine. We see all nodes as healthy in the ALB and traffic is distributed over all backends.

However, when we apply load to the system (roughly 50 r/s, randomized) by mocking long-running requests (e.g. curl https://some-service.our.domain/wait/750, which simply blocks for 750 ms) and then scale down the istio-ingressgateway deployment itself (e.g. from 3 to 2 replicas), in-flight connections are dropped and the load balancer (ALB) returns 502 responses.

I was under the impression that the istio-ingressgateway pod would handle the SIGTERM sent to it by Kubernetes correctly: the terminating pod would stop accepting new requests and be given the grace period to finish in-flight ones before being forcefully killed with SIGKILL. However, we see that the pod is killed immediately, causing the 502 responses returned from the ALB.
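
For completeness, we have not overridden the pod's grace period, so the relevant excerpt of the gateway Deployment should look roughly like this (30s is the Kubernetes default; istio-proxy is, as far as I can tell, the container name the chart generates):

    spec:
      template:
        spec:
          # not set explicitly in our manifests, so Kubernetes applies its
          # default of 30 seconds between SIGTERM and SIGKILL
          terminationGracePeriodSeconds: 30
          containers:
          - name: istio-proxy
            image: docker.io/istio/proxyv2:1.4.3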

The istio-ingressgateway pods are configured with a readinessProbe:

    readinessProbe:
      failureThreshold: 30
      httpGet:
        path: /healthz/ready
        port: 15020
        scheme: HTTP
      initialDelaySeconds: 1
      periodSeconds: 2
      successThreshold: 1
      timeoutSeconds: 1

Are we missing something in our setup that would provide the expected behaviour? Or am I misunderstanding how Istio should handle this downsizing?

Thanks in advance

The drain duration is set to 5s by default. It can be customized by setting

    env:
      TERMINATION_DRAIN_DURATION_SECONDS: 30

on the ingress gateway. After increasing the drain duration, the 502 errors vanished when scaling down the istio-ingressgateway pods.

There is a bug in the Istio Helm chart which prevents setting it from the values.yaml file. A pull request to fix the issue has been created: https://github.com/istio/istio/pull/20984
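
Until that fix is released, one way to set it is to add the environment variable directly to the gateway Deployment instead of going through values.yaml. A rough excerpt (the container name istio-proxy is what the chart generates for the gateway; adjust if yours differs):

    # applied with: kubectl -n istio-system edit deployment istio-ingressgateway
    spec:
      template:
        spec:
          containers:
          - name: istio-proxy
            env:
            - name: TERMINATION_DRAIN_DURATION_SECONDS
              value: "30"   # env values must be quoted strings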