Istio reset with AWS NLB and Slack access


I’m seeing a strange behavior that I’m unable to pinpoint. I have an EKS cluster using an NLB, with Istio deployed as a DaemonSet. Sometimes, when a workload calls itself as if it were an external application (NOT using .svc.cluster.local), the connection is closed prematurely.

I can observe a relatively high number of target RSTs on the NLB side.

Trying to pinpoint it, I used ksniff and can confirm that the request stops early and never reaches the application. There is no restart of any container in the pod. Multiple requests in a row can return 200, and then my curl test sometimes ends with the following:

> User-Agent: curl/7.78.0
> Accept: */*
* TLSv1.2 (IN), TLS alert, close notify (256):
* Empty reply from server
* Closing connection 0
* TLSv1.2 (OUT), TLS alert, close notify (256):
curl: (52) Empty reply from server

I wasn’t able to find a correlation between any Istio configuration option and this premature closure of the TCP session. The delay between start and close is always 5s, which led me to experiment with tcpKeepalive, without success.
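
To put numbers on the failure rate and on the 5s delay, the curl test could be looped with something like the Python sketch below. It is only an illustration under assumptions: `fetch_status` and `probe` are hypothetical helper names, every request opens a fresh connection (as separate curl invocations would), and the measured time covers the whole request including the TLS handshake, so it only approximates the delay seen on the wire.

```python
import time
import urllib.error
import urllib.request
from collections import defaultdict


def fetch_status(url, timeout=10.0):
    """Open a fresh connection, GET the URL, and return the HTTP status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()
            return resp.status
    except urllib.error.HTTPError as err:
        # A non-2xx answer is still a reply from the server, so report its code.
        return err.code


def probe(fetch, attempts, clock=time.monotonic):
    """Call fetch() `attempts` times and group elapsed seconds by outcome.

    The outcome is the status code as a string, or the exception class name
    when the call fails (an empty reply typically raises RemoteDisconnected).
    """
    timings = defaultdict(list)
    for _ in range(attempts):
        start = clock()
        try:
            outcome = str(fetch())
        except Exception as exc:
            outcome = type(exc).__name__
        timings[outcome].append(clock() - start)
    return dict(timings)
```

Calling `probe(lambda: fetch_status(url), attempts=200)`, with `url` set to the external hostname served through the NLB, returns a dict mapping each outcome to its list of durations. If the failures really cluster around 5s while the 200 responses are fast, that would point at a fixed timeout somewhere on the path rather than at random resets.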

On a side note: I’ve tried to join Istio’s Slack, but the invitation link turns out to be outdated.

I guess it would be easier to debug this further over Slack, but I’d be happy to provide anything useful on the subject here. I’m just unsure what to provide without making a wall of text.