Connection Pool TLS External Service

We have a downstream server that runs Node v8 and connects to mesh-external hosts via TLS. We are using the keep-alive (KA) agent in Node with a 60s initial delay (keepAliveMsecs). We are trying to figure out how to maintain an HTTP keep-alive connection, so that we avoid the overhead of opening new TLS sockets, with Istio in the mix.

Our hope was that it would be transparent: we could remove the KA agent from our code and shove the responsibility for connection pooling onto the proxy. We have found that this might not be possible, because Envoy does not pool TLS TCP sockets the way it pools HTTP TCP sockets.

We are running Istio 1.4.6.

Here is what we have tried and noticed:


No ServiceEntry (se)/DestinationRule (dr)/VirtualService (vs) defined:

Downstream <-- TCP KA Socket --> Envoy <-- Socket --> Upstream (external)

The defaults apply, so there is a 60min timeout on the socket, which will close the downstream TCP KA socket. At times the upstream resets the connection, and we get an ECONNRESET back on the downstream (the increased frequency of these is what prompted us to start digging into this).


se (MESH_EXTERNAL/protocol: TLS) + dr (connectionPool.http.idleTimeout < 60min):

Downstream <-- TCP KA Socket --> Envoy <-- Socket --> Upstream (external)

So this gives the same result as with no Istio components. I expected the socket to time out earlier, but it again times out after 60min, so I am not sure whether the idleTimeout is getting passed through to the Envoy TCP connection. We get the same frequency of ECONNRESET.
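
For reference, this variant could be sketched roughly as below. The host `api.example.com`, the resource name `external-api`, port 443, `resolution: DNS`, and the `30m` value are placeholders chosen for illustration, not our real settings. One possible explanation for the unchanged 60min timeout, not confirmed here, is that the `connectionPool.http` settings only take effect for traffic Envoy handles as HTTP, while a `protocol: TLS` port is proxied as plain TCP.

```yaml
# Sketch only: host, names, port, and durations are assumed placeholders.
apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: external-api            # hypothetical name
spec:
  hosts:
  - api.example.com             # hypothetical external host
  location: MESH_EXTERNAL
  ports:
  - number: 443                 # assumed port
    name: tls
    protocol: TLS
  resolution: DNS               # assumed resolution mode
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: external-api            # hypothetical name
spec:
  host: api.example.com
  trafficPolicy:
    connectionPool:
      http:
        idleTimeout: 30m        # assumed value, anything below 60min
```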


se (MESH_EXTERNAL/protocol: TLS) + dr (connectionPool.http.idleTimeout < 60min, connectionPool.tcp.tcpKeepalive.time/interval set to match the app KA agent):

Downstream <-- TCP KA Socket --> Envoy <-- TCP KA Socket --> Upstream (external)

This seems to be what we want? The idle timeout is still not respected, so the connection will time out after 60min and close the downstream socket correctly, but the KA settings should bring us back to our regular connection topology and reduce the ECONNRESET errors. We are testing this theory now.
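
A rough sketch of the DestinationRule for this variant, assuming the ServiceEntry from the previous case stays unchanged. The host `api.example.com`, the name `external-api`, and the `interval` and `idleTimeout` values are placeholders for illustration. Only `time: 60s` reflects something stated above, namely the 60s initial delay of the app KA agent.

```yaml
# Sketch only: host, name, interval, and idleTimeout are assumed placeholders.
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: external-api            # hypothetical name
spec:
  host: api.example.com         # hypothetical external host
  trafficPolicy:
    connectionPool:
      tcp:
        tcpKeepalive:
          time: 60s             # idle time before the first probe, matching keepAliveMsecs
          interval: 60s         # assumed gap between probes
      http:
        idleTimeout: 30m        # assumed value, anything below 60min
```

The `tcpKeepalive` block asks Envoy to enable TCP keep-alive probes on its upstream socket, which is what changes the Envoy-to-upstream leg in the diagram above from a plain socket to a TCP KA socket.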


We have also tried removing the app KA agent to let Envoy manage the connections, but then the connection was closed after each request, for the reason described above (Envoy does not pool the TLS sockets).

Any feedback would be appreciated on best practice in this scenario.

We have more or less the same problem.
We are running Mongo as an external service and followed the recommended setup
with

...
location: MESH_EXTERNAL

Before enabling Istio everything worked well, but for the last 11 days we have been getting
“Got socket exception” multiple times per day.
This is 100% related to the Istio setup, because we have not deployed any application changes, only switched to Istio …

Does anyone have an idea?
And yes, KeepAlive is configured in the applications as before; the only difference is that we are using
more or less the following setup