Hi there,
We are using Istio as part of a gRPC microservices architecture.
We found that every 5 minutes, a gRPC connection that a client service has established to an upstream service gets closed, as shown below:
INFO: 2019/03/14 14:51:42 transport: loopyWriter.run returning. connection error: desc = "transport is closing"
INFO: 2019/03/14 14:51:42 base.baseBalancer: handle SubConn state change: 0xc0001607b0, TRANSIENT_FAILURE
INFO: 2019/03/14 14:51:44 base.baseBalancer: handle SubConn state change: 0xc0001607b0, CONNECTING
INFO: 2019/03/14 14:51:44 roundrobinPicker: newPicker called with readySCs: map
INFO: 2019/03/14 14:51:44 base.baseBalancer: handle SubConn state change: 0xc0001607b0, READY
INFO: 2019/03/14 14:51:44 roundrobinPicker: newPicker called with readySCs: map[{xxxxxxxx:50000 0 }:0xc0001607b0]
INFO: 2019/03/14 14:56:42 transport: loopyWriter.run returning. connection error: desc = "transport is closing"
INFO: 2019/03/14 14:56:42 pickfirstBalancer: HandleSubConnStateChange: 0xc000160b40, TRANSIENT_FAILURE
INFO: 2019/03/14 14:56:42 transport: loopyWriter.run returning. connection error: desc = "transport is closing"
INFO: 2019/03/14 14:56:42 base.baseBalancer: handle SubConn state change: 0xc0001607b0, TRANSIENT_FAILURE
INFO: 2019/03/14 14:56:43 pickfirstBalancer: HandleSubConnStateChange: 0xc000160b40, CONNECTING
INFO: 2019/03/14 14:56:43 pickfirstBalancer: HandleSubConnStateChange: 0xc000160b40, READY
INFO: 2019/03/14 14:56:44 base.baseBalancer: handle SubConn state change: 0xc0001607b0, CONNECTING
INFO: 2019/03/14 14:56:44 roundrobinPicker: newPicker called with readySCs: map[]
INFO: 2019/03/14 14:56:44 base.baseBalancer: handle SubConn state change: 0xc0001607b0, READY
INFO: 2019/03/14 14:56:44 roundrobinPicker: newPicker called with readySCs: map[{xxxxxxxx:50000 0 }:0xc0001607b0]
We noticed on the sidecar container (istio-proxy) that, at the same time, there is a proxy reconfiguration that removes all listeners, closes the connections, and creates them again:
[2019-03-14 14:56:41.965][17][warning][config] [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:102] gRPC config stream closed: 13,
[2019-03-14 14:56:42.013][17][warning][misc] [external/envoy/source/common/protobuf/utility.cc:129] Using deprecated option 'envoy.api.v2.Listener.use_original_dst'. This configuration will be removed from Envoy soon. Please see https://github.com/envoyproxy/envoy/blob/0286b82280d33d39b5fec7dbc6be34d2fc943c7d/DEPRECATED.md for details.
…
[2019-03-14 14:56:42.038][17][debug][main] [external/envoy/source/server/server.cc:143] flushing stats
[2019-03-14 14:56:42.185][17][debug][upstream] [external/envoy/source/common/upstream/original_dst_cluster.cc:163] Cleaning up stale original dst hosts.
… removing listener …
[2019-03-14 14:56:42.412][17][debug][config] [external/envoy/source/server/listener_manager_impl.cc:617] removing listener: name=172.30.193.179_5432, hash=2680903410418478357, address=172.30.193.179:5432
[2019-03-14 14:56:42.412][17][debug][config] [external/envoy/source/server/listener_manager_impl.cc:617] removing listener: name=172.30.198.57_15011, hash=5351683613128278624, address=172.30.198.57:15011
… listener removal complete …
[2019-03-14 14:56:42.413][17][debug][config] [external/envoy/source/server/listener_manager_impl.cc:617] listener removal complete: name=172.30.243.191_443, hash=5038682578597857908, address=172.30.243.191:443
[2019-03-14 14:56:42.413][17][debug][config] [external/envoy/source/server/listener_manager_impl.cc:617] listener removal complete: name=172.30.168.120_14267, hash=235074651536250116, address=172.30.168.120:14267
[2019-03-14 14:56:42.413][23][debug][connection] [external/envoy/source/common/network/connection_impl.cc:101] [C80] closing data_to_write=0 type=1
[2019-03-14 14:56:42.413][23][debug][connection] [external/envoy/source/common/network/connection_impl.cc:183] [C80] closing socket: 1
[2019-03-14 14:56:42.413][23][debug][main] [external/envoy/source/server/connection_handler_impl.cc:68] [C80] adding to cleanup list
[2019-03-14 14:56:42.413][23][debug][connection] [external/envoy/source/common/network/connection_impl.cc:101] [C81] closing data_to_write=0 type=1
[2019-03-14 14:56:42.413][23][debug][connection] [external/envoy/source/common/network/connection_impl.cc:183] [C81] closing socket: 1
[2019-03-14 14:56:42.413][23][debug][main] [external/envoy/source/server/connection_handler_impl.cc:68] [C81] adding to cleanup list
[2019-03-14 14:56:42.413][17][debug][config] [external/envoy/source/server/listener_manager_impl.cc:617] listener removal complete: name=172.30.0.1_443, hash=16711819288184588908, address=172.30.0.1:443
[2019-03-14 14:56:42.413][17][debug][config] [external/envoy/source/server/listener_manager_impl.cc:617] listener removal complete: name=0.0.0.0_15014, hash=8249298906339225086, address=0.0.0.0:15014
…
[2019-03-14 14:56:42.415][17][debug][config] [external/envoy/source/server/listener_manager_impl.cc:617] listener removal complete: name=10.0.0.19_3128, hash=15935048846062630651, address=10.0.0.19:3128
[2019-03-14 14:56:42.415][17][debug][config] [external/envoy/source/server/listener_manager_impl.cc:617] listener removal complete: name=172.30.168.120_14268, hash=2359658510866416823, address=172.30.168.120:14268
[2019-03-14 14:56:42.801][17][debug][upstream] [external/envoy/source/common/upstream/upstream_impl.cc:1307] starting async DNS resolution for zipkin.istio-system
[2019-03-14 14:56:42.802][17][debug][upstream] [external/envoy/source/common/network/dns_impl.cc:158] Setting DNS resolution timer for 5000 milliseconds
[2019-03-14 14:56:42.802][17][debug][upstream] [external/envoy/source/common/network/dns_impl.cc:158] Setting DNS resolution timer for 5000 milliseconds
[2019-03-14 14:56:42.802][17][debug][upstream] [external/envoy/source/common/upstream/upstream_impl.cc:1307] starting async DNS resolution for istio-pilot.istio-system
[2019-03-14 14:56:42.802][17][debug][upstream] [external/envoy/source/common/network/dns_impl.cc:158] Setting DNS resolution timer for 5000 milliseconds
…
[2019-03-14 14:56:42.809][17][debug][upstream] [external/envoy/source/common/network/dns_impl.cc:158] Setting DNS resolution timer for 5000 milliseconds
[2019-03-14 14:56:42.810][17][debug][upstream] [external/envoy/source/common/upstream/upstream_impl.cc:1314] async DNS resolution complete for zipkin.istio-system
[2019-03-14 14:56:42.810][17][debug][upstream] [external/envoy/source/common/network/dns_impl.cc:158] Setting DNS resolution timer for 5000 milliseconds
[2019-03-14 14:56:42.810][17][debug][upstream] [external/envoy/source/common/upstream/upstream_impl.cc:1314] async DNS resolution complete for istio-pilot.istio-system
[2019-03-14 14:56:43.847][23][debug][main] [external/envoy/source/server/connection_handler_impl.cc:257] [C82] new connection
[2019-03-14 14:56:43.848][23][debug][http2] [external/envoy/source/common/http/http2/codec_impl.cc:721] [C82] setting stream-level initial window size to 268435456
[2019-03-14 14:56:43.848][23][debug][http2] [external/envoy/source/common/http/http2/codec_impl.cc:743] [C82] updating connection-level initial window size to 268435456
[2019-03-14 14:56:43.879][17][debug][admin] [external/envoy/source/server/http/admin.cc:1021] [C1][S8552626438515267892] request complete: path: /listeners
[2019-03-14 14:56:43.879][17][debug][admin] [external/envoy/source/server/http/admin.cc:1021] [C1][S14631235835349617749] request complete: path: /stats?usedonly
[2019-03-14 14:56:44.037][23][debug][main] [external/envoy/source/server/connection_handler_impl.cc:257] [C83] new connection
[2019-03-14 14:56:44.037][23][debug][http2] [external/envoy/source/common/http/http2/codec_impl.cc:721] [C83] setting stream-level initial window size to 268435456
[2019-03-14 14:56:44.037][23][debug][http2] [external/envoy/source/common/http/http2/codec_impl.cc:743] [C83] updating connection-level initial window size to 268435456
Is that the expected behavior?
We have tested the same apps without Istio, and the connection stays established indefinitely:
INFO: 2019/03/14 16:03:18 ccResolverWrapper: sending new addresses to cc: [{xxxxxxxxx:50000 0 }]
INFO: 2019/03/14 16:03:18 base.baseBalancer: got new resolved addresses: [{xxxxxxxxx:50000 0 }]
INFO: 2019/03/14 16:03:18 base.baseBalancer: handle SubConn state change: 0xc00011a7d0, CONNECTING
INFO: 2019/03/14 16:03:19 base.baseBalancer: handle SubConn state change: 0xc00011a7d0, READY
How can we prevent the connection from being closed every 'n' minutes?
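One thing we were planning to try is a DestinationRule with TCP keepalive on the connection pool (field names taken from the Istio 1.1 networking reference; the host is a placeholder for our real service, and we have not confirmed this helps, since the drops look driven by config pushes rather than idle timeouts):

```yaml
# Hypothetical DestinationRule for the upstream gRPC service.
# tcpKeepalive only tunes kernel-level keepalive probes, so it may not
# prevent resets caused by Pilot pushing new listener configuration.
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: grpc-upstream-keepalive
spec:
  host: my-upstream.default.svc.cluster.local  # placeholder host
  trafficPolicy:
    connectionPool:
      tcp:
        tcpKeepalive:
          time: 60s
          interval: 20s
          probes: 3
```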
Thanks,
David