Ingress gateway connection refused on rolling restart

Hi, I was successfully using Istio 1.3.3 and have tried upgrading to 1.4.4.
I am finding now that if I curl my application URL during a rolling restart of the ingress gateway deployment, there is a period of approximately 2-3 minutes where every request returns 'connection refused'.

The new ingress gateway pod starts up, and in the logs of the new pod I see:

[Envoy (Epoch 0)] [2020-02-14 19:29:27.176][17][warning][config] [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:91] gRPC config stream closed: 14, no healthy upstream
[Envoy (Epoch 0)] [2020-02-14 19:29:27.176][17][warning][config] [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:54] Unable to establish new stream
2020-02-14T19:29:33.191249Z  info  Envoy proxy is ready

But no requests appear in the logs of the new pod for 2-3 minutes afterwards, during which I see connection refused errors in my shell. The old pod appears to have drained and terminated correctly:

2020-02-14T19:29:47.798373Z  info  Agent draining Proxy
2020-02-14T19:29:47.798427Z  info  Received new config, creating new Envoy epoch 1
2020-02-14T19:29:47.798436Z  info  waiting for epoch 0 to go live before performing a hot restart
2020-02-14T19:29:47.798563Z  info  watchFileEvents has successfully terminated
2020-02-14T19:29:47.798622Z  info  Watcher has successfully terminated
2020-02-14T19:29:47.798632Z  info  Status server has successfully terminated
2020-02-14T19:29:47.798691Z  error accept tcp [::]:15020: use of closed network connection
2020-02-14T19:29:47.799585Z  info  Graceful termination period is 5s, starting…
2020-02-14T19:29:47.799619Z  info  Epoch 1 starting
2020-02-14T19:29:47.799699Z  info  Envoy command: [-c /var/lib/istio/envoy/envoy_bootstrap_drain.json --restart-epoch 1 --drain-time-s 45 --parent-shutdown-time-s 60 --service-cluster XXX-ingressgateway --service-node router~xx.xxx.x.XXX~XXXX.svc.cluster.local --max-obj-name-len 189 --local-address-ip-version v4 --log-format [Envoy (Epoch 1)] [%Y-%m-%d %T.%e][%t][%l][%n] %v -l warning --component-log-level misc:error]
[Envoy (Epoch 0)] [2020-02-14 19:29:47.880][19][warning][main] [external/envoy/source/server/server.cc:633] shutting down admin due to child startup
[Envoy (Epoch 0)] [2020-02-14 19:29:47.880][19][warning][main] [external/envoy/source/server/server.cc:639] terminating parent process
[Envoy (Epoch 1)] [2020-02-14 19:29:47.880][36][warning][main] [external/envoy/source/server/server.cc:354] No admin address given, so no admin HTTP server started.
2020-02-14T19:29:52.799747Z  info  Graceful termination period complete, terminating remaining proxies.
2020-02-14T19:29:52.799788Z  warn  Aborting epoch 0…
2020-02-14T19:29:52.799797Z  warn  Aborting epoch 1…
2020-02-14T19:29:52.799802Z  warn  Aborted all epochs
2020-02-14T19:29:52.799807Z  info  Agent has successfully terminated

I am using the default drain settings:

- --drainDuration
- '45s' #drainDuration
- --parentShutdownDuration
- '1m0s' #parentShutdownDuration
- --connectTimeout
- '10s' #connectTimeout
- --serviceCluster
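
For context, those args sit directly on the proxy container of the ingress gateway Deployment, so that is where I would change them if I wanted to experiment. The fragment below is just a sketch to show where they live; the container name is what I believe the default install uses, and the serviceCluster value is redacted as in the logs above, so treat the names as assumptions rather than something copied verbatim from my cluster:

# Fragment of the ingress gateway Deployment spec (names assumed).
# I am still running the default durations shown above.
spec:
  template:
    spec:
      containers:
      - name: istio-proxy
        args:
        - proxy
        - router
        - --drainDuration
        - '45s' #drainDuration
        - --parentShutdownDuration
        - '1m0s' #parentShutdownDuration
        - --connectTimeout
        - '10s' #connectTimeout
        - --serviceCluster
        - XXX-ingressgateway # redacted, matches the Envoy command in the logs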

I am using the default readiness probes:

readinessProbe:
  failureThreshold: 30
  httpGet:
    path: /healthz/ready
    port: 15020
    scheme: HTTP
  initialDelaySeconds: 1
  periodSeconds: 2
  successThreshold: 1
  timeoutSeconds: 1
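
One workaround I am considering (only a sketch using plain Kubernetes rollout settings, nothing Istio-specific, and I have not confirmed it addresses the root cause) is to make the rollout more conservative so the old gateway pod stays in rotation until the new one has been passing its readiness probe for a while. These are the fields I would add or change on the existing ingress gateway Deployment; the Deployment name and the 30-second value are assumptions for illustration:

# Hypothetical rollout tuning for the ingress gateway Deployment.
# minReadySeconds delays the new pod counting as available until it has
# been ready for 30s; maxUnavailable: 0 keeps the old pod running until
# the replacement is available.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: istio-ingressgateway # name assumed from the default install
  namespace: istio-system
spec:
  minReadySeconds: 30
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  # ...rest of the existing spec unchanged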

I am quite new to Istio. Could you please point me in the right direction? I saw none of these connection refused errors during a rolling restart with 1.3.3.

Thanks


Hi there @tricky

I’m seeing the same problem intermittently during rolling updates.

Did you find a solution for this?

Regards,
Bobby