Running without galley causes istio-policy and istio-telemetry CrashLoopBackOff [SOLVED]

I have been putting a lot of effort into getting Auth0 validation to work.

Some responses to people having auth issues suggested disabling the galley service because of what appear to be issues with galley performing validation and silently failing.

To try to fix my issues, I used the helm install options to disable galley, and I can see that the pod disappears after a helm install on a fresh AWS kops cluster. However, the istio-policy and istio-telemetry pods fail as a result.

Both fail in similar ways, and I’m not sure whether this is kosher, given that there are instructions around for doing this explicitly.


kubectl get pods -n istio-system
NAME                                      READY   STATUS             RESTARTS   AGE
istio-citadel-5555fbbd6c-c6mfj            1/1     Running            0          15m
istio-ingressgateway-54497b5849-wplkh     0/1     Running            0          15m
istio-init-crd-10-dt4xp                   0/1     Completed          0          16m
istio-init-crd-11-mnvql                   0/1     Completed          0          16m
istio-pilot-c947b6bcb-l82kh               1/2     Running            0          15m
istio-policy-dddf49987-nmzs7              1/2     CrashLoopBackOff   8          15m
istio-sidecar-injector-5cfc4c8f74-qdwd7   1/1     Running            0          15m
istio-telemetry-5bdf95bc6f-gssr7          1/2     CrashLoopBackOff   9          15m
kiali-66d74fc6cc-w72sw                    1/1     Running            0          15m
prometheus-7d9fb4b69c-7zmrv               1/1     Running            0          15m

2019-06-15T16:22:47.533029Z     info    PilotSAN []string(nil)
2019-06-15T16:22:47.533086Z     info    Starting proxy agent
2019-06-15T16:22:47.533295Z     info    watching /etc/certs for changes
2019-06-15T16:22:47.533321Z     info    Received new config, resetting budget
2019-06-15T16:22:47.533332Z     info    Reconciling retry (budget 10)
2019-06-15T16:22:47.533366Z     info    Epoch 0 starting
2019-06-15T16:22:47.533422Z     info    Envoy command: [-c /etc/istio/proxy/envoy.yaml --restart-epoch 0 --drain-time-s 45 --parent-shutdown-time-s 60 --service-cluster istio-telemetry --service-node sidecar~100.96.1.11~istio-telemetry-5bdf95bc6f-gssr7.istio-system~istio-system.svc.cluster.local --max-obj-name-len 189 --allow-unknown-fields -l warning]
[2019-06-15 16:22:47.552][18][warning][misc] [external/envoy/source/common/protobuf/utility.cc:174] Using deprecated option 'envoy.api.v2.Cluster.hosts' from file cds.proto. This configuration will be removed from Envoy soon. Please see https://www.envoyproxy.io/docs/envoy/latest/intro/deprecated for details.
[2019-06-15 16:22:47.552][18][warning][misc] [external/envoy/source/common/protobuf/utility.cc:174] Using deprecated option 'envoy.api.v2.Cluster.hosts' from file cds.proto. This configuration will be removed from Envoy soon. Please see https://www.envoyproxy.io/docs/envoy/latest/intro/deprecated for details.
2019-06-15T16:22:58.147285Z     info    watchFileEvents: "/etc/certs/..2019_06_15_16_22_58.873596552": CREATE
2019-06-15T16:22:58.147338Z     info    watchFileEvents: "/etc/certs/..2019_06_15_16_22_58.873596552": MODIFY|ATTRIB
2019-06-15T16:22:58.147348Z     info    watchFileEvents: "/etc/certs/key.pem": CREATE
2019-06-15T16:22:58.147442Z     info    watchFileEvents: "/etc/certs/root-cert.pem": CREATE
2019-06-15T16:22:58.147480Z     info    watchFileEvents: "/etc/certs/cert-chain.pem": CREATE
2019-06-15T16:22:58.147505Z     info    watchFileEvents: "/etc/certs/..data_tmp": RENAME
2019-06-15T16:22:58.147533Z     info    watchFileEvents: "/etc/certs/..data": CREATE
2019-06-15T16:22:58.147546Z     info    watchFileEvents: "/etc/certs/..2019_06_15_16_22_22.825094329": DELETE
2019-06-15T16:23:08.147511Z     info    watchFileEvents: notifying
2019-06-15T16:23:08.147837Z     info    Received new config, resetting budget
2019-06-15T16:23:08.147852Z     info    Reconciling retry (budget 10)
2019-06-15T16:23:08.147908Z     info    Epoch 1 starting
2019-06-15T16:23:08.147980Z     info    Envoy command: [-c /etc/istio/proxy/envoy.yaml --restart-epoch 1 --drain-time-s 45 --parent-shutdown-time-s 60 --service-cluster istio-telemetry --service-node sidecar~100.96.1.11~istio-telemetry-5bdf95bc6f-gssr7.istio-system~istio-system.svc.cluster.local --max-obj-name-len 189 --allow-unknown-fields -l warning]
[2019-06-15 16:23:08.166][60][warning][misc] [external/envoy/source/common/protobuf/utility.cc:174] Using deprecated option 'envoy.api.v2.Cluster.hosts' from file cds.proto. This configuration will be removed from Envoy soon. Please see https://www.envoyproxy.io/docs/envoy/latest/intro/deprecated for details.
[2019-06-15 16:23:08.166][60][warning][misc] [external/envoy/source/common/protobuf/utility.cc:174] Using deprecated option 'envoy.api.v2.Cluster.hosts' from file cds.proto. This configuration will be removed from Envoy soon. Please see https://www.envoyproxy.io/docs/envoy/latest/intro/deprecated for details.
[2019-06-15 16:23:08.166][18][warning][main] [external/envoy/source/server/server.cc:536] shutting down admin due to child startup
[2019-06-15 16:23:08.166][18][warning][main] [external/envoy/source/server/server.cc:544] terminating parent process
[2019-06-15 16:24:08.176][18][warning][main] [external/envoy/source/server/server.cc:425] caught SIGTERM
2019-06-15T16:24:08.267936Z     info    Epoch 0 exited normally
2019-06-15T16:24:08.268006Z     warn    Failed to delete config file /etc/istio/proxy/envoy-rev0.json for 0, remove /etc/istio/proxy/envoy-rev0.json: no such file or directory


2019-06-15T16:46:27.121582Z info pickfirstBalancer: HandleSubConnStateChange: 0xc420455680, CONNECTING
2019-06-15T16:46:27.124965Z info grpc: addrConn.createTransport failed to connect to {istio-galley.istio-system.svc:9901 0 }. Err :connection error: desc = "transport: Error while dialing dial tcp: lookup istio-galley.istio-system.svc on 100.64.0.10:53: no such host". Reconnecting...
2019-06-15T16:46:27.125014Z info pickfirstBalancer: HandleSubConnStateChange: 0xc420455680, TRANSIENT_FAILURE
2019-06-15T16:46:27.662624Z info mcp (re)trying to establish new MCP sink stream
2019-06-15T16:46:27.662689Z error mcp Failed to create a new MCP sink stream: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp: lookup istio-galley.istio-system.svc on 100.64.0.10:53: no such host"
2019-06-15T16:46:28.662884Z info mcp (re)trying to establish new MCP sink stream
2019-06-15T16:46:28.662943Z error mcp Failed to create a new MCP sink stream: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp: lookup istio-galley.istio-system.svc on 100.64.0.10:53: no such host"
2019-06-15T16:46:29.663123Z info mcp (re)trying to establish new MCP sink stream
2019-06-15T16:46:29.663188Z error mcp Failed to create a new MCP sink stream: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp: lookup istio-galley.istio-system.svc on 100.64.0.10:53: no such host"
2019-06-15T16:46:30.663358Z info mcp (re)trying to establish new MCP sink stream
2019-06-15T16:46:30.663422Z error mcp Failed to create a new MCP sink stream: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp: lookup istio-galley.istio-system.svc on 100.64.0.10:53: no such host"
2019-06-15T16:46:31.663614Z info mcp (re)trying to establish new MCP sink stream
2019-06-15T16:46:31.663677Z error mcp Failed to create a new MCP sink stream: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp: lookup istio-galley.istio-system.svc on 100.64.0.10:53: no such host"
2019-06-15T16:46:32.663852Z info mcp (re)trying to establish new MCP sink stream
2019-06-15T16:46:32.663917Z error mcp Failed to create a new MCP sink stream: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp: lookup istio-galley.istio-system.svc on 100.64.0.10:53: no such host"
2019-06-15T16:46:32.711449Z info pickfirstBalancer: HandleSubConnStateChange: 0xc420455680, CONNECTING
2019-06-15T16:46:32.714757Z info grpc: addrConn.createTransport failed to connect to {istio-galley.istio-system.svc:9901 0 }. Err :connection error: desc = "transport: Error while dialing dial tcp: lookup istio-galley.istio-system.svc on 100.64.0.10:53: no such host". Reconnecting...
2019-06-15T16:46:32.714785Z info pickfirstBalancer: HandleSubConnStateChange: 0xc420455680, TRANSIENT_FAILURE

Please see also: https://github.com/istio/istio/issues/14841

I think this is different from 14841. Mixer is trying to talk to Galley (that is what MCP is) but can’t, because Galley doesn’t exist. You need to set both --set global.useMCP=false and --set galley.enabled=false. Did you only disable galley?
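
For reference, here is a rough sketch of what an install with both flags might look like. Only the two --set flags come from this thread. The Helm 2 style invocation, the chart path (install/kubernetes/helm/istio) and the release name (istio) are assumptions, so adjust them to match how you installed Istio. The script only prints the command so you can review it before running it yourself.

```shell
#!/bin/sh
# Both flags come from the reply above; they need to be set together,
# otherwise Mixer keeps trying to reach Galley over MCP.
HELM_FLAGS="--set global.useMCP=false --set galley.enabled=false"

# Assumptions (not from the thread): chart path and release name.
CHART_PATH="install/kubernetes/helm/istio"
RELEASE_NAME="istio"

# Print the command for review instead of executing it.
echo "helm install $CHART_PATH --name $RELEASE_NAME --namespace istio-system $HELM_FLAGS"
```

After reinstalling, kubectl get pods -n istio-system should show whether istio-policy and istio-telemetry still go into CrashLoopBackOff.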