Istio proxy takes too long to start up

When we scale a pod 0 -> 1 in kubernetes, istio-proxy sidecar takes 34 seconds to start up. This is due to envoy taking that much long to start up. Is there anyway to tune envoy in istio so that it starts up much faster?

Istio-proxy version is 1.4.7
Envoy version is 1.12.0

Thanks!

Can you post your sidecar logs? if the image was not present on the node at start time it could have been waiting to pull the image. Is it always taking 34 seconds?

Thanks for your reply. It takes 34 seconds consistently.

Here is the log:

2020-11-30T19:16:56.996452Z info FLAG: --applicationPorts="[8080]"
2020-11-30T19:16:56.996958Z info FLAG: --binaryPath="/usr/local/bin/envoy"
2020-11-30T19:16:56.996976Z info FLAG: --concurrency=“2”
2020-11-30T19:16:56.996989Z info FLAG: --configPath="/etc/istio/proxy"
2020-11-30T19:16:56.997003Z info FLAG: --connectTimeout=“10s”
2020-11-30T19:16:56.997014Z info FLAG: --controlPlaneAuthPolicy=“NONE”
2020-11-30T19:16:56.997029Z info FLAG: --controlPlaneBootstrap=“true”
2020-11-30T19:16:56.997041Z info FLAG: --customConfigFile=""
2020-11-30T19:16:56.997053Z info FLAG: --datadogAgentAddress=""
2020-11-30T19:16:56.997064Z info FLAG: --disableInternalTelemetry=“false”
2020-11-30T19:16:56.997086Z info FLAG: --discoveryAddress=“istio-pilot.istio-system:15010”
2020-11-30T19:16:56.997098Z info FLAG: --dnsRefreshRate=“300s”
2020-11-30T19:16:56.997111Z info FLAG: --domain=“openfaas.svc.cluster.local”
2020-11-30T19:16:56.997123Z info FLAG: --drainDuration=“45s”
2020-11-30T19:16:56.997134Z info FLAG: --envoyAccessLogService=""
2020-11-30T19:16:56.997158Z info FLAG: --envoyMetricsService=""
2020-11-30T19:16:56.997169Z info FLAG: --help=“false”
2020-11-30T19:16:56.997195Z info FLAG: --id=""
2020-11-30T19:16:56.997229Z info FLAG: --ip=""
2020-11-30T19:16:56.997255Z info FLAG: --lightstepAccessToken=""
2020-11-30T19:16:56.997274Z info FLAG: --lightstepAddress=""
2020-11-30T19:16:56.997285Z info FLAG: --lightstepCacertPath=""
2020-11-30T19:16:56.997296Z info FLAG: --lightstepSecure=“false”
2020-11-30T19:16:56.997309Z info FLAG: --log_as_json=“false”
2020-11-30T19:16:56.997320Z info FLAG: --log_caller=""
2020-11-30T19:16:56.997331Z info FLAG: --log_output_level=“default:info”
2020-11-30T19:16:56.997350Z info FLAG: --log_rotate=""
2020-11-30T19:16:56.997362Z info FLAG: --log_rotate_max_age=“30”
2020-11-30T19:16:56.997374Z info FLAG: --log_rotate_max_backups=“1000”
2020-11-30T19:16:56.997386Z info FLAG: --log_rotate_max_size=“104857600”
2020-11-30T19:16:56.997402Z info FLAG: --log_stacktrace_level=“default:none”
2020-11-30T19:16:56.997440Z info FLAG: --log_target="[stdout]"
2020-11-30T19:16:56.997454Z info FLAG: --mixerIdentity=""
2020-11-30T19:16:56.997466Z info FLAG: --parentShutdownDuration=“1m0s”
2020-11-30T19:16:56.997479Z info FLAG: --pilotIdentity=""
2020-11-30T19:16:56.997492Z info FLAG: --proxyAdminPort=“15000”
2020-11-30T19:16:56.997504Z info FLAG: --proxyComponentLogLevel=“misc:error”
2020-11-30T19:16:56.997516Z info FLAG: --proxyLogLevel=“warning”
2020-11-30T19:16:56.997534Z info FLAG: --serviceCluster=“mining.openfaas”
2020-11-30T19:16:56.997553Z info FLAG: --serviceregistry=“Kubernetes”
2020-11-30T19:16:56.997565Z info FLAG: --statsdUdpAddress=""
2020-11-30T19:16:56.997576Z info FLAG: --statusPort=“15020”
2020-11-30T19:16:56.997592Z info FLAG: --templateFile=""
2020-11-30T19:16:56.997602Z info FLAG: --trust-domain=""
2020-11-30T19:16:56.997621Z info FLAG: --zipkinAddress=“zipkin.istio-system:9411”
2020-11-30T19:16:56.997673Z info Version 1.4.7-c51f95cf5fdea0c270f938344945c69c1a578b4a-Clean
2020-11-30T19:16:56.998365Z info Obtained private IP [10.42.4.73 fe80::84a:bdff:fef9:b32e]
2020-11-30T19:16:57.001397Z info Proxy role: &model.Proxy{ClusterID:"", Type:“sidecar”, IPAddresses:string{“10.42.4.73”, “10.42.4.73”, “fe80::84a:bdff:fef9:b32e”}, ID:“mining-84b6b969d-bp9s9.openfaas”, Locality:(*envoy_api_v2_core.Locality)(nil), DNSDomain:“openfaas.svc.cluster.local”, ConfigNamespace:"", Metadata:(*model.NodeMetadata)(nil), SidecarScope:(*model.SidecarScope)(nil), MergedGateway:(*model.MergedGateway)(nil), ServiceInstances:*model.ServiceInstance(nil), WorkloadLabels:labels.Collection(nil), IstioVersion:(*model.IstioVersion)(nil)}
2020-11-30T19:16:57.001909Z info PilotSAN string(nil)
2020-11-30T19:16:57.001944Z info MixerSAN string(nil)
2020-11-30T19:16:57.003961Z info Effective config: binaryPath: /usr/local/bin/envoy
concurrency: 2
configPath: /etc/istio/proxy
connectTimeout: 10s
discoveryAddress: istio-pilot.istio-system:15010
drainDuration: 45s
envoyAccessLogService: {}
envoyMetricsService: {}
parentShutdownDuration: 60s
proxyAdminPort: 15000
serviceCluster: mining.openfaas
statNameLength: 189
tracing:
zipkin:
address: zipkin.istio-system:9411
2020-11-30T19:16:57.004044Z info PilotSAN string(nil)
2020-11-30T19:16:57.004121Z info Starting proxy agent
2020-11-30T19:16:57.094515Z info Opening status port 15020
2020-11-30T19:16:57.096154Z info Received new config, creating new Envoy epoch 0
2020-11-30T19:16:57.096361Z info watching /etc/certs for changes
2020-11-30T19:16:57.096354Z info Epoch 0 starting
2020-11-30T19:16:57.303309Z info Envoy command: [-c /etc/istio/proxy/envoy-rev0.json --restart-epoch 0 --drain-time-s 45 --parent-shutdown-time-s 60 --service-cluster mining.openfaas --service-node sidecar~10.42.4.73~mining-84b6b969d-bp9s9.openfaas~openfaas.svc.cluster.local --max-obj-name-len 189 --local-address-ip-version v4 --log-format [Envoy (Epoch 0)] [%Y-%m-%d %T.%e][%t][%l][%n] %v -l warning --component-log-level misc:error --concurrency 2]
[Envoy (Epoch 0)] [2020-11-30 19:16:57.994][23][warning][config] [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:91] gRPC config stream closed: 14, no healthy upstream
2020-11-30T19:16:58.127040Z info watchFileEvents: “/etc/certs”: MODIFY|ATTRIB
2020-11-30T19:16:58.127651Z info watchFileEvents: “/etc/certs”: MODIFY|ATTRIB
2020-11-30T19:16:58.128623Z info watchFileEvents: “/etc/certs/…2020_11_30_19_16_22.458601565”: MODIFY|ATTRIB
2020-11-30T19:16:58.129108Z info watchFileEvents: “/etc/certs/…2020_11_30_19_16_22.458601565”: MODIFY|ATTRIB
2020-11-30T19:17:00.297046Z info Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 1 successful, 0 rejected; lds updates: 0 successful, 0 rejected
[Envoy (Epoch 0)] [2020-11-30 19:17:00.598][23][warning][filter] [src/envoy/http/authn/http_filter_factory.cc:102] mTLS PERMISSIVE mode is used, connection can be either plaintext or TLS, and client cert can be omitted. Please consider to upgrade to mTLS STRICT mode for more secure configuration that only allows TLS connection with client cert. See https://istio.io/docs/tasks/security/mtls-migration/
2020-11-30T19:17:03.071535Z info watchFileEvents: “/etc/certs”: MODIFY|ATTRIB
2020-11-30T19:17:03.072299Z info watchFileEvents: “/etc/certs”: MODIFY|ATTRIB
2020-11-30T19:17:03.894737Z info Envoy proxy is NOT ready: failed retrieving Envoy stats:

2020-11-30T19:17:27.875271Z info Envoy proxy is NOT ready: failed retrieving Envoy stats:
2020-11-30T19:17:29.875401Z info Envoy proxy is NOT ready: failed retrieving Envoy stats:
2020-11-30T19:17:30.604441Z info Envoy proxy is ready
2020-11-30T19:17:32.895368Z info watchFileEvents: “/etc/certs”: MODIFY|ATTRIB
2020-11-30T19:17:32.896463Z info watchFileEvents: “/etc/certs”: MODIFY|ATTRIB
2020-11-30T19:17:32.897481Z info watchFileEvents: “/etc/certs/…2020_11_30_19_16_22.458601565”: MODIFY|ATTRIB
2020-11-30T19:17:32.897900Z info watchFileEvents: “/etc/certs/…2020_11_30_19_16_22.458601565”: MODIFY|ATTRIB
2020-11-30T19:17:35.009736Z info watchFileEvents: “/etc/certs”: MODIFY|ATTRIB
2020-11-30T19:17:35.010692Z info watchFileEvents: “/etc/certs”: MODIFY|ATTRIB
2020-11-30T19:17:35.011033Z info watchFileEvents: “/etc/certs/…2020_11_30_19_16_22.458601565”: MODIFY|ATTRIB
2020-11-30T19:17:35.011637Z info watchFileEvents: “/etc/certs/…2020_11_30_19_16_22.458601565”: MODIFY|ATTRIB
2020-11-30T19:17:42.896564Z info watchFileEvents: notifying

How many clusters do you have?

Just one cluster at the moment.

I forgot to mention, I use 300mCPU limit for istio side car. If I use 2CPU limit, it starts up in 6 seconds but that is still too long for us. From what I have seen in the internet, it seems like istio taking 6 seconds to start up is normal? Because of this, we cannot afford to scale up from zero.

I meant how many clusters are created in Envoy not K8s clusters (I am assuming 1 is K8s cluster)?

What is your use-case? What would be an acceptable startup time?