Simple Install 1.2.3/1.2.4 Error: Error from server (Timeout): error when creating "/tmp/security/custom-resources.yaml"

I’m doing a very simple install. I was getting this error with Istio 1.2.3, and after retrying I get the same error with 1.2.4:

curl -O -L https://github.com/istio/istio/releases/download/1.2.4/istio-1.2.4-osx.tar.gz
tar xvf istio-1.2.4-osx.tar.gz
cd istio-1.2.4

helm install install/kubernetes/helm/istio-init --name istio-init --namespace istio-system

# Verify that this results in 23 CRDs
kubectl get crds | grep istio.io | wc -l

helm install install/kubernetes/helm/istio \
  --name istio --namespace istio-system \
  --set prometheus.enabled=false \
  --wait

I get Error: timed out waiting for the condition

If I look at Istio pods I see:

NAME                                      READY   STATUS             RESTARTS   AGE
istio-citadel-657c84d86f-8k9ws            1/1     Running            0          8m3s
istio-galley-6d4c54fc76-rhm64             1/1     Running            0          8m3s
istio-ingressgateway-7f768f54c7-sdp75     1/1     Running            0          8m3s
istio-init-crd-10-29lvm                   0/1     Completed          0          8m56s
istio-init-crd-11-jnjwr                   0/1     Completed          0          8m56s
istio-init-crd-12-sdnsd                   0/1     Completed          0          8m56s
istio-pilot-6b65d765b5-cblpq              2/2     Running            0          8m3s
istio-policy-5d7d7d557d-5l9pz             2/2     Running            1          8m3s
istio-security-post-install-1.2.4-8hx6m   0/1     CrashLoopBackOff   5          7m10s
istio-sidecar-injector-78949dd945-7zc7s   1/1     Running            0          8m2s
istio-telemetry-77797d4d8-zxsss           2/2     Running            1          8m3s

If I look at the logs of the security-post-install pod that is in CrashLoopBackOff, I see:

+ kubectl apply -f /tmp/security/custom-resources.yaml
Error from server (Timeout): error when creating "/tmp/security/custom-resources.yaml": Timeout: request did not complete within requested timeout 30s

I can see that it retries and fails with the same error until the overall Helm install fails.

If I google that error, I see a similar thread: Unable to install istio with terraform helm plugin

The author posted a solution:

I identified my issue. Was missing an open port on 443 between the control plane and the worker nodes. That was added when they enabled support for validating webhooks. It is not really called out anywhere in their documentation that 443 is required for that to work.

How would I do that? How would I verify that that is open?

FYI, this is on a new, simple, relatively empty Kubernetes cluster, running on Amazon EKS, with Kubernetes version “v1.13.8-eks-a977ba”.

Just add a security group rule allowing inbound traffic to the control plane from the nodes, and egress from the nodes to the control plane. Port 443 is the one required.
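To verify reachability of port 443 from a worker node, a quick check using bash’s built-in /dev/tcp redirection can help. This is a sketch: the EKS endpoint in the comment is a placeholder — use the real API server endpoint shown by `kubectl cluster-info`. Note this only confirms the node->cluster direction; the cluster->node direction used by the validating webhooks has to be confirmed by inspecting the security group rules themselves (e.g. `aws ec2 describe-security-groups --group-ids <sg-id>`).

```shell
# Report whether a TCP port on a host is reachable (requires bash and
# the coreutils `timeout` command).
check_port() {
  local host=$1 port=$2
  if timeout 5 bash -c "echo > /dev/tcp/${host}/${port}" 2>/dev/null; then
    echo "open"
  else
    echo "closed"
  fi
}

# Example (run from a worker node; replace the placeholder with your
# cluster's API endpoint from `kubectl cluster-info`):
# check_port ABCD1234.gr7.us-west-2.eks.amazonaws.com 443
```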


I had followed the official HashiCorp Terraform guide for configuring an EKS cluster: https://learn.hashicorp.com/terraform/aws/eks-intro

The settings given will result in the error I describe in the OP. The problem is this rule, which I’ve copy-pasted from the HashiCorp guide:

resource "aws_security_group_rule" "demo-node-ingress-cluster" {
  description              = "Allow worker Kubelets and pods to receive communication from the cluster control plane"
  from_port                = 1025
  protocol                 = "tcp"
  security_group_id        = "${aws_security_group.demo-node.id}"
  source_security_group_id = "${aws_security_group.demo-cluster.id}"
  to_port                  = 65535
  type                     = "ingress"
}

That sets the allowed cluster->node port range to [1025,65535], which results in the error described in the OP. If I expand the allowed cluster->node port range to [0,65535], the issue is resolved.
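For reference, the corrected rule looks like this. This is a sketch that reuses the resource names from the HashiCorp guide; the only change from the guide’s version is `from_port`:

```hcl
resource "aws_security_group_rule" "demo-node-ingress-cluster" {
  description              = "Allow worker Kubelets and pods to receive communication from the cluster control plane"
  from_port                = 0      # was 1025 in the guide; 0 allows port 443 through
  protocol                 = "tcp"
  security_group_id        = "${aws_security_group.demo-node.id}"
  source_security_group_id = "${aws_security_group.demo-cluster.id}"
  to_port                  = 65535
  type                     = "ingress"
}
```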

The node->cluster port range advised by the Hashicorp guide is [443,443] and that is working.

Aaron_Mell, you said to “allow inbound traffic to the control plane from the nodes.” Traffic does need to be allowed in both directions, but the precise problem experienced here, caused by following the HashiCorp guide, is restricted traffic from the cluster to the nodes.