Exact steps to migrate from a helm install

Hey all!

Love istio – but we’ve installed it with helm and would like to migrate to istioctl. We have istio running in production and want to reduce or eliminate downtime. What are the EXACT steps required to migrate? I couldn’t find documentation on this. We’re on istio 1.4.4.

Here’s my theory:

istioctl manifest migrate [env-values.yaml]
helm del istio -n istio-system
helm del istio-init -n istio-system
kubectl get crd | grep istio | cut -d" " -f 1 | xargs kubectl delete crd # picked up from previous post
istioctl manifest apply -f [migrate-values.yaml]
kubectl apply -f [all-istio-yaml]

Is this the exact procedure required to migrate from helm to istioctl?

Thanks in advance!

Hi,

I’m also planning to migrate from helm to istioctl, as @haunted mentioned. Can someone please provide the steps for migrating from a helm install of 1.5 to istioctl 1.6.4?

Hi, that’s pretty much the right recipe. You should inspect the output of istioctl manifest migrate to make sure it looks right - manifest migrate only makes mechanical changes, so the two versions should contain identical content, just expressed a little differently.
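
For example, something along these lines should surface any real differences (filenames here are just placeholders, and the chart path assumes the standard istio release tarball layout):

# Convert the Helm values file to the new API
istioctl manifest migrate env-values.yaml > migrated-iop.yaml
# Render both versions to plain Kubernetes YAML and compare
istioctl manifest generate -f migrated-iop.yaml > istioctl-rendered.yaml
helm template install/kubernetes/helm/istio --name istio --namespace istio-system --values env-values.yaml > helm-rendered.yaml
istioctl manifest diff helm-rendered.yaml istioctl-rendered.yaml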

@ostromart Currently istio 1.5.0 is running via helm install. Is there any option to upgrade to istioctl 1.6.4 without purging the existing helm install?

I’m curious about that too. If not, I have to do it under a maintenance window for our prod system.

I’m curious about upgrading from 1.5 w/ helm as well. It seems like I can use istioctl manifest generate and commit the result to our gitops system, but from what I’ve read it’s not clear whether doing that would replace the existing istio install or generate a canary. I think we can generate a canary and produce a manifest for it, but again, it’s not entirely clear.
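
My reading of the 1.6 upgrade docs is that the revision flow would look roughly like this – the revision name and namespace are just examples, and I haven’t verified this end to end:

# Install 1.6.x as a second control plane alongside the existing one
istioctl install -f iop.yaml --set revision=1-6-4
# Switch a namespace over to the canary, then restart its workloads
kubectl label namespace my-app istio-injection- istio.io/rev=1-6-4
kubectl rollout restart deployment -n my-app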

That process worked well enough - the upgrade, on the other hand, was a bit of a nightmare. I ended up nuking 1.4.4 entirely, generating a new minimal manifest for 1.5.7, and going from there. The migrate command against the working 1.4.4 manifest resulted in a broken IstioOperator and a bricked cluster.

I am hitting a similar issue migrating from a helm templated/kubectl apply installed istio 1.5.4 to using istioctl in 1.6.x. Whether I use istioctl manifest migrate or not, installing a new revision of istio results in all my EC2 instances being taken out of service, and my services become unreachable as a result.

The same install using istioctl with the same operator config works perfectly in a fresh cluster, but the upgrade doesn’t work at all.

I believe this is related to the status port changing (at the same time as breaking a major installation option, cool choice) – even though my service object and load balancer are updated properly, no instances will pass health checks after the new revision is installed.

Here’s what I get when I try to curl the service from within the cluster after deploying 1.6.x:

curl istio-ingressgateway.istio-system.svc.cluster.local:15021/healthz/readyz -vL
*   Trying 10.100.182.102...
* TCP_NODELAY set
* Connected to istio-ingressgateway.istio-system.svc.cluster.local (10.100.182.102) port 15021 (#0)
> GET /healthz/readyz HTTP/1.1
> Host: istio-ingressgateway.istio-system.svc.cluster.local:15021
> User-Agent: curl/7.61.1
> Accept: */*
>
< HTTP/1.1 503 Service Unavailable
< content-length: 19
< content-type: text/plain
< date: Wed, 08 Jul 2020 00:09:44 GMT
< server: envoy
<
* Connection #0 to host istio-ingressgateway.istio-system.svc.cluster.local left intact
no healthy upstream

If I redeploy my helm templated manifest of 1.5.4, things are restored, and the service responds with a 200 on both 15020 and 15021.

It seems this may be part of the issue:

kubectl get ep -n istio-system
istio-system   istio-ingressgateway                 <none>                                                                      136m

I’m pretty perplexed as to what’s happening here.

@nothingofuse I’m curious – did you migrate the helm values file into a manifest? That only worked for us to get 1.4.4 off helm. When I migrated the 1.4.4 manifest to 1.5.7 (using the 1.5.7 binary) it completely wrecked the deployment. I wish I could help ya more – but perhaps the problem is the migration? If it’s not a production cluster, it might be helpful to do istioctl profile dump minimal – see if that will install, and work your way up from there.
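
Something along these lines (filenames are hypothetical):

istioctl profile dump minimal > minimal.yaml
istioctl manifest apply -f minimal.yaml
# verify the control plane comes up before layering customizations back in
kubectl get pods -n istio-system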

I think I found the issue!

This doc helped me out: https://kubernetes.io/docs/tasks/debug-application-cluster/debug-service/#does-the-service-have-any-endpoints

My service had no endpoints, which points to the selector being bad – and sure enough:

selector:
    app: istio-ingressgateway
    istio: ingressgateway
    release: default-istio
Pod Template:
  Labels:           app=istio-ingressgateway
                    chart=gateways
                    heritage=Tiller
                    istio=ingressgateway
                    release=istio

Helm was manipulating the release label, and moving away from helm caused the selector to fail.

Now I’m looking into how/why the istioctl installation doesn’t update the selector if it’s going to update the labels of the things it’s selecting.
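
For anyone else debugging this, the mismatch is easy to see with plain kubectl (the label values below are from my cluster; yours may differ):

# What the Service selects
kubectl get svc istio-ingressgateway -n istio-system -o jsonpath='{.spec.selector}'
# What the gateway pods are actually labelled with
kubectl get pods -n istio-system -l app=istio-ingressgateway --show-labels
# If the release labels disagree, the endpoints list comes up empty
kubectl get endpoints istio-ingressgateway -n istio-system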

I tried using istioctl manifest migrate; I had to delete a bunch of lines from the values.yaml we were using, but it was a fine jumping-off point. I used istioctl 1.6.3 and my manifest was for istio 1.5.4.

I’m working through the migration steps to get this into our CI/CD pipeline and eventually go to production. I want to get it right with zero/minimal downtime in my test cluster first; I’m really hoping I don’t need a new load balancer and a bunch of DNS shenanigans to get this migration done.

This is what I added to my operator config in order to keep my load balancer/ingress gateway working through the 1.5.4 to 1.6.4 migration:

apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  components:
    ingressGateways:
      - enabled: true
        k8s:
          service:
            ports:
              - name: status-port
                port: 15021
                targetPort: 15021
              - name: http2
                port: 80
                targetPort: 8080
              - name: https
                port: 443
                targetPort: 8443
              - name: tls
                port: 15443
                targetPort: 15443
            selector:
              app: istio-ingressgateway
              istio: ingressgateway
              release: istio
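
Then, assuming the config above is saved as ingress-overlay.yaml (the filename is arbitrary):

istioctl install -f ingress-overlay.yaml

Pinning release: istio in the selector should keep the Service matched to the istioctl-managed gateway pods, which (as far as I can tell) come up labelled release=istio, so the endpoints never go empty during the cutover.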

@haunted, I’m really sorry to hear about your experience. Would you mind writing up what steps you followed so we can create any issues and/or update the docs?

@ostromart Sure – I’d be happy to try and reproduce the issue for y’all.

Here are the rough steps we took:

  1. Migrated 1.4.4 off helm and onto istioctl by way of the steps above
  2. Used the 1.5.7 istioctl binary to migrate the manifest over to the new IstioOperator format
  3. Ran an upgrade dry-run via the 1.5.7 binary and got IOP warnings
  4. Attempted a forced upgrade, which resulted in most of the istio pods not coming up at all (no galley, etc.)
  5. Nuked the namespace and attempted a clean 1.5.7 install with the migrated manifest - same results
  6. Used istioctl profile dump minimal and attempted another clean install - works
  7. Modified the profile to support the settings that were in 1.4.4 - works

Our conclusion was that the migrated 1.4.4 manifest wasn’t in the correct format for 1.5.7.

Looks like people here are having a similar experience to what I’ve had each time I’ve tried this. I’ve not been able to successfully migrate away from Helm in my staging cluster without downtime so far. It’d be great to have the steps you need to take documented somewhere.

I faced two main issues: the ingress gateways being deleted when tearing down the Helm installation at the end, and then still being left with a broken istioctl installation because of poor parity between the Helm values and the new istioctl values. I’ve not given it a try for about a month, so I might set some time aside to try it again soon.

If I’m completely honest, as an aside, I’ve been using Linkerd in newer clusters with far fewer issues than I’ve faced with Istio. The biggest difference I’ve noticed is that the documentation and guides are really clear. The only things I’ve found about this migration process (and some other stuff) have come in the form of outdated blog posts.

Edit: Looks like it’s still not possible with the latest version of Istio to use manifest migrate on Helm values that work with 1.5.x. The only thing that works is using an old version of istioctl (I used 1.4.3) to perform an initial migration, then migrating the output of that, which still leaves an incomplete operator manifest. In that case, does it mean the best approach I can take is just starting from scratch and porting the values manually as best I can?
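
To be clear, the two-step dance I mean is roughly this (the binary names are just how I keep the versions apart; filenames are examples):

istioctl-1.4.3 manifest migrate helm-values.yaml > icp-1.4.yaml
istioctl-1.6.x manifest migrate icp-1.4.yaml > iop-1.6.yaml
# iop-1.6.yaml still comes out incomplete and needs manual fixes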

^^ I’m going to agree with the spirit of seeruk’s comment. istio really needs to up its game and document these processes with examples. It’s fairly difficult for me to comprehend why they wouldn’t do this when they know people run their software in production environments. I have years of experience with k8s at this point, and I find istio super difficult to manage and understand using the documentation. It’s sad. The documentation almost casually implies these commands “just work”, but in my experience they really don’t.

@ostromart

    base:
      enabled: true

This is what was missing from the migrate command’s output that stood out to me. Seems kinda important if you migrate a 1.4.4 -> 1.5.7 manifest, right?

@haunted, thanks for writing that up. I’d like to get to the bottom of what went wrong for you.
Are you able to share the input values.yaml? I’d like to run through this myself to see what happens.
base.enabled: true is usually not necessary because that’s the default in most profiles (except the empty profile). manifest migrate should just produce the deltas from the default profile to get to whatever you have in values.yaml.
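
If it helps, one way to check those deltas yourself (filenames are just examples):

# Render the default profile and the migrated spec to full manifests,
# then diff them - only your values.yaml customizations should show up
istioctl manifest generate > default-manifest.yaml
istioctl manifest generate -f migrated-iop.yaml > migrated-manifest.yaml
istioctl manifest diff default-manifest.yaml migrated-manifest.yaml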

@ostromart Certainly. I can scrub them a little and host them somewhere. Should the original values file and the 1.4.4 manifest plus the “bad” 1.5.7 manifest suffice? I’ll assume so and get you this data soon. I’ll reply here with the link.

Yes, that will be enough.

By the way, I’m working on improving the migration docs for helm users - https://github.com/istio/istio.io/pull/7689 is a start. I absolutely agree that the docs are incomplete in this area.