Istio 1.4.6 (Helm) to Istio 1.6.5 via Canary

I’ve been investigating the best way to upgrade Istio from 1.4, installed via Helm, to the latest version.

The docs state that we shouldn’t skip minor versions, but I’ve also seen the upgrade page describing how to jump to 1.6 using a canary deployment of Istio. Is this the official / advised upgrade path?

If we were to take this upgrade path, then during the rollout, when some workloads are attached to the 1.6.5 control plane and others to 1.4.6, can these workloads communicate?

My concern is around downtime caused by some workloads being replaced before others.

Thanks,

Elliot

Did you come up with a working plan for this? I tried upgrading from 1.4.10 to 1.6.8 using the canary approach and just installing the canary took down the entire deployment. I never got 1.6.8 to work at all (problems with istiod-canary being unable to create CRDs) and ended up wiping istio-system and redeploying 1.4.10.

Since I posted, the upgrade page has been updated explaining how to upgrade from 1.4: https://istio.io/v1.6/docs/setup/upgrade/#upgrading-from-1-4

There’s still some mixed messaging, as earlier release notes state specifically that you cannot skip minor versions.

Your issue could be due to not disabling the 1.4 validation, the Istiod role not having required privileges, or maybe not using a compatible version of K8s (1.15+).

My personal understanding of the upgrade path:

  1. Disable Istio 1.4 validation
  2. Install Istio 1.6 as a canary using istioctl
  3. Remove the old istio-injection label and add the revision label to the application namespace
  4. Cycle pods and confirm the new proxies are working correctly
  5. Remove the 1.4 control plane components
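
As a rough illustration of steps 2 to 4, here is a minimal Python sketch that only assembles the commands and prints them; it does not run anything against a cluster. The namespace `my-app` and the revision name `canary` are placeholders, and the helper names `canary_upgrade_plan` and `render` are made up for this example. Steps 1 and 5 are left out on purpose because the exact commands depend on how 1.4 was installed.

```python
import shlex


def canary_upgrade_plan(namespace, revision="canary"):
    """Return steps 2-4 as (description, argv) pairs. Nothing is executed."""
    return [
        ("Install the 1.6 control plane as a canary revision",
         ["istioctl", "install", "--set", f"revision={revision}"]),
        ("Remove the old injection label and add the revision label",
         ["kubectl", "label", "namespace", namespace,
          "istio-injection-", f"istio.io/rev={revision}"]),
        ("Cycle pods so they are re-injected with the new sidecar",
         ["kubectl", "rollout", "restart", "deployment", "-n", namespace]),
        ("List proxies and their sync status",
         ["istioctl", "proxy-status"]),
    ]


def render(plan):
    """Format the plan as a commented, copy-pasteable command list."""
    return "\n".join(f"# {desc}\n{shlex.join(argv)}" for desc, argv in plan)


if __name__ == "__main__":
    # "my-app" is a placeholder namespace, not one from this thread.
    print(render(canary_upgrade_plan("my-app")))
```

Printing the plan first makes it easy to review each command, or to run them one at a time on a test cluster, before touching a production namespace.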

To add to this and answer the second part of my initial question - I deployed an application with multiple microservices with mixed data-plane of 1.4 and 1.6 and I did not appear to experience downtime for http and grpc requests between services. I would still appreciate confirmation from an Istio contributor that this is the designed behaviour.

I followed that exact page. Validation was disabled. I’m on k8s 1.16 (GKE version though). As soon as I installed the canary, all deployments stopped working and istiod-canary was unable to start due to its service account not having permission to create custom resources. I was expecting it to create the necessary cluster roles and bindings as part of the canary install but that did not seem to happen.

Migrating the existing values file used to install istio previously was also riddled with validation errors. CNI didn’t get migrated automatically and there were likely a bunch of other settings that didn’t translate well.

I think I’ll try upgrading to 1.5 with Helm first before trying the migration to 1.6 with istioctl again. I’m hoping that will at least help get istiod running, and reduce the complexity of migrating.

My understanding is you’ll need to create those role permissions and they aren’t created by the install. Once those are present I would expect Istiod to start correctly and you could follow the process above. I’ve tested this on a microk8s local cluster.
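
One way to check the service account permissions discussed above before and after creating the roles is `kubectl auth can-i` with impersonation. The Python sketch below only builds that command. The helper `can_i_command` is hypothetical, and the service account name `istiod-service-account` is an assumption that should be confirmed against the cluster.

```python
import shlex


def can_i_command(verb, resource, service_account, namespace="istio-system"):
    """Build a `kubectl auth can-i` check that impersonates a service account."""
    subject = f"system:serviceaccount:{namespace}:{service_account}"
    return ["kubectl", "auth", "can-i", verb, resource, f"--as={subject}"]


if __name__ == "__main__":
    # "istiod-service-account" is an assumed name; confirm it with
    # `kubectl get serviceaccounts -n istio-system` before relying on it.
    cmd = can_i_command("create", "customresourcedefinitions",
                        "istiod-service-account")
    print(shlex.join(cmd))
```

The printed command answers `yes` or `no` when run, so it can be repeated after applying the cluster role and binding to confirm they took effect.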

I think this is less complex than installing Istio 1.5 first, as that’s when the control plane components changed, which is why we see a benefit in upgrading using the canary deployment.

On the values file, I’ve decided to rewrite it for the operator instead of using the migration tool as we had some custom behaviour dynamically setting values.

@Elliot_Messias: I’m following a path similar to the one you took.
I have two control planes: 1.4.7 and 1.6.11.

I’m very intrigued by your statement “I deployed an application with multiple microservices with mixed data-plane of 1.4 and 1.6 and I did not appear to experience downtime for http and grpc requests between services.” Was there any specific knob you had to set to get the mixed data plane services to talk to each other? I’m wondering, did you use a mutual TLS setting?

Regards
Meher