How to design a deployment strategy for Istio applications that use traffic routing

#1

I am trying to figure out a good Istio-based build/deploy pipeline for an application. The application supports a number of distinct environments (e.g., pre-staging, staging, and production), as well as feature branch deployments (e.g., feature-jp12, bugfix-dp34, etc.). The environments will be routed based on the hostname, and feature branches based on a header value.
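For concreteness, here's a minimal sketch of that routing model. The host `staging.example.com`, the gateway `myapp-gateway`, the header `x-feature-branch`, and the subset names are all placeholders I've made up:

```yaml
# Sketch: host-based environment routing plus header-based branch routing.
# All names here are hypothetical.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: myapp-staging
spec:
  hosts:
  - staging.example.com        # environment selected by hostname
  gateways:
  - myapp-gateway
  http:
  - match:
    - headers:
        x-feature-branch:      # feature branch selected by header
          exact: feature-jp12
    route:
    - destination:
        host: myapp
        subset: feature-jp12
  - route:                     # no header: default to the mainline subset
    - destination:
        host: myapp
        subset: staging
```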

I’ve found a bunch of documentation and blog posts about how to set up Istio’s VirtualService, DestinationRule, and so on, but I’m struggling with how to actually implement it in a CI/CD pipeline.

The application is built from source code, and the image name and tag combine to specify the environment and feature branch (if any). This information is plugged into a kustomize pipeline to correctly build all the definitions.

The problem I’m having is that I want builds to be independent. When a branch is built, it produces a new k8s Deployment, which should then be deployed to the k8s cluster. The problem is that other branches have already been built, so there is already a VirtualService/DestinationRule for the application in the cluster! I can write code or use kustomize to fetch the currently running VirtualService/DestinationRule and update it to add routing to the Deployment that was just built and deployed, but this approach has a number of problems.

First, there is a timing problem: there is no global lock, so two deployments could occur at the same time, with one overwriting the changes the other made to the VirtualService. I can work around this by building a lock into the CI/CD system, so this seems solvable. Are there better approaches?

If I go with the approach above, I have the problem of garbage collection. I don’t want every version ever deployed to be on the cluster, so I need some way to clean up the VirtualService and DestinationRules. In a perfect world, I would delete the Deployment and the parts of the VirtualService/DestinationRule that pointed to that deployment would be removed. However, it is my understanding that this isn’t possible.

What are other people doing in this space? Is the solution either to hand-edit/maintain the VirtualService/DestinationRule, adding and removing versions of the application, or to build custom tooling to do this automatically? I feel like I must be missing something.


#2

You can have multiple VirtualServices for the same host (at the gateway only); they get merged. The same goes for DestinationRules. Define the DestinationRules with just the subset and nothing else.
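For illustration, a per-branch fragment might look like the following (all names are placeholders; this assumes the DestinationRule carries only subsets, as suggested above). Each branch build applies only its own pair, and Istio merges VirtualServices bound to the same gateway host:

```yaml
# Hypothetical per-branch fragment: one small VirtualService with only this
# branch's match rule, and one DestinationRule with only this branch's subset.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: myapp-feature-jp12
spec:
  hosts:
  - staging.example.com
  gateways:
  - myapp-gateway
  http:
  - match:
    - headers:
        x-feature-branch:
          exact: feature-jp12
    route:
    - destination:
        host: myapp
        subset: feature-jp12
---
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: myapp-feature-jp12
spec:
  host: myapp
  subsets:
  - name: feature-jp12
    labels:
      version: feature-jp12   # matches the branch Deployment's pod labels
```

With this structure, cleanup is just deleting the branch's own VirtualService and DestinationRule, rather than editing a shared resource.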


#3

Thanks, that makes sense for deploying new subsets.

How does it work for deletes/cleanup?


#4

It is the same thing. A good structure, I think, is to have service A v1 in a branch of repo A, and the latest v2 on master of repo A. The route config and gateway config should be tracked separately in a different repo, X. Then, as you deploy v1/v2 onto your cluster, you can make changes to repo X to push through the desired gateway or route config changes.

Does this make sense?


#5

That does make sense in terms of workflow, but I’m still trying to figure out how to automate the management of the routing. In your example, it’s the “make changes to repo X to push through the desired gateway or route config changes” part.

Let’s say I have one service with three versions: v1, v2, v3. The versions are not feature branches, just versions. v1 and v2 are deployed, and the routing configured for 100% going to v2. We then build and deploy v3, and the following workflow is started:

  1. Configure routing so 1% of traffic is sent to v3
  2. If everything looks good, route 100% of traffic to v3
  3. If there are errors, route 100% of traffic to v2.
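Step 1 might look something like the sketch below (the host and subset names are assumptions on my part); steps 2 and 3 are the same edit with the weights changed:

```yaml
# Hypothetical canary step: shift 1% of traffic to v3.
# Weights across the routes must sum to 100.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: myapp
spec:
  hosts:
  - myapp.example.com
  gateways:
  - myapp-gateway
  http:
  - route:
    - destination:
        host: myapp
        subset: v2
      weight: 99
    - destination:
        host: myapp
        subset: v3
      weight: 1
```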

v1 acts as the “previous version” of v2. If v3 is good and is handling 100% of traffic, then v1 is no longer needed and can be removed (both the Deployment and the traffic routing). It is this removal of v1 (or, later, v2) that is tripping me up. I can’t delete the VirtualService (as it’s still in use), and k8s merges the YAML on apply, so how do I remove the references to v1 from the VirtualService/Deployment?

The best I can think of right now is to write a script that looks for old versions, but I’d really like to avoid parsing YAML or JSON and modifying it directly, as this feels brittle and highly dependent on Istio’s data structures, and I’m not even sure it’s possible (given the merge behavior of k8s).


#6

Sorry for the slow reply, but perhaps I’m missing something simple here. If you want to remove the references to v1 in the VS/Deployment because v2 or v3 is good, you can simply modify your VirtualService so it no longer routes any traffic to v1, then apply it to your mesh. Once that is done, you can delete the v1 Deployment.
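In other words, the end state might look like the following sketch (names are placeholders): the VirtualService keeps existing but simply stops mentioning v1, and once that is applied, the v1 Deployment (and its subset in the DestinationRule) can be deleted safely.

```yaml
# Hypothetical VirtualService after dropping v1: only v2 remains routed.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: myapp
spec:
  hosts:
  - myapp.example.com
  gateways:
  - myapp-gateway
  http:
  - route:
    - destination:
        host: myapp
        subset: v2
      weight: 100
```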

Note: you always want to store your VS resource YAML in git and roll out these changes via a git workflow like GitOps.


#7

Sorry for the delay in my reply.

The thing I am trying to figure out is how to work with the VirtualService when there is more than one process editing it. Each branch’s build/deploy is independent and automated, so the deploy would need to do something like the following:

  1. Download current Virtual Service
  2. Add newly deployed version to the Virtual Service
  3. Save the changed Virtual Service

This has a host of problems, including concurrency and garbage collection of old versions. We’re trying not to require a human to go in and modify the VirtualService - this should be done by a tool.

GitOps does solve one of the problems (concurrent editing of the VirtualService), but doesn’t really solve the others (e.g., when does a version get removed from the VirtualService?).
