Guide to ZDT with Istio Canary deployment

20 Feb 2025

Zero downtime is a must-have criterion, especially in FinTech, HealthTech, and many other industries. Achieving true zero downtime is a long-chased milestone in the software development lifecycle, and many approaches are available to provide a reasonable solution. However, traditional deployment strategies like blue/green and basic canary releases often fall short, introducing either high cost or unacceptable risk. This guide explores a "smart" approach that uses Istio's service mesh capabilities to achieve zero downtime (ZDT) deployments with canary releases, minimizing risk and maximizing application uptime. But first, let's look at the widely used techniques.

The challenges of traditional deployment strategies

Let's look at the two most common deployment strategies: blue/green deployments and canary deployments.

Blue/green deployments

Blue-green deployment is a software release management strategy for reducing downtime and risks during application updates or deployments. It involves maintaining two environments.

  • Blue environment: This is the live or production environment running the current version of the application.
  • Green environment: This is the new version of the application, set up and tested, but not yet live.

The new version of the application is deployed and tested in the green environment, without affecting the users accessing the blue (current) environment. Once the green environment is fully validated, the traffic is switched from blue to green, making the new version live. If there are issues with the green environment after the switch, the traffic can easily be reverted to the blue environment (which is still intact), ensuring minimal downtime and risk.

Problems with blue/green deployments

  • Cost: Such deployments generally require creating and maintaining a production-like environment, which is quite costly.
  • Not really a production environment: Although the resources match the actual production environment, the underlying application states and data might differ. This means the application still gets tested in the lab environment, leaving room for issues in the actual production environment.

Canary deployments

A Canary Deployment is a technique for gradually rolling out a new version of your application to a subset of users or requests to minimize the risk associated with new releases.

Here, the new version of the application is deployed in the same production environment where the stable version is running. A fraction of production traffic is diverted to the new version for testing. If the tests are successful, the new version is gradually rolled out until it replaces the stable version; otherwise, it is removed from the environment.
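As a rough illustration of such a traffic split, here is a minimal sketch of a percentage-based canary expressed as an Istio VirtualService. It assumes a service named my-app with subsets v1 (stable) and v2 (canary) defined in a DestinationRule; the 90/10 split is an arbitrary example value, not a recommendation.

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: my-app            # assumed service name
spec:
  hosts:
    - my-app
  http:
    - route:
        # Most real users stay on the stable version
        - destination:
            host: my-app
            subset: v1
          weight: 90
        # A share of real users hits the new version (example value)
        - destination:
            host: my-app
            subset: v2
          weight: 10
```

Note that the 10% here is real customer traffic, which is exactly the risk described in the next section.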

Problem with canary deployments

  • Production risk: Since actual production traffic gets diverted to the new version of the application, it poses the risk of leaking bugs to actual customers, which can mean errors or downtime for the users routed to it.

As we can see, neither of them provides true zero downtime when it comes to customer experience and application robustness.

The Solution: Smart ZDT deployment with Istio

A modified combination of canary and blue/green deployment, built on Istio (a service mesh), lets us use the strengths of both.

Istio, a robust open-source service mesh, offers a solution that combines the best of both worlds. By intelligently routing traffic, Istio enables isolated testing of canary releases within the production environment, eliminating the need for duplicate infrastructure and preventing bugs from reaching the users.

How does smart zero downtime work with Istio?

The canary version of the application is deployed in the actual production environment. Only the test traffic is routed to the canary version using the Istio ingress gateway and a VirtualService, while 100% of the production traffic continues to be served by the existing stable version of the application.
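For traffic to enter through the Istio ingress gateway, a Gateway resource is typically needed. The following is a minimal sketch, not a production configuration: the name my-app-gateway is hypothetical, the selector assumes the default istio: ingressgateway label, and the wildcard host and plain HTTP on port 80 are simplifications.

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: my-app-gateway     # hypothetical name
spec:
  selector:
    istio: ingressgateway  # assumes the default ingress gateway label
  servers:
    - port:
        number: 80
        name: http
        protocol: HTTP
      hosts:
        - "*"              # accept any hostname (simplification)
```

A VirtualService is bound to a Gateway by listing it in the VirtualService's gateways field, and its hosts would need to cover the hostname that clients actually use.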

[Figure: How smart zero downtime works with Istio (Opcito Technologies)]

Steps for deployment using Istio based on request headers

Step - 1. Install Istio in your Kubernetes cluster

If Istio isn't installed, follow your Kubernetes environment's Istio installation guide.
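Once Istio is installed, the application's namespace is usually labeled so that sidecar proxies are injected automatically into new pods. A minimal sketch, assuming a hypothetical namespace called my-app-ns:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: my-app-ns              # hypothetical namespace name
  labels:
    istio-injection: enabled   # enables automatic sidecar injection
```

Pods that were already running before the label was added need to be restarted to pick up the sidecar.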

To implement a Smart Deployment in Istio based on the user ID present in a cookie, you need to modify the Istio configuration so that traffic is routed depending on the value of the user ID cookie. This can be done by defining a header match rule in the Istio VirtualService that inspects the cookie header.

Step - 2. Deploy your service versions

Ensure that you have at least two versions of your service: the stable version (v1) and the canary version (v2).
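A minimal sketch of what the canary Deployment and the shared Service might look like. The image reference, port numbers, and replica count are hypothetical; the important part is the version label, which the DestinationRule subsets rely on. The stable Deployment would look the same with v1 in place of v2.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app-v2
spec:
  replicas: 1                  # example value
  selector:
    matchLabels:
      app: my-app
      version: v2
  template:
    metadata:
      labels:
        app: my-app
        version: v2            # label used to select the v2 subset
    spec:
      containers:
        - name: my-app
          image: registry.example.com/my-app:2.0.0  # hypothetical image
          ports:
            - containerPort: 8080                   # hypothetical port
---
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app                # selects both v1 and v2 pods
  ports:
    - name: http
      port: 80
      targetPort: 8080
```

The Service selects only on app: my-app, so it covers both versions; Istio then decides which version receives each request.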

Step - 3. Create the DestinationRule

You'll need a DestinationRule to define subsets for the stable and canary versions.

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: my-app
spec:
  host: my-app
  subsets:
    - name: v1
      labels:
        version: v1
    - name: v2
      labels:
        version: v2

Step - 4. Define the VirtualService for canary deployment based on cookie

We can define a VirtualService that routes different traffic to the respective versions. For example, if the user-id cookie identifies the test user (user-id=test), those requests are routed to the canary version (v2). Otherwise, the requests go to the stable version (v1).

Here's how you can create the routing rule based on the user-id cookie:

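apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: my-app
spec:
  hosts:
    - my-app
  http:
    # Rule 1: requests carrying the test user's cookie go to the canary
    - match:
        - headers:
            cookie:
              # Istio regexes must match the whole header value,
              # so allow other cookies before and after user-id
              regex: '^(.*;\s*)?user-id=test(;.*)?$'
      route:
        - destination:
            host: my-app
            subset: v2 # Route to v2 (Canary version) if the cookie matches
    # Rule 2: every other request goes to the stable version
    - match:
        - uri:
            prefix: "/"
      route:
        - destination:
            host: my-app
            subset: v1 # Route to v1 (Stable version) for all other traffic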

Explanation:

  • Regex match on cookie: We use a regex on the cookie header to check the user-id value in the cookie.
    • If the user ID is test (user-id=test), that is, a user defined in the test suite whose ID differs from those of actual production users, traffic is routed to the canary version (v2).
    • If any other user logs in, the request falls through to the second match rule and traffic is routed to the stable version (v1).

You can customize this further by adding matches based on different user ID ranges or other cookies.
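As one possible variation, the match could use a dedicated request header instead of a cookie. The fragment below is a sketch of the http section of the my-app VirtualService; the header name x-canary is hypothetical and would have to be set only by the test suite.

```yaml
# Fragment of spec.http in the my-app VirtualService
http:
  - match:
      - headers:
          x-canary:          # hypothetical header sent only by test clients
            exact: "true"
    route:
      - destination:
          host: my-app
          subset: v2         # canary
  # No match block: this is the default route for all other requests
  - route:
      - destination:
          host: my-app
          subset: v1         # stable
```

Rules are evaluated in order, so the more specific canary rule has to come before the default route.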

Step - 5. Testing the canary deployment based on user-id cookie

To test the routing based on the user-id cookie, you can send requests with the appropriate user-id in the cookie using curl.

 - For Canary Version (v2) (e.g., user-id is test):
curl -H "Cookie: user-id=test" http://<istio-ingressgateway-url>/my-app
 - For Stable Version (v1) (e.g., any production user-id like: user1):
curl -H "Cookie: user-id=user1" http://<istio-ingressgateway-url>/my-app

This will divert 100% of production traffic to the stable version and only test traffic to the canary version.

Step - 6. Monitoring the canary deployment

You can monitor your canary deployment using Istio's observability tools (Prometheus, Grafana, Kiali, etc.) to ensure the traffic flows as expected and to check for any issues with the new version.
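As an illustration, a Prometheus alerting rule could watch the canary's error rate using Istio's standard istio_requests_total metric. This is a sketch under assumptions: the rule and alert names are hypothetical, the 5% threshold and 5-minute windows are example values, and it presumes Prometheus is already scraping Istio metrics.

```yaml
groups:
  - name: my-app-canary            # hypothetical rule group name
    rules:
      - alert: CanaryHighErrorRate # hypothetical alert name
        # Share of 5xx responses among all requests served by v2
        expr: |
          sum(rate(istio_requests_total{destination_service_name="my-app", destination_version="v2", response_code=~"5.."}[5m]))
          /
          sum(rate(istio_requests_total{destination_service_name="my-app", destination_version="v2"}[5m]))
          > 0.05
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Canary (v2) of my-app is returning more than 5% server errors"
```

Since only test traffic reaches the canary in this setup, the request volume may be low, so thresholds and time windows would likely need tuning.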

The advantage of smart ZDT with Istio:

As we can see, it addresses the problems listed in the above approaches.

  • Cost: No duplicate environment is needed, as the canary runs in the same production environment; the only extra resources are the canary pods themselves.
  • Robustness: The test is more realistic as it uses the actual production environment, its state, and data.
  • Zero downtime: The live traffic will not be affected in any way, and the introduction of a new version will have no impact on the user experience.

Smart zero downtime deployments using Istio offer a powerful and practical solution for modern application delivery. By combining the benefits of canary releases and intelligent traffic management, you can minimize risk, improve testing, and achieve true zero downtime, ultimately leading to happier users and a more reliable application. Realizing the full potential of Istio canary deployments can be complex. Opcito's experts have extensive experience in implementing and managing these solutions. Contact us today to learn more about how we can help you achieve seamless, zero-downtime deployments and optimize your application delivery pipeline.

Note: Stay tuned for more on "Maintaining Data Consistency in Zero Downtime." A new blog is coming soon.
