Performance degradation after applying Istio 1.5.2

We are currently evaluating Istio for our Kubernetes cluster running on GKE.

Our transaction has two parts: initiate and execute. During the initiate phase (the client sends a REST call), the anchoring (orchestrator) app interacts several times with the database, IBM MQ, and a few other microservices, and then finishes the initiate part. On the execute side, the same orchestrator app receives a REST call from the same client and interacts with the database, MQ, and a few other microservices to complete the execute part of the transaction.

Without Istio in the cluster:

  • Initiate transaction avg response time is around 330-350 ms
  • Execute transaction avg response time is around 450-500 ms

With Istio in the cluster:

  • Initiate transaction avg response time is around 1300-1400 ms
  • Execute transaction avg response time is around 2300-2400 ms
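
To put these numbers in perspective, here is a rough calculation using the midpoint of each reported range. The midpoints are a simplification for illustration, not measured values, and `overhead` is just a helper name made up for this sketch. It prints the added latency in ms and the slowdown factor:

```shell
# overhead BASE_MS MESH_MS prints "<added ms> <slowdown factor>".
overhead() {
  awk -v base="$1" -v mesh="$2" 'BEGIN { printf "%d %.1f\n", mesh - base, mesh / base }'
}

overhead 340 1350   # initiate: midpoint of 330-350 ms vs midpoint of 1300-1400 ms
overhead 475 2350   # execute: midpoint of 450-500 ms vs midpoint of 2300-2400 ms
```

So each phase takes roughly one to two seconds longer with Istio, which is about four to five times the baseline.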

Our profile setting for the namespace is as follows:
istioctl manifest apply --set profile=demo --set values.tracing.enabled=true

Notes:

  • All applications and MQ run as Docker containers in one namespace, where Istio is installed with the above manifest.
  • We have implemented HAProxy as the ingress controller. It runs in a different namespace, and Istio is not applied there.
  • The database server is outside the Kubernetes cluster and runs in a VM.

Could you please comment on this response time degradation after installing Istio in the cluster?
Are we missing something?

Thanks in advance.

Have you tried disabling tracing?

Thanks for your answer @RaymondKYLiu. I tried without tracing enabled, and the change is marginal. Moreover, we would like to use Kiali and distributed tracing with Jaeger from Istio. That is one of the main reasons we are evaluating it.

I think this might be one reason why the Istio devs want to rip out the addons that make up the demo profile - the demo profile is for just that, demo’ing. It is not meant for production or for performance evaluation. For example, from what I understand, the Prometheus that comes with the demo profile is not configured for performance.

Perhaps some of the Istio performance team can chime in here and provide some info. But I’m almost positive that the demo profile will not give you good performance numbers and you should not use the demo profile to evaluate the performance of Istio as it is simply not geared for that purpose.

There is a #perf-and-scalability room in the Istio Slack if you want to ask there, as well.

@jmazzitelli, thank you very much for your answer. If we understand you correctly, we should install Istio with the default profile in the cluster, then install Kiali, Grafana, and Jaeger on a separate machine and connect the two. Please correct me if this understanding is wrong.

By the way, to join Slack, I filled out the form on 7th May but have still not been added. I followed the suggestion from geeknoid (https://discuss.istio.io/u/geeknoid) in the Istio Slack Channel thread. Do I need to do anything else to join Slack?
Thanks in advance.

You may want to check out this blog post: https://istio.io/blog/2019/performance-best-practices/. As others mentioned here, please do not use the demo profile; it has pretty much every non-performance-oriented setting enabled.
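
As a rough sketch of what that could look like, assuming the same Istio 1.5.x `istioctl` as in the original post: compare the two profiles first, then reinstall with the default profile while keeping the tracing flag from the question. This is only an illustration, not a tuned production setup, and it needs a running cluster to try.

```shell
# Show which settings differ between the default and demo profiles.
istioctl profile diff default demo

# Same command as in the question, with profile=default instead of profile=demo.
# The tracing flag is kept because the poster wants Jaeger; drop it to match a plain default install.
istioctl manifest apply --set profile=default --set values.tracing.enabled=true
```

After reinstalling, rerun the same initiate and execute tests so the numbers are comparable with the earlier ones.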

For your situation, I’d suggest trying Meshery (https://github.com/layer5io/meshery) to check service mesh performance. It runs the performance test with Fortio and reports the results in its UI, which makes it easy to check your service mesh latency and other metrics. You could deploy Meshery locally rather than in your production environment (to keep things isolated); either a Kubernetes cluster or Docker works. You can then check the cluster’s north-south traffic latency and the other metrics you need.
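
If you want a quick number before setting up Meshery, Fortio can also be run directly from the command line. A minimal sketch, assuming Fortio is installed locally; `TARGET_URL` is a hypothetical endpoint and the load settings are arbitrary, so replace both with your own:

```shell
# Assumption: hypothetical URL standing in for the orchestrator's initiate endpoint.
TARGET_URL="http://orchestrator.example.internal/initiate"

# 8 parallel connections, 50 requests per second in total, for 60 seconds.
# Fortio sends GET requests by default and prints a latency histogram with percentiles at the end.
fortio load -c 8 -qps 50 -t 60s "$TARGET_URL"
```

Running the same command once with sidecar injection enabled and once without it gives a like-for-like latency comparison.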

On the other hand, if you deploy Istio with all the components (Kiali, tracing, etc.), you can see the cluster’s east-west traffic and its latency in the Kiali UI.

In my experience, although the official documentation disables the tracing components by default, that will not work for every situation. The service mesh must follow your business domain: you need to decide which features you need, and those decisions must come from your business model.