I have been doing a PoC for adopting Istio into one of the projects I am working on. So far it works perfectly well with our micro-service architecture.
However, I ran into a requirement where I need to disable metrics and trace generation for a set of micro-services (without removing the other benefits provided by Istio) due to performance requirements. Is there any way to achieve this?
Happy to hear about the start of your Istio journey!
I’d be interested to know which perf requirements we believe disabling metrics and traces will address (mesh perf or backend system perf, for instance). That might inform some of the potential solutions.
At the moment, there is work going on to enable pod annotations to opt out of policy checks. See: https://github.com/istio/istio/pull/10886. That work could be extended to turn off telemetry reporting through Mixer.
I don’t believe that there is a (simple) way to turn off tracing in the Envoy proxies for a subset at the moment. There is, of course, a way to turn the sampling rate on tracing way down, but that applies cross-mesh.
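For reference, the mesh-wide sampling knob mentioned above can be sketched as a Helm values fragment. This assumes the mesh was installed with the Istio Helm chart and that your release exposes the `pilot.traceSampling` option (a percentage of requests); the value shown is only illustrative, so please check the option name and default for your version.

```yaml
# Sketch only, assuming a Helm-based install that exposes pilot.traceSampling.
# The value is a percentage of requests that get traced; 0.1 is an
# illustrative number, not a recommendation.
pilot:
  traceSampling: 0.1
```

Note that this lowers the sampling rate for every sidecar in the mesh; it does not single out individual services.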
Other than that, Mixer rules themselves provide a mechanism for selecting when to generate data. You could remove the istio-system-namespaced metrics configuration and apply it only in the namespaces that you desire… or edit the rules to have a match clause that meets your needs.
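As a rough illustration of the match-clause approach, here is a sketch of an edited metrics rule. The rule name, handler name, and instance names mirror what a default install of that era typically ships and are assumptions here, as is the `perf-critical` namespace, which is purely hypothetical. Copy the real names from the rule already present in your own istio-system namespace before changing anything.

```yaml
# Sketch only: rule, handler, and instance names are assumed from a default
# install and may differ in your mesh. "perf-critical" is a hypothetical
# namespace used for illustration.
apiVersion: config.istio.io/v1alpha2
kind: rule
metadata:
  name: promhttp
  namespace: istio-system
spec:
  # The second clause is the addition: skip metric generation when the
  # destination workload lives in the excluded namespace.
  match: '(context.protocol == "http" || context.protocol == "grpc") && (destination.namespace | "") != "perf-critical"'
  actions:
  - handler: handler.prometheus
    instances:
    - requestcount.metric
    - requestduration.metric
    - requestsize.metric
    - responsesize.metric
```

This only controls what Mixer generates from the reports it receives. The sidecars still send their reports to Mixer, so it reduces unwanted data rather than proxy-side work.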
Mixer isn’t involved in generating trace spans for services by default. But, if you opted into having Mixer generate trace spans (instead of Envoy), you could similarly select using the Mixer rules.
I am mainly interested in any perf gain I can get from the mesh in the data flow path. I am mostly concerned about the round-trip time of a call made to the micro-services (which includes the time spent in each Envoy side-car). I do understand that the latency added by the Sidecars is really low. However, any perf gain could potentially help us in the long run. (Unfortunately I cannot share details about the actual functionality of the micro-services.)
The PR you mentioned (istio #10886) mostly focuses on reducing unwanted metrics. However, this will still result in metrics being calculated by the Envoy sidecars (which is what I was hoping to remove).
What I was hoping for was maybe an annotation I could add to a pod that could cause the Sidecars to not collect metrics (and possibly disable some of the other features). Probably this will complicate the Sidecar implementation for you. I would like to know whether it is possible at the moment (I presume it’s not) and whether this is something you might consider moving forward.
It is not possible to selectively disable telemetry by pod/deployment. Please file an issue to extend the work from https://github.com/istio/istio/pull/10886 to cover telemetry as well. For telemetry, this can also work by service or namespace so please add details about the right granularity for your use-case.
At the moment, the inclusion list stuff is not configurable at runtime. However, there has been some recent activity around adding that configurability to Istio. While that work was mostly about debug scenarios (turning more metrics on), it could also be used to turn Envoy’s own stats gathering off I imagine.
I’m not sure of the Envoy community’s stance on disabling all observability in the proxy, but there may be opportunity there to expose such a switch.
From what I can understand from the above, we cannot disable metrics and tracing generation completely from Envoy itself because it is not supported by it.
The best option available as of now is to stop the metrics and tracing data at the Mixer level. (This is mostly targeted at reducing the amount of unwanted data being collected.)
Please correct me if I am wrong.
Thank you all for the help!
I will try to file an issue for extending https://github.com/istio/istio/pull/10886 to cover telemetry as well. I will also try to bring this up in the Envoy community and see if I can get a switch to disable Observability in Envoy completely as well.
Just out of curiosity:
Aren’t the metrics generated by the Istio proxy custom attributes added by the Istio community?
Also, I assume no changes were made to Zipkin tracing, though.
Wouldn’t it be possible to customize the Envoy configuration pushed by Pilot to achieve such a switch at the Istio level? For example, push a config with tracing disabled for certain micro-services. (I understand that this might be too big a change on your end. I would just like to hear your views.)
There doesn’t currently exist a way to selectively completely disable telemetry from Envoy via Istio on a per-service basis. However, that doesn’t mean one couldn’t be added.
As I mentioned above, there are now multiple related efforts ongoing, with open PRs, that could provide some of what you are looking for (including one that will allow specification of an override stats inclusion list). These efforts could definitely be extended to cover what I believe is your use case.
Please file the issue and we can follow up there.
To address your questions:
Envoy generates its own set of metrics, independent of any Istio configuration. Istio also generates metrics that are entirely controllable by configuration (and that focus on slightly different use cases).
I’m not sure what you mean by “no changes were done to Zipkin Tracing”, but perhaps the Distributed Tracing FAQ would provide some answers.
It is possible – and work along those lines is being done. But it is not supported currently.
I meant to ask whether any changes were made to the default distributed tracing tags and other additional information sent by the Envoy proxies.
Btw, I have created a GitHub issue (#11582) for this. We can discuss it there and see if this is a useful feature to add to Istio.
Is there any traction on this at all? We’ve had the issue of not being able to selectively disable test services from Lightstep tracing in our Istio deployment for ages. The chief problem is that Lightstep can bill you based on your created spans, so being able to turn off tracing for services where that observability doesn’t matter is crucial for us.