We were running Istio 1.8.2 and observed high memory consumption in the istio-proxy sidecar, while the application container uses far less memory than the sidecar. I tried applying a Sidecar resource to limit the metadata pushed to the proxy, but it didn't help. I then read about what was probably a memory leak in that version, so I upgraded Istio to 1.10.6, but the issue continues. I'd like to understand how the istio-proxy sidecar normally consumes memory so that I can find the root cause.
I’ve attached heap profile information with this post.
As of now, this istio-proxy container is consuming more than 1 GB of memory.
Note: I've also configured a few EnvoyFilters to capture the URL and upstream_address in the proxy.
Kindly analyze the heap profile output and let me know the reasons for the memory consumption in the istio-proxy container. As stated in the documentation, istio-proxy can consume 90 MB of memory for metadata; also, since it is not buffering any traffic, it should not need additional memory for the traffic it handles.
This is probably linked to the number of services you have exposed on the mesh. I learned last week, at one of solo.io's workshops, that using the Sidecar object to restrict service visibility can help reduce memory usage.
Try looking at that and see if it applies to your scenario.
I hope this helps.
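For reference, a minimal sketch of such a Sidecar resource is below. It assumes a hypothetical namespace `my-namespace` and limits the proxies there to services in their own namespace plus `istio-system`; the host list would need adjusting to whatever your workloads actually call.

```yaml
apiVersion: networking.istio.io/v1beta1
kind: Sidecar
metadata:
  name: default              # applies to all workloads in the namespace
  namespace: my-namespace    # hypothetical namespace, replace with yours
spec:
  egress:
    - hosts:
        - "./*"              # services in the same namespace
        - "istio-system/*"   # control plane and shared gateways
```

With this in place, the proxies in that namespace should only receive configuration for the listed hosts instead of every service in the mesh.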
I've finally narrowed down the root cause of this issue: it's the Istio metrics. I have currently disabled all metrics, but eventually I need them back, so I'm optimizing the exposed metrics by dropping some metrics and labels using telemetry V2. Mainly, I'm looking to achieve two things:
- Telemetry V2 allows disabling metrics as a group, but I need to drop a particular metric from that group. For example, I need to drop only the istio_request_duration_milliseconds_bucket metric, while keeping istio_request_duration_milliseconds_sum and istio_request_duration_milliseconds_count.
- I need to patch the bootstrap configuration (doing so via EnvoyFilter is fine) to disable a few Envoy metrics, say envoy_cluster_lb_*.
I would be grateful for any help with these.
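One possible workaround for both items, assuming the sidecars are scraped by a Prometheus instance whose scrape configuration you control, is to drop the series at scrape time with `metric_relabel_configs`. This is only a sketch: the job name is hypothetical, and because the drop happens in Prometheus, it reduces storage and cardinality there but does not lower memory use inside istio-proxy.

```yaml
scrape_configs:
  - job_name: envoy-stats   # hypothetical job name; use your existing sidecar scrape job
    # ... existing service discovery and relabel_configs stay unchanged ...
    metric_relabel_configs:
      # Drop only the histogram buckets; the _sum and _count series are kept.
      - source_labels: [__name__]
        regex: istio_request_duration_milliseconds_bucket
        action: drop
      # Drop the Envoy load-balancer cluster stats by name prefix.
      - source_labels: [__name__]
        regex: envoy_cluster_lb_.*
        action: drop
```

Prometheus anchors the regex on both ends, so the first rule matches only the `_bucket` series and leaves `_sum` and `_count` untouched.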
Hello @anand_gt. Did you manage to cherry-pick the metrics as you described above?
I believe I'll need to go the same way, given the number of services and pods in my mesh.