Metric expiry in TelemetryV2 proxies

skhalash · December 1, 2020, 11:05am

Community,

Istio Mixer used to have a useful feature called metricsExpirationPolicy, which meant that Mixer would stop holding on to a metric after a certain period. This feature has not been implemented in Telemetry V2 yet.

In our environments, where workloads are periodically created and destroyed, Envoy proxies keep accumulating a huge number of time series (hundreds of thousands) that reference workloads that are long gone. This increases memory pressure on the Prometheus side and eventually gets Prometheus OOM-killed.

One solution proposed by the Istio community was to drop or normalize some labels to decrease cardinality. That is not suitable for us, since we want to keep the labels as they are. We just want Envoy proxies to stop exposing time series that have been inactive for some time. Restarting the Envoy proxies fixes the problem, but is obviously out of the question for production environments.
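
To make the behavior we are after concrete, here is a minimal Python sketch of time-based expiry for a counter store. It is only an illustration of the idea, not Envoy, Istio, or Mixer code: the class name `ExpiringCounters`, the `ttl_seconds` parameter, and the injectable `clock` are all hypothetical.

```python
import time
from typing import Callable, Dict, Tuple

# A series is identified by its sorted (label name, label value) pairs.
LabelSet = Tuple[Tuple[str, str], ...]


class ExpiringCounters:
    """Counter store that forgets series not updated within ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._values: Dict[LabelSet, float] = {}
        self._last_update: Dict[LabelSet, float] = {}

    def inc(self, labels: Dict[str, str], amount: float = 1.0) -> None:
        """Increment the series for these labels and record the update time."""
        key = tuple(sorted(labels.items()))
        self._values[key] = self._values.get(key, 0.0) + amount
        self._last_update[key] = self.clock()

    def collect(self) -> Dict[LabelSet, float]:
        """Drop series idle for longer than the TTL, then return the rest."""
        now = self.clock()
        expired = [key for key, updated in self._last_update.items()
                   if now - updated > self.ttl_seconds]
        for key in expired:
            del self._values[key]
            del self._last_update[key]
        return dict(self._values)
```

With something like this on the scrape path, a series for a deleted workload would disappear from the output once it has gone `ttl_seconds` without an update, while the labels of active series stay untouched.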

Does anyone have an idea how to circumvent this issue?
