We would like to introduce Istio into production. Although the proxy (Envoy) already exposes metrics for monitoring incoming and outgoing traffic, we also need to monitor the state of Istio itself, for example:
1. Whether any proxy (Envoy) has rules or data that do not match Pilot's, in which case we may need to raise an alarm.
2. The proxy (Envoy) is running, but iptables was not updated successfully for some reason, so live traffic does not go through the proxy.
The existing Istio Pilot dashboard in Grafana covers rejected CDS/RDS/EDS data, with queries like `label_replace(sum(pilot_xds_eds_reject{job="pilot"}) by (node, err), "node", "$1", "node", ".*~.*~(.*)~.*")`, but it cannot list which proxy (Envoy) has the rejection issue.
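For reference, here is a minimal Python sketch of what the regex in that query keeps from the `node` label. It assumes the label is a `~`-separated string whose third field identifies the proxy. The sample value and the `proxy_id` helper are made up for illustration and are not taken from a real cluster or from Istio itself.

```python
import re

# Same pattern as in the dashboard query: keep the third "~"-separated field.
NODE_RE = re.compile(r".*~.*~(.*)~.*")

def proxy_id(node: str) -> str:
    """Return the third '~'-separated field of a node label.

    Like label_replace, the pattern must match the whole value;
    if it does not match, the label is returned unchanged.
    """
    m = NODE_RE.fullmatch(node)
    return m.group(1) if m else node

# Hypothetical node label (assumed format, not real data).
example = "sidecar~10.0.0.1~reviews-v1-abc.default~default.svc.cluster.local"
print(proxy_id(example))  # reviews-v1-abc.default
```

If the `node` label really has this shape, the rewritten label would already carry a per-proxy identifier, so part of what we are asking is whether that is enough to build a per-proxy alert.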
In a production environment with hundreds of pods, it would be good to cover cases #1 and #2, so we would like to know whether anyone has already set up similar monitoring. Thanks a lot!