How to update Prometheus config?


#1

Greetings,

I have an Istio on GKE setup with Grafana added on after install, but I’m unable to see any metrics outside the istio-system namespace in Prometheus/Grafana. According to the troubleshooting guide, this appears to be because I’m completely missing an istio-mesh target. (I’m rather mystified by how this happened, since everything else for Istio appears to be in place, but that’s another story.)

My question: how exactly do I go about updating this config in a persistent way? It’s not editable in the UI and if I update the prometheus configmap with kubectl, my changes appear to get silently nuked – the changes last for a few seconds, and then the resourceVersion gets bumped up and my additions disappear.

$ kubectl apply -f prom-configmap.yaml --namespace=istio-system
$ kubectl get configmap prometheus --namespace=istio-system -o yaml >prom-configmap2.yaml

At this point the changes are registered with the timestamp set and resourceVersion ticking up:

$ diff prom-configmap.yaml prom-configmap2.yaml | more
121c121
<  
---
> 
>   creationTimestamp: "2019-02-15T06:32:49Z"
>   resourceVersion: "955954"
>   uid: 83994e96-30eb-11e9-a9e9-42010a940019

But when I try again a few seconds later, my additions have disappeared:

$ kubectl get configmap prometheus --namespace=istio-system -o yaml >prom-configmap3.yaml
$ diff prom-configmap3.yaml prom-configmap2.yaml 
<   resourceVersion: "955978"
---
>   resourceVersion: "955954"
107a108,121
>     - job_name: 'istio-mesh'
>       # Override the global default and scrape targets from this job every 5 seconds.
>       scrape_interval: 5s
>       # metrics_path defaults to '/metrics'
>       # # scheme defaults to 'http'.
>       static_configs:
>        - targets: ['istio-mixer.istio-system:42422']
> 
>       kubernetes_sd_configs:
>       - role: endpoints
>         namespaces:
>           names:
>           - istio-system

Cheers,
-j.


#2

Hi, did you install your own prometheus per the docs?
https://cloud.google.com/istio/docs/istio-on-gke/installing#adding_prometheus
Since your changes are being reverted, it sounds like you might be trying to edit the installed prom (called promsd) which is used for internal metrics only and can’t be configured for anything else.


#3

Prometheus is a dependency for Grafana, so it’s installed automatically. I’m installing Grafana per the instructions on that page:

helm template --set grafana.enabled=true --set servicegraph.enabled=true --set kiali.enabled=true \
  --set prometheus.enabled=true --set global.mtls.enabled=false --namespace istio-system \
  install/kubernetes/helm/istio >on.yaml
helm template --set grafana.enabled=false --set servicegraph.enabled=false --set kiali.enabled=false \
  --set prometheus.enabled=false --set global.mtls.enabled=false --namespace istio-system \
  install/kubernetes/helm/istio >off.yaml
diff --line-format=%L on.yaml off.yaml >grafana.yaml
kubectl apply -f grafana.yaml

I remain somewhat unclear on the relationship between promsd and prometheus, but that’s specifically enabling prometheus, not promsd.


#4

Istio on GKE uses its own internal prometheus for internal metrics, which cannot be modified. This prometheus was called “prometheus” before and was renamed to “promsd” in 1.0.3-gke.3 to avoid colliding with a user-installed prometheus.

If you’re using gke.0, you have to call your prometheus something different, e.g. prometheus-user, otherwise it will conflict. That’s why the docs point to different yamls for versions gke.0 and gke.3: the former installs prometheus as prometheus-user.
The easiest way is to use 1.0.3-gke.3, and then your approach to installing prometheus should work.


#5

Thank you, but I am using 1.0.3-gke.3 and the metrics are not coming through. Does the lack of istio-mesh indicate that something was broken in my installation process?

Also, I would like to understand what, exactly, is automatically overriding my Prometheus config changes. From what you’re saying, it sounds like Istio for some reason thinks my prometheus is promsd and is making the changes. If this is the case, where is the ‘meta-config’ responsible for this and are there logs somewhere for its actions?
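
One way to narrow down what is reverting the ConfigMap, as a rough sketch rather than a confirmed diagnosis: the Kubernetes addon manager periodically restores objects labelled addonmanager.kubernetes.io/mode=Reconcile, so checking for that label is a reasonable first step. Whether Istio on GKE actually manages this ConfigMap that way is an assumption here; the names "prometheus" and "istio-system" are taken from the commands above, and addon_mode_verdict is a hypothetical helper written for this example.

```shell
#!/bin/sh
# Assumed names, taken from the thread: ConfigMap "prometheus" in "istio-system".
NS=istio-system
CM=prometheus

# Hypothetical helper: turn the value of the addon-manager mode label
# into a short verdict. An empty value means the label is absent.
addon_mode_verdict() {
  case "$1" in
    Reconcile)    echo "reverted periodically by the addon manager" ;;
    EnsureExists) echo "recreated only if deleted; edits persist" ;;
    "")           echo "no addon-manager label; look for another controller" ;;
    *)            echo "unknown mode: $1" ;;
  esac
}

# Only query the cluster when kubectl is available.
if command -v kubectl >/dev/null 2>&1; then
  # Dots inside the label key must be escaped in a jsonpath expression.
  mode=$(kubectl get configmap "$CM" -n "$NS" --request-timeout=10s \
    -o jsonpath='{.metadata.labels.addonmanager\.kubernetes\.io/mode}')
  addon_mode_verdict "$mode"
fi
```

If the label is absent, running kubectl get configmap prometheus --namespace=istio-system --watch in a second terminal while applying the edit at least shows exactly when the object is rewritten, which can then be matched against controller logs.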