I have a GKE cluster with Istio 1.4 installed, and I disabled the Prometheus deployment in the Istio manifest so I could deploy the Prometheus Operator (kube-prometheus) myself.
I enabled sidecar auto-injection, so all of my monitoring deployments now have the Envoy sidecar (Grafana, Alertmanager, Prometheus Operator, Prometheus, kube-state-metrics).
I followed the Istio recommendation to enable mTLS globally using the default MeshPolicy, and I set a DestinationRule so Grafana can be scraped:
apiVersion: "networking.istio.io/v1alpha3"
kind: "DestinationRule"
metadata:
name: "grafana"
namespace: "monitoring"
spec:
host: "grafana.monitoring.svc.cluster.local"
trafficPolicy:
portLevelSettings:
- port:
number: 3000
tls:
mode: ISTIO_MUTUAL
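For reference, mesh-wide mTLS was enabled with the default MeshPolicy from the Istio docs (this is the standard example from the docs, which is essentially what I applied):

apiVersion: "authentication.istio.io/v1alpha1"
kind: "MeshPolicy"
metadata:
  name: "default"
spec:
  peers:
  - mtls: {}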
But Prometheus is still reporting an error on the Grafana target: server returned HTTP status 503 Service Unavailable
I tested with istioctl:
istioctl authn tls-check prometheus-k8s-0.monitoring grafana.monitoring.svc.cluster.local
And the output seems to be ok:
HOST:PORT                                    STATUS  SERVER  CLIENT        AUTHN POLICY  DESTINATION RULE
grafana.monitoring.svc.cluster.local:3000    OK      STRICT  ISTIO_MUTUAL  /default      monitoring/grafana
Does anyone know what’s happening?
Edit
It’s really strange because I deployed the Istio mesh ServiceMonitor below and it’s working. Could this be related to the namespace?
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    monitoring: istio-mesh
    release: istio
  name: istio-mesh-monitor
  namespace: istio-system
spec:
  endpoints:
  - interval: 5s
    port: prometheus
  - interval: 5s
    port: http-monitoring
  namespaceSelector:
    matchNames:
    - istio-system
  selector:
    matchExpressions:
    - key: istio
      operator: In
      values:
      - mixer
Edit 2
I figured out why Prometheus is failing to scrape Grafana: it requests the metrics endpoint by pod IP, so Envoy doesn’t know how to route the request and returns a 503. Does anyone know how to make Envoy accept requests made by IP? I don’t know if it’s possible to enable this through any Istio resource.
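One workaround I’m looking at (just a sketch, I haven’t verified it) is to let Prometheus originate the mTLS connection itself, the way Istio’s bundled Prometheus scrapes sidecar-protected pods: mount the istio.default certificates into the Prometheus pods and reference them from the ServiceMonitor’s tlsConfig. The port name, labels, and Prometheus resource name below are assumptions about my kube-prometheus setup; the mount path comes from the Prometheus Operator mounting entries of spec.secrets under /etc/prometheus/secrets/<secret name>/, and insecureSkipVerify is needed because the scrape targets a pod IP, which isn’t in the certificate’s SANs.

# Sketch, not verified: give Prometheus the Istio workload certs so it can
# scrape mTLS-protected pods directly (similar to Istio's bundled Prometheus).
# Assumes the istio.default secret exists in the monitoring namespace.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s                  # assumed name of the existing Prometheus resource
  namespace: monitoring
spec:
  # ...rest of the existing Prometheus spec...
  secrets:
  - istio.default            # mounted at /etc/prometheus/secrets/istio.default/
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: grafana
  namespace: monitoring
spec:
  endpoints:
  - port: http               # assumed port name on the Grafana service
    interval: 15s
    scheme: https
    tlsConfig:
      caFile: /etc/prometheus/secrets/istio.default/root-cert.pem
      certFile: /etc/prometheus/secrets/istio.default/cert-chain.pem
      keyFile: /etc/prometheus/secrets/istio.default/key.pem
      # the target is a pod IP, which is not in the certificate's SANs
      insecureSkipVerify: true
  selector:
    matchLabels:
      app: grafana           # assumed label on the Grafana service
  namespaceSelector:
    matchNames:
    - monitoring

If that works, the scrape itself goes over mTLS to the Grafana sidecar, so the plaintext-by-IP problem shouldn’t matter anymore.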
AlertManager is another problem: I have a headless service with two separate endpoints pointing to the same location (IP:PORT). It’s something from the Prometheus Operator, but I guess this is simple to solve.