503 between pod-to-pod communication (1.5.1)

Just wanted to report back on this thread, since it is the only useful documentation I have found on accessing a pod IP directly. The headless service approach worked in terms of connecting, but it doesn't handle failures well. When one of the pods in the upstream service is killed, the downstream Envoy proxy keeps sending requests to it. This appears to continue indefinitely, until the service is idle for some period and the connections are discarded.

@sdake I have raised a GitHub issue, but haven’t had any response to it yet.

This solution looks perfect! But I’m curious: what is the best way to monitor the Prometheus server itself? I have two Prometheus servers that I want to scrape each other. When I do it over a headless service, one of the scrapes returns HTTP status 503 Service Unavailable.

- job_name: prometheus
  scrape_interval: 30s
  static_configs:
    - targets:
        - server-0.headless:9090
        - server-1.headless:9090

Is this the right approach, or is there something else that could complement it?

We use a heartbeat: Prometheus has a Watchdog alert that is always firing. This alert goes to Alertmanager, and Alertmanager can forward it as a heartbeat to PagerDuty, Opsgenie, etc.

Prometheus (Watchdog alert) → Alertmanager → PagerDuty / Opsgenie etc.
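For reference, the always-firing Watchdog alert is typically defined with a rule like the following. This is a minimal sketch, not from this thread; the `Watchdog` name and group name follow the kube-prometheus convention and may differ in your setup:

```yaml
groups:
  - name: meta
    rules:
      - alert: Watchdog
        # vector(1) always evaluates to a value, so this alert fires
        # continuously as long as Prometheus rule evaluation is working.
        expr: vector(1)
        labels:
          severity: none
        annotations:
          summary: Always-firing alert used to verify the alerting pipeline.
```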

If the pipeline above breaks, then PagerDuty/Opsgenie etc. can alert. This is a heartbeat-type (dead man's switch) alert: if the heartbeat isn't received for x seconds, it fires.
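On the Alertmanager side, the Watchdog alert can be routed to a dedicated heartbeat receiver with a short `repeat_interval`, so the external service sees a steady pulse. A minimal sketch, assuming Alertmanager ≥ 0.22 (for the `matchers` syntax); the receiver name and webhook URL are placeholders, not from this thread:

```yaml
route:
  receiver: default
  routes:
    # Send the always-firing Watchdog alert to the heartbeat receiver,
    # re-notifying every minute so the external service sees a steady pulse.
    - receiver: deadmansswitch
      matchers:
        - alertname = "Watchdog"
      repeat_interval: 1m

receivers:
  - name: default
  - name: deadmansswitch
    webhook_configs:
      # Placeholder URL: point this at your heartbeat endpoint
      # (PagerDuty, Opsgenie, healthchecks.io, etc.).
      - url: https://example.com/heartbeat
```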

@deepak_deore Hey! Were you able to solve this problem? I ran into the same issue when trying to inject the Istio sidecar into Prometheus.