We are running Istio 1.6.5.
The solution looks like this:
ingress gw routes to service 1
service 1 orchestrates calls to services a, b and c, and collates the responses into a single reply
service c calls a third party via the egress gw
When the third party has an “issue”, we see HTTP 503s returned at the egress gw / service c / service 1 / ingress gw.
The third-party API isn't idempotent. The developers are aware of this and have retry logic in the application.
Initially we had a problem with the third-party API call being retried by the Istio proxy at several levels, because the default retry policy retries when it receives an HTTP 503.
In ALL our virtual services between ingress and egress we've applied a retry policy of 0 retries to stop the proxy from retrying.
This fixes our immediate problem - the error / retry is handled in the application.
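For anyone following along, the per-route setting described above can be sketched roughly as follows. This is a minimal illustration rather than our exact config: the name and host `service-c` are placeholders, and the same `retries` stanza would be repeated on each virtual service along the path.

```yaml
# Hypothetical VirtualService; "service-c" is a placeholder name and host.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: service-c
spec:
  hosts:
  - service-c
  http:
  - route:
    - destination:
        host: service-c
    retries:
      # 0 attempts disables proxy-level retries for this route,
      # leaving retry handling to the application.
      attempts: 0
```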
The question is: is there a downside to removing all retries wholesale in the istio-proxy?
For reference, the default retry policy looks like this:
"retryPolicy": {
  "retryOn": "connect-failure,refused-stream,unavailable,cancelled,retriable-status-codes",
  "numRetries": 2,
  "retryHostPredicate": [
    {
      "name": "envoy.retry_host_predicates.previous_hosts"
    }
  ],
  "hostSelectionRetryMaxAttempts": "5",
  "retriableStatusCodes": [
    503
  ]