Retry is not working for 5xx

I am trying to configure retries, but it is not working for me. Below are the manifests I am using.

Service and Deployment

apiVersion: v1
kind: Service
metadata:
  name: prometheus-flask-app-prometheus-flask-app
  labels:
    app: prometheus-flask-app
    chart: sbd-prometheus-flask-app-1.1
    release: prometheus-flask-app
    heritage: Helm
spec:
  type: ClusterIP
  ports:
    - port: 80
      targetPort: 5000
      protocol: TCP
      name: sbd-prometheus-flask-app
  selector:
    app: prometheus-flask-app
    release: prometheus-flask-app
---

apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus-flask-app-prometheus-flask-app
  labels:
    app: prometheus-flask-app
    chart: sbd-prometheus-flask-app-1.1
    release: prometheus-flask-app
    heritage: Helm
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus-flask-app
      release: prometheus-flask-app
  template:
    metadata:
      labels:
        app: prometheus-flask-app
        release: prometheus-flask-app
    spec:
      containers:
        - name: sbd-prometheus-flask-app
          image: flaskapp:latest
          imagePullPolicy: Always
          ports:
            - containerPort: 5000
          livenessProbe:
            httpGet:
              path: /healthz
              port: 5000
            initialDelaySeconds: 600
          readinessProbe:
            httpGet:
              path: /healthz
              port: 5000
            initialDelaySeconds: 30
          resources:
            requests:
              cpu: 20m
              memory: 100Mi

Gateway

apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: prometheus-flask-app-prometheus-flask-app-gateway
spec:
  selector:
    istio: ingressgateway # use istio default controller
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "*"

VirtualService

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: prometheus-flask-app-prometheus-flask-app-vs
spec:
  hosts:
  - "*"
  gateways:
  - prometheus-flask-app-prometheus-flask-app-gateway
  http:
  - route:
    - destination:
        host: prometheus-flask-app-prometheus-flask-app.knative-deployment.svc.cluster.local # knative-deployment is the namespace name
        port:
          number: 80
        subset: v1
    retries:
      attempts: 3
      perTryTimeout: 5s
      retryOn: 5xx,gateway-error,connect-failure,refused-stream,reset,retriable-status-codes

I then scaled the pods to 0 with the kubectl scale command to check whether the retry is working. When I hit the endpoint, I got the message "no healthy upstream", so I checked the logs of the istio-ingressgateway pod.

Here is the log output:

So how do I get retries working?
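
One thing worth checking in the VirtualService above: the route points at subset v1, and subsets are defined in a DestinationRule, which is not shown in this post. If no DestinationRule defines v1, the gateway has no endpoints for that route, whatever the retry settings are. A minimal sketch follows, assuming a hypothetical resource name and assuming the pods carry a version: v1 label. The Deployment above does not set that label, so either add it to the pod template or drop the subset line from the route.

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: prometheus-flask-app-prometheus-flask-app-dr  # hypothetical name
  namespace: knative-deployment
spec:
  host: prometheus-flask-app-prometheus-flask-app.knative-deployment.svc.cluster.local
  subsets:
  - name: v1        # must match the subset named in the VirtualService route
    labels:
      version: v1   # assumption: pods need this label; the Deployment above does not set it
```

Separately, scaling to 0 replicas leaves the service with no endpoints at all, so there is probably no upstream host for a retry to be sent to, which would explain the "no healthy upstream" message.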

Hi @anant, I am facing the same issue. I found some commands that report how many retries have happened, but it is still not working. One thing I can suggest: instead of scaling the pods to 0, either pause the container inside the pod or use fault injection, because the communication happens through istio-proxy.
Please let me know if you find a solution.

Hi @Anant @dikshantdevops, did you find a solution for this? I'm facing the same issue.

Hi @pranavgangrade, what issue exactly are you facing?

@dikshantdevops I also get liveness and readiness probe issues when adding the proxy to the pod.
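
For the probe issues, one setting that may be worth checking is Istio's probe rewriting, which routes the kubelet's HTTP probes through the sidecar agent instead of straight to the application port. Below is a sketch of the pod template fragment, assuming the sidecar.istio.io/rewriteAppHTTPProbers annotation is supported by the installed Istio version. The labels are copied from the Deployment in the question, and the rest of that Deployment stays unchanged.

```yaml
# Fragment of the Deployment's pod template; only the annotation is new.
spec:
  template:
    metadata:
      annotations:
        sidecar.istio.io/rewriteAppHTTPProbers: "true"  # assumption: supported by your Istio version
      labels:
        app: prometheus-flask-app
        release: prometheus-flask-app
```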

@pranavgangrade
Can you describe it in more detail? Have you just started implementing Istio in your cluster, or was it already in place and you are only trying to use the retry mechanism? Also, which commands are you using to check the logs?

@dikshantdevops I have just started implementing Istio in the cluster.

@pranavgangrade then check whether you have configured the Gateway and VirtualService correctly. Also, after enabling Istio sidecar injection on the namespace, you need to restart all your pods. I hope you are doing this.
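
For reference, automatic sidecar injection is usually enabled by labeling the namespace, sketched below with the knative-deployment namespace from the question. This assumes the default, non-revisioned injection setup. Pods that were already running are not modified by the label, which is why they need a restart afterwards, for example with kubectl rollout restart deployment -n knative-deployment.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: knative-deployment
  labels:
    istio-injection: enabled  # new pods in this namespace get the istio-proxy sidecar
```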