Kiali reports red health status for grpc-web

Hello, community.

I’m testing a deployment with Istio 1.7.3 on a DigitalOcean Kubernetes cluster.
I have an HTTP service with the web application and a number of gRPC services proxying the API to the web application via grpc-web.

Thanks for Istio: everything is working almost the same as in local development with MicroK8s.
But on the public cluster I see that the health status of the gRPC services is Failure.

I have not found any failed requests in any of the logs I’ve checked, nor in traces, nor in Grafana, nor in the browser network tab.

At the same time, the status of the web application is reasonable.
We have a number of 404 routes, which are not implemented; the Kiali status is Yellow, and the failed requests can easily be found in traces, are visible in Grafana, etc.

I installed the dashboard just as described in the documentation, since we are only testing for now, and I run it with

istioctl dashboard kiali

the same as for the other dashboards.
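
For completeness, the other dashboards are started the same way, i.e. with grafana, jaeger or prometheus in place of kiali (if I remember the subcommand names correctly; they may differ between Istio versions):

istioctl dashboard grafana
istioctl dashboard jaeger
istioctl dashboard prometheus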

For reference, the configuration:
Gateway:

kind: Gateway
apiVersion: networking.istio.io/v1alpha3
metadata:
  name: environment-common
  namespace: default
spec:
  servers:
    - hosts:
        - '*'
      port:
        name: http
        number: 80
        protocol: HTTP2
      tls:
        httpsRedirect: true
    - hosts:
        - '*'
      port:
        name: https
        number: 443
        protocol: HTTPS
      tls:
        credentialName: ingress-cert
        mode: SIMPLE
  selector:
    istio: ingressgateway

Service:

apiVersion: v1
kind: Service
metadata:
  name: interests
  labels:
    app: interests
    helm.sh/chart: interests-0.0.1
    app.kubernetes.io/name: interests
    app.kubernetes.io/instance: interests
    app.kubernetes.io/version: "latest"
    app.kubernetes.io/managed-by: Helm
spec:
  type: ClusterIP
  ports:
    - port: 9090
      name: grpc-web
  selector:
    app: interests
    app.kubernetes.io/name: interests
    app.kubernetes.io/instance: interests

VirtualService:

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: interests
spec:
  hosts:
  - '*'
  gateways:
  - environment-common
  http:
  - match:
    - uri:
        prefix: /<path-to-grpc-package>.interests.Interests
    route:
    - destination:
        host: interests
        port:
          number: 9090
    corsPolicy:
      allowOrigins:
        - exact: "*"
      allowMethods:
        - POST
        - GET
        - OPTIONS
        - PUT
        - DELETE
      allowHeaders:
        - grpc-timeout
        - keep-alive
        - user-agent
        - cache-control
        - content-type
        - content-transfer-encoding
        - x-accept-content-transfer-encoding
        - x-accept-response-streaming
        - x-user-agent
        - x-grpc-web
      maxAge: 1728s
      exposeHeaders:
        - grpc-status
        - grpc-message
      allowCredentials: true

What could I be missing?

It looks like about half of the requests being made to interests are failing. Try clicking on the edge leading into the service node (the triangle) and see what the side panel shows. Maybe try the Hosts or Flags tab to see a breakdown of error codes.

Hi jshaughn,
Thanks for your interest in this post.
It is very strange:
I see 100% successful requests on the “Traffic” tab, but the Health Overview still shows 50% failures.
Overview: (screenshot)

Traffic: (screenshot)

I wonder if it could be a rounding error. Given the low rate of 0.03 requests per second, precision may come into play. Could you confirm the metrics in Prometheus by executing the query:

sum(istio_requests_total{destination_service_name="interests"}) by (reporter, response_code, source_workload, destination_workload)

For info about querying Prometheus, see https://kiali.io/documentation/staging/faq/#prometheus
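
If that turns up unexpected non-2xx codes, a per-flag breakdown can help match it against the Flags tab in the graph. This is just a sketch using the standard istio_requests_total labels; the 5m rate window is an arbitrary choice:

sum(rate(istio_requests_total{destination_service_name="interests", response_code!~"2.."}[5m])) by (reporter, response_code, response_flags)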