How to investigate 503/BlackHoleCluster for an outbound request via egress gateway (TLS origination)?

Hi,

I’m stuck on this, so perhaps someone can point me in the right direction. Any feedback is appreciated.

I have a simple test App which should connect to an external Redis cluster when triggered via an API call:

[redis-tls-app] ---TCP---> [egress gw] ---TLS---> [redis cluster]

The App attempts to connect to clustercfg.cccccccccc.cache.amazonaws.com on port 8180.

I got this working without the egress gateway (TLS is originated by the Envoy sidecar proxy), so I’m sure there are no issues with the test App, the handling of ingress traffic, or the Redis cluster.

However, I cannot get it working with TLS origination at the egress gateway. I see the following error in the App sidecar proxy log:

[2023-11-30T13:35:58.559Z] "- - -" 0 UH - - "-" 0 0 0 - "-" "-" "-" "-" "-" BlackHoleCluster - 10.100.251.12:8180 10.100.77.116:44698 - -
[2023-11-30T13:35:58.569Z] "GET /red/ping HTTP/1.1" 503 - via_upstream - "-" 0 0 5 3 "10.100.78.9" "curl/8.4.0" "0e7fe065-400e-4fc1-afbd-eb4876206e0b" "api.redis-tls.k8s.cluster" "10.100.77.116:8080" inbound|8080|| 127.0.0.6:47507 10.100.77.116:8080 10.100.78.9:0 outbound_.8080_._.redis-tls-app.redis-tls.svc.cluster.local default

10.100.251.12:8180 is the IP/port of the Redis cluster, and this traffic should be redirected to the egress gateway. It doesn’t even seem to leave the App sidecar.
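
For reading the first log line field by field, here is a minimal Python sketch. It assumes the sidecar emits Istio's default text access log format; the names in `FIELDS` are labels picked for this illustration, not identifiers from Istio. In Envoy's documentation the `UH` response flag stands for "no healthy upstream hosts", and the sketch only makes it easier to see that flag next to the `BlackHoleCluster` upstream cluster and the original destination address.

```python
import shlex

# Assumption: field order of Istio's default text access log format.
FIELDS = [
    "start_time", "request", "response_code", "response_flags",
    "response_code_details", "connection_termination_details",
    "upstream_transport_failure_reason", "bytes_received", "bytes_sent",
    "duration", "upstream_service_time", "x_forwarded_for", "user_agent",
    "request_id", "authority", "upstream_host", "upstream_cluster",
    "upstream_local_address", "downstream_local_address",
    "downstream_remote_address", "requested_server_name", "route_name",
]


def parse_access_log(line: str) -> dict:
    """Split one access log line into named fields; quoted values stay whole."""
    tokens = shlex.split(line)
    if len(tokens) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(tokens)}")
    return dict(zip(FIELDS, tokens))


# First log line from the post, copied verbatim.
line = (
    '[2023-11-30T13:35:58.559Z] "- - -" 0 UH - - "-" 0 0 0 - '
    '"-" "-" "-" "-" "-" BlackHoleCluster - '
    '10.100.251.12:8180 10.100.77.116:44698 - -'
)
entry = parse_access_log(line)
print(entry["response_flags"], entry["upstream_cluster"],
      entry["downstream_local_address"])
```

For the line above this prints `UH BlackHoleCluster 10.100.251.12:8180`, i.e. the connection to the Redis address was handed to the black hole cluster rather than to a cluster for the egress gateway.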

I’m using Istio 1.15.2 and cannot upgrade to a higher version because of other dependencies.

My egress setup for the App:

apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: redis-tls-external-srv
  namespace: redis-tls
spec:
  exportTo:
    - "."
  hosts:
    - clustercfg.cccccccccc.cache.amazonaws.com
  ports:
    - number: 8180
      name: http-for-apps-and-gateway
      protocol: HTTP
    - number: 6379
      name: https-for-redis
      protocol: HTTPS
  resolution: DNS
  location: MESH_EXTERNAL
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: redis-tls-external-vs
  namespace: redis-tls
spec:
  exportTo:
    - "*"
  hosts:
    - clustercfg.cccccccccc.cache.amazonaws.com
  gateways:
    - mesh
    - redis-tls-egressgateway
  http:
    - match:
        - gateways:
            - mesh
          port: 8180
      route:
        - destination:
            host: istio-egressgateway.istio-system.svc.cluster.local
            subset: redis-tls-external-srv
            port:
              number: 8180
          weight: 100
    - match:
        - gateways:
            - redis-tls-egressgateway
          port: 8180
      route:
        - destination:
            host: clustercfg.cccccccccc.cache.amazonaws.com
            port:
              number: 6379
          weight: 100
---
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: redis-tls-egressgateway
  namespace: redis-tls
spec:
  selector:
    istio: egressgateway
  servers:                  # this is configuration from the mesh to the egress gateway
    - port:
        number: 8180
        name: https-egressgateway
        protocol: HTTPS
      hosts:
        - clustercfg.cccccccccc.cache.amazonaws.com
      tls:
        mode: ISTIO_MUTUAL
---
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: redis-tls-client-to-egressgateway
  namespace: redis-tls
spec:
  exportTo:
    - "*"
  host: istio-egressgateway.istio-system.svc.cluster.local
  subsets:
    - name: redis-tls-external-srv
      labels:
        istio: egressgateway
      trafficPolicy:
        loadBalancer:
          simple: ROUND_ROBIN
        portLevelSettings:
          - port:
              number: 8180
            tls:
              mode: ISTIO_MUTUAL
              sni: clustercfg.cccccccccc.cache.amazonaws.com
---
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: redis-tls-egressgateway-to-redis-cluster
  namespace: redis-tls
spec:
  exportTo:
    - "*"
  host: clustercfg.cccccccccc.cache.amazonaws.com
  trafficPolicy:
    loadBalancer:
      simple: ROUND_ROBIN
    portLevelSettings:
      - port:
          number: 6379
        tls:
          mode: SIMPLE # initiates TLS for connections to redis cluster
          sni: clustercfg.cccccccccc.cache.amazonaws.com
          caCertificates: /usr/share/ca-certificates/mozilla/Amazon_Root_CA_1.crt

This is the cluster configuration of the App sidecar:

    {
        "name": "outbound|6379||clustercfg.cccccccccc.cache.amazonaws.com",
        "type": "STRICT_DNS",
        "connectTimeout": "10s",
        "lbPolicy": "LEAST_REQUEST",
        "loadAssignment": {
            "clusterName": "outbound|6379||clustercfg.cccccccccc.cache.amazonaws.com",
            "endpoints": [
                {
                    "locality": {},
                    "lbEndpoints": [
                        {
                            "endpoint": {
                                "address": {
                                    "socketAddress": {
                                        "address": "clustercfg.cccccccccc.cache.amazonaws.com",
                                        "portValue": 6379
                                    }
                                }
                            },
                            "metadata": {
                                "filterMetadata": {
                                    "istio": {
                                        "workload": ";;;;"
                                    }
                                }
                            },
                            "loadBalancingWeight": 1
                        }
                    ],
                    "loadBalancingWeight": 1
                }
            ]
        },
        "circuitBreakers": {
            "thresholds": [
                {
                    "maxConnections": 4294967295,
                    "maxPendingRequests": 4294967295,
                    "maxRequests": 4294967295,
                    "maxRetries": 4294967295,
                    "trackRemaining": true
                }
            ]
        },
        "dnsRefreshRate": "60s",
        "respectDnsTtl": true,
        "dnsLookupFamily": "V4_ONLY",
        "commonLbConfig": {
            "localityWeightedLbConfig": {}
        },
        "metadata": {
            "filterMetadata": {
                "istio": {
                    "default_original_port": 6379,
                    "services": [
                        {
                            "host": "clustercfg.cccccccccc.cache.amazonaws.com",
                            "name": "clustercfg.cccccccccc.cache.amazonaws.com",
                            "namespace": "redis-tls"
                        }
                    ]
                }
            }
        },
        "filters": [
            {
                "name": "istio.metadata_exchange",
                "typedConfig": {
                    "@type": "type.googleapis.com/envoy.tcp.metadataexchange.config.MetadataExchange",
                    "protocol": "istio-peer-exchange"
                }
            }
        ]
    },
    {
        "name": "outbound|8180||clustercfg.cccccccccc.cache.amazonaws.com",
        "type": "STRICT_DNS",
        "connectTimeout": "10s",
        "lbPolicy": "LEAST_REQUEST",
        "loadAssignment": {
            "clusterName": "outbound|8180||clustercfg.cccccccccc.cache.amazonaws.com",
            "endpoints": [
                {
                    "locality": {},
                    "lbEndpoints": [
                        {
                            "endpoint": {
                                "address": {
                                    "socketAddress": {
                                        "address": "clustercfg.cccccccccc.cache.amazonaws.com",
                                        "portValue": 8180
                                    }
                                }
                            },
                            "metadata": {
                                "filterMetadata": {
                                    "istio": {
                                        "workload": ";;;;"
                                    }
                                }
                            },
                            "loadBalancingWeight": 1
                        }
                    ],
                    "loadBalancingWeight": 1
                }
            ]
        },
        "circuitBreakers": {
            "thresholds": [
                {
                    "maxConnections": 4294967295,
                    "maxPendingRequests": 4294967295,
                    "maxRequests": 4294967295,
                    "maxRetries": 4294967295,
                    "trackRemaining": true
                }
            ]
        },
        "dnsRefreshRate": "60s",
        "respectDnsTtl": true,
        "dnsLookupFamily": "V4_ONLY",
        "commonLbConfig": {
            "localityWeightedLbConfig": {}
        },
        "metadata": {
            "filterMetadata": {
                "istio": {
                    "default_original_port": 8180,
                    "services": [
                        {
                            "host": "clustercfg.cccccccccc.cache.amazonaws.com",
                            "name": "clustercfg.cccccccccc.cache.amazonaws.com",
                            "namespace": "redis-tls"
                        }
                    ]
                }
            }
        },
        "filters": [
            {
                "name": "istio.metadata_exchange",
                "typedConfig": {
                    "@type": "type.googleapis.com/envoy.tcp.metadataexchange.config.MetadataExchange",
                    "protocol": "istio-peer-exchange"
                }
            }
        ]
    },
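
The cluster names in this dump follow a `direction|port|subset|host` layout. A small Python sketch, assuming that four-part layout holds for every name (the `ClusterName` type and `parse_cluster_name` helper are hypothetical names for this illustration), makes it easier to tell the two clusters above, which have an empty subset, apart from the subset cluster that the VirtualService routes to:

```python
from typing import NamedTuple


class ClusterName(NamedTuple):
    direction: str
    port: int
    subset: str
    host: str


def parse_cluster_name(name: str) -> ClusterName:
    """Split a 'direction|port|subset|host' cluster name into its parts."""
    direction, port, subset, host = name.split("|")
    return ClusterName(direction, int(port), subset, host)


# Names copied from the post: a ServiceEntry cluster and the gateway subset cluster.
external = parse_cluster_name(
    "outbound|8180||clustercfg.cccccccccc.cache.amazonaws.com"
)
via_gateway = parse_cluster_name(
    "outbound|8180|redis-tls-external-srv|"
    "istio-egressgateway.istio-system.svc.cluster.local"
)
print(external)
print(via_gateway)
```

Names that are not four-part cluster names, such as `BlackHoleCluster`, raise a `ValueError` here, which is a quick way to spot them in a longer list.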

And the routes on the App sidecar:

    {
        "name": "8180",
        "virtualHosts": [
            {
                "name": "block_all",
                "domains": [
                    "*"
                ],
                "routes": [
                    {
                        "name": "block_all",
                        "match": {
                            "prefix": "/"
                        },
                        "directResponse": {
                            "status": 502
                        }
                    }
                ],
                "includeRequestAttemptCount": true
            },
            {
                "name": "clustercfg.cccccccccc.cache.amazonaws.com:8180",
                "domains": [
                    "clustercfg.cccccccccc.cache.amazonaws.com",
                    "clustercfg.cccccccccc.cache.amazonaws.com:8180"
                ],
                "routes": [
                    {
                        "match": {
                            "prefix": "/",
                            "caseSensitive": true
                        },
                        "route": {
                            "cluster": "outbound|8180|redis-tls-external-srv|istio-egressgateway.istio-system.svc.cluster.local",
                            "timeout": "0s",
                            "retryPolicy": {
                                "retryOn": "connect-failure,refused-stream,unavailable,cancelled,retriable-status-codes",
                                "numRetries": 2,
                                "retryHostPredicate": [
                                    {
                                        "name": "envoy.retry_host_predicates.previous_hosts",
                                        "typedConfig": {
                                            "@type": "type.googleapis.com/envoy.extensions.retry.host.previous_hosts.v3.PreviousHostsPredicate"
                                        }
                                    }
                                ],
                                "hostSelectionRetryMaxAttempts": "5",
                                "retriableStatusCodes": [
                                    503
                                ]
                            },
                            "maxStreamDuration": {
                                "maxStreamDuration": "0s",
                                "grpcTimeoutHeaderMax": "0s"
                            }
                        },
                        "metadata": {
                            "filterMetadata": {
                                "istio": {
                                    "config": "/apis/networking.istio.io/v1alpha3/namespaces/redis-tls/virtual-service/redis-tls-external-vs"
                                }
                            }
                        },
                        "decorator": {
                            "operation": "istio-egressgateway.istio-system.svc.cluster.local:8180/*"
                        }
                    }
                ],
                "includeRequestAttemptCount": true
            },
            {
                "name": "istio-ingressgateway.istio-system.svc.cluster.local:8180",
                "domains": [
                    "istio-ingressgateway.istio-system.svc.cluster.local",
                    "istio-ingressgateway.istio-system.svc.cluster.local:8180",
                    "istio-ingressgateway.istio-system",
                    "istio-ingressgateway.istio-system:8180",
                    "istio-ingressgateway.istio-system.svc",
                    "istio-ingressgateway.istio-system.svc:8180",
                    "172.20.186.144",
                    "172.20.186.144:8180"
                ],
                "routes": [
                    {
                        "name": "default",
                        "match": {
                            "prefix": "/"
                        },
                        "route": {
                            "cluster": "outbound|8180||istio-ingressgateway.istio-system.svc.cluster.local",
                            "timeout": "0s",
                            "retryPolicy": {
                                "retryOn": "connect-failure,refused-stream,unavailable,cancelled,retriable-status-codes",
                                "numRetries": 2,
                                "retryHostPredicate": [
                                    {
                                        "name": "envoy.retry_host_predicates.previous_hosts",
                                        "typedConfig": {
                                            "@type": "type.googleapis.com/envoy.extensions.retry.host.previous_hosts.v3.PreviousHostsPredicate"
                                        }
                                    }
                                ],
                                "hostSelectionRetryMaxAttempts": "5",
                                "retriableStatusCodes": [
                                    503
                                ]
                            },
                            "maxStreamDuration": {
                                "maxStreamDuration": "0s",
                                "grpcTimeoutHeaderMax": "0s"
                            }
                        },
                        "decorator": {
                            "operation": "istio-ingressgateway.istio-system.svc.cluster.local:8180/*"
                        }
                    }
                ],
                "includeRequestAttemptCount": true
            }
        ],
        "validateClusters": false
    },
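
To see at a glance where each virtual host in this route config sends traffic, the dump can be reduced to a name-to-target mapping. This is a minimal Python sketch; `route_targets` is a hypothetical helper, and the embedded sample is a trimmed copy of the route config above, so a full dump will produce more entries.

```python
import json


def route_targets(route_config: dict) -> dict:
    """Map each virtual host name to the clusters or direct responses of its routes."""
    targets = {}
    for vhost in route_config.get("virtualHosts", []):
        actions = []
        for route in vhost.get("routes", []):
            if "route" in route:
                actions.append(route["route"].get("cluster", "<no cluster>"))
            elif "directResponse" in route:
                status = route["directResponse"]["status"]
                actions.append(f"direct response {status}")
        targets[vhost["name"]] = actions
    return targets


# Trimmed copy of the "8180" route config from the post.
sample = json.loads("""
{
  "name": "8180",
  "virtualHosts": [
    {
      "name": "block_all",
      "domains": ["*"],
      "routes": [
        {"name": "block_all", "match": {"prefix": "/"},
         "directResponse": {"status": 502}}
      ]
    },
    {
      "name": "clustercfg.cccccccccc.cache.amazonaws.com:8180",
      "domains": ["clustercfg.cccccccccc.cache.amazonaws.com",
                  "clustercfg.cccccccccc.cache.amazonaws.com:8180"],
      "routes": [
        {"match": {"prefix": "/", "caseSensitive": true},
         "route": {"cluster": "outbound|8180|redis-tls-external-srv|istio-egressgateway.istio-system.svc.cluster.local"}}
      ]
    }
  ]
}
""")

for name, actions in route_targets(sample).items():
    print(name, "->", actions)
```

Note that these are HTTP routes, so they only apply to requests that the sidecar parses as HTTP with a matching Host header.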

Is there anything else I can check?

Thanks.

I also see this in the istiod log:

2023-11-30T16:05:29.441821Z     warn    constructed http route config for route https.8180.https-egressgateway.redis-tls-egressgateway.redis-tls on port 8180 with no vhosts; Setting up a default 404 vhost
2023-11-30T16:05:29.553376Z     warn    buildGatewayRoutes: no gateways for router istio-egressgateway-6c59875c45-hdf2w.istio-system

I suppose this is the problem. What could be the root cause?

I wonder if I can get any direction on this from the community wizards, perhaps @howardjohn, @hzxuzhonghu, or anyone else (sorry, I’m new here). Any pointers would be appreciated.

Gateways and the app seem to be in sync:

NAME                                                  CLUSTER        CDS        LDS        EDS        RDS        ECDS         ISTIOD                      VERSION
istio-egressgateway-6c59875c45-hdf2w.istio-system     Kubernetes     SYNCED     SYNCED     SYNCED     SYNCED     NOT SENT     istiod-5985966969-46qq7     1.15.2
istio-ingressgateway-b469f8549-vtmqg.istio-system     Kubernetes     SYNCED     SYNCED     SYNCED     SYNCED     NOT SENT     istiod-5985966969-46qq7     1.15.2
redis-tls-app-5fc67968f9-j6ktm.redis-tls              Kubernetes     SYNCED     SYNCED     SYNCED     SYNCED     NOT SENT     istiod-5985966969-46qq7     1.15.2

I enabled debug logging for istiod and can see this:

2023-12-04T06:54:38.230901Z	debug	buildGatewayListeners: no gateways for router istio-egressgateway-6c59875c45-hdf2w.istio-system

2023-12-04T06:54:40.161237Z     debug   ads     ADS:EDS: RESOURCE CHANGE added map[outbound|15021|redis-tls-external-srv|istio-egressgateway.istio-system.svc.cluster.local:{} outbound|443|redis-tls-external-srv|istio-egressgateway.istio-system.svc.cluster.local:{} outbound|8080|redis-tls-external-srv|istio-egressgateway.istio-system.svc.cluster.local:{} outbound|80|redis-tls-external-srv|istio-egressgateway.istio-system.svc.cluster.local:{} outbound|8180|redis-tls-external-srv|istio-egressgateway.istio-system.svc.cluster.local:{}] removed map[] redis-tls-app-f4d846d65-ct4cz.redis-tls-13093 2023-12-04T06:54:40Z/1148 WPUP9nExEJM=fd2b57b6-6bb6-4600-9645-c920cf7ffa93

However, the listeners on the egress gateway seem to be fine:

$ kubectl istio proxy-config listeners -n istio-system istio-egressgateway-6c59875c45-hdf2w
ADDRESS PORT  MATCH                                           DESTINATION
0.0.0.0 8180  SNI: clustercfg.cccccccccc.cache.amazonaws.com  Route: https.8180.http-egressgateway.redis-tls-egressgateway.redis-tls
0.0.0.0 15021 ALL                                             Inline Route: /healthz/ready*
0.0.0.0 15090 ALL                                             Inline Route: /stats/prometheus*
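
One thing that can be cross-checked mechanically is the route name. The sketch below assumes the gateway route name follows the pattern `https.<port>.<server port name>.<gateway name>.<namespace>`; that pattern is inferred from the istiod warning in this thread, not taken from documentation, and `expected_https_route_name` is a hypothetical helper.

```python
def expected_https_route_name(port: int, port_name: str,
                              gateway: str, namespace: str) -> str:
    """Build the route name under the assumed https.<port>.<name>.<gw>.<ns> pattern."""
    return f"https.{port}.{port_name}.{gateway}.{namespace}"


# Values taken from the Gateway resource in the post.
expected = expected_https_route_name(
    8180, "https-egressgateway", "redis-tls-egressgateway", "redis-tls"
)

# Route names copied from the istiod warning and from the listener output.
from_warning = "https.8180.https-egressgateway.redis-tls-egressgateway.redis-tls"
from_listener = "https.8180.http-egressgateway.redis-tls-egressgateway.redis-tls"

print("matches istiod warning: ", expected == from_warning)
print("matches listener output:", expected == from_listener)
```

With the values in this thread the first comparison is true and the second is false: the listener output shows `http-egressgateway` where the Gateway resource has `https-egressgateway`. That may only mean the outputs were captured at different revisions of the config, but it is worth confirming that the gateway is serving the currently applied Gateway resource.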

I’m running out of ideas about what to check next. Is this perhaps something that was fixed in newer versions of Istio?