Egress gateway not working as expected, when installed with mesh expansion option. - v1.6.8

What we did?
Installed Istio on 2 clusters to act as a single mesh across both clusters; let's call them OPS-Cluster and Data-Cluster. The configs used are below.

Env:

  • Kubernetes 1.16.13-gke.1
  • GKE Cluster
  • Istio 1.6.8

Installation process:

  1. Created istio-system namespace and secrets in both the clusters

  2. Installed Istio on OPS-Cluster using below config
    OPS-Cluster config

        apiVersion: install.istio.io/v1alpha1
        kind: IstioOperator
        metadata:
          namespace: istio-system
          name: example-istiocontrolplane
        spec:
          profile: default
          tag: 1.6.8
          values:
            global:
              multiCluster:
                clusterName: ops
              network: mesh-network-main
              meshNetworks:
                mesh-network-main:
                  endpoints:
                  - fromRegistry:  ops
                  gateways:
                  - registry_service_name: istio-ingressgateway.istio-system.svc.cluster.local
                    port: 443
                mesh-network:
                  endpoints:
                  - fromRegistry: data
                  gateways:
                  - registry_service_name: istio-ingressgateway.istio-system.svc.cluster.local
                    port: 443
              meshExpansion:
                enabled: true
          meshConfig:
            enableAutoMtls: true
            accessLogFile: "/dev/stdout"
            accessLogEncoding: JSON
            outboundTrafficPolicy:
              mode: REGISTRY_ONLY
          components:
            ingressGateways:
            - name: istio-ingressgateway
              enabled: true
              k8s:
                hpaSpec:
                  minReplicas: 2
                service:
                  type: LoadBalancer
                  ports:
                  - name: http2
                    port: 80
                    targetPort: 8080
                  - name: https
                    port: 443
                    targetPort: 8443
                overlays:
                  - kind: Service
                    name: istio-ingressgateway
                    patches:
                      - path: spec.ports.[name:tcp-citadel-grpc-tls]
                service_annotations:
                  cloud.google.com/load-balancer-type: Internal
            egressGateways:
                - enabled: true
                  k8s:
                    env:
                    - name: ISTIO_META_ROUTER_MODE
                      value: sni-dnat
                    hpaSpec:
                      maxReplicas: 5
                      metrics:
                      - resource:
                          name: cpu
                          targetAverageUtilization: 80
                        type: Resource
                      minReplicas: 1
                      scaleTargetRef:
                        apiVersion: apps/v1
                        kind: Deployment
                        name: istio-egressgateway
                    resources:
                      limits:
                        cpu: 2000m
                        memory: 1024Mi
                      requests:
                        cpu: 100m
                        memory: 128Mi
                    service:
                      ports:
                      - name: http2
                        port: 80
                      - name: https
                        port: 443
                      - name: tls
                        port: 15443
                        targetPort: 15443
                    strategy:
                      rollingUpdate:
                        maxSurge: 100%
                        maxUnavailable: 25%
                  name: istio-egressgateway
          addonComponents:
            kiali:
              enabled: true
              k8s:
                service:
                  type: LoadBalancer
                service_annotations:
                  cloud.google.com/load-balancer-type: Internal
    
  3. Extracted the load balancer IP from OPS-Cluster and substituted it for remotePilotAddress

  4. Installed Istio on Data-Cluster using below config

    apiVersion: install.istio.io/v1alpha1
    kind: IstioOperator
    spec:
      profile: remote
      tag: 1.6.8
      meshConfig:
        accessLogFile: "/dev/stdout"
        accessLogEncoding: JSON
      values:
        global:
          # The remote cluster's name and network name must match the values specified in the
          # mesh network configuration of the main cluster.
          multiCluster:
            clusterName: data
          network: mesh-network
          # Replace ISTIOD_REMOTE_EP with the value of ISTIOD_REMOTE_EP set earlier.
          remotePilotAddress: xx.xx.xx.xx
          remotePilotCreateSvcEndpoint: true
          controlPlaneSecurityEnabled: true
          proxy:
            resources:
              limits:
                memory: "128Mi"
                cpu: "500m"
              requests:
                memory: "64Mi"
                cpu: "100m"
      ## The istio-ingressgateway is not required in the remote cluster if both clusters are on
      ## the same network. To disable the istio-ingressgateway component, uncomment the lines below.
      #
      components:
        ingressGateways:
        - name: istio-ingressgateway
          enabled: false
    
  5. Verified the pods health and status and they all look good.

  6. Created gateway for cross cluster visibility using below config

    apiVersion: networking.istio.io/v1alpha3
    kind: Gateway
    metadata:
      name: cluster-aware-gateway
      namespace: istio-system
    spec:
      selector:
        istio: ingressgateway
      servers:
      - port:
          number: 443
          name: tls
          protocol: TLS
        tls:
          mode: AUTO_PASSTHROUGH
        hosts:
        - "*.local"
    
  7. Created a remote secret using istioctl x create-remote-secret for cross-cluster communication

  8. Validated the installation by creating helloworld pods in the OPS and Data clusters and testing cross-cluster communication
    • Received responses alternately from Helloworld V1 and V2

  9. Verified the endpoints with istioctl proxy-config endpoints; they look healthy
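
Steps 3, 7 and 9 are listed above without their commands. A rough sketch of what they could look like, modelled on the Istio 1.6 multicluster guide. The function names are made up for illustration, and the kubectl context names and pod name passed as arguments are placeholders, not values from our setup:

```shell
# Step 3: read the ingress gateway's load balancer IP from the main cluster.
# $1 = kubectl context of OPS-Cluster (placeholder)
ops_ingress_ip() {
  kubectl get svc istio-ingressgateway -n istio-system --context="$1" \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}

# Step 7: create a remote secret for Data-Cluster and apply it to OPS-Cluster.
# $1 = Data-Cluster context, $2 = OPS-Cluster context (placeholders)
# "data" matches multiCluster.clusterName in the Data-Cluster config.
register_data_cluster() {
  istioctl x create-remote-secret --name data --context="$1" |
    kubectl apply -f - --context="$2"
}

# Step 9: list the endpoints known to a pod's sidecar.
# $1 = pod name (placeholder)
show_endpoints() {
  istioctl proxy-config endpoints "$1"
}
```

For example, `ops_ingress_ip my-ops-context` would print the address that goes into remotePilotAddress, assuming the ingress gateway service has already been assigned a load balancer IP.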

Configure https egress:

Trying to route the traffic through the egress gateway using the network configs below

  1. Added Service Entry

    apiVersion: networking.istio.io/v1alpha3
    kind: ServiceEntry
    metadata:
      name: cnn
    spec:
      hosts:
      - edition.cnn.com
      ports:
      - number: 443
        name: tls
        protocol: TLS
      resolution: DNS
    
  2. Tested the route out through istio-proxy

    kubectl exec -it $SOURCE_POD -c sleep -- curl -sL -o /dev/null -D - https://edition.cnn.com/politics
    Output: HTTP/2 200

    Observed the proxy logs

  3. Added Gateway, Destination Rules and Virtual Services

    apiVersion: networking.istio.io/v1alpha3
    kind: Gateway
    metadata:
      name: istio-egressgateway
    spec:
      selector:
        istio: egressgateway
      servers:
      - port:
          number: 443
          name: tls
          protocol: TLS
        hosts:
        - edition.cnn.com
        tls:
          mode: PASSTHROUGH
    ---
    apiVersion: networking.istio.io/v1alpha3
    kind: DestinationRule
    metadata:
      name: egressgateway-for-cnn
    spec:
      host: istio-egressgateway.istio-system.svc.cluster.local
      subsets:
      - name: cnn
    ---
    apiVersion: networking.istio.io/v1alpha3
    kind: VirtualService
    metadata:
      name: direct-cnn-through-egress-gateway
    spec:
      hosts:
      - edition.cnn.com
      gateways:
      - mesh
      - istio-egressgateway
      tls:
      - match:
        - gateways:
          - mesh
          port: 443
          sniHosts:
          - edition.cnn.com
        route:
        - destination:
            host: istio-egressgateway.istio-system.svc.cluster.local
            subset: cnn
            port:
              number: 443
      - match:
        - gateways:
          - istio-egressgateway
          port: 443
          sniHosts:
          - edition.cnn.com
        route:
        - destination:
            host: edition.cnn.com
            port:
              number: 443
          weight: 100
    
  4. Tested route out through the gateway
    kubectl exec -it $SOURCE_POD -c sleep -- curl -v -sL -o /dev/null -D - https://edition.cnn.com/politics
    Output: Connection closed

    kubectl exec -it $SOURCE_POD -c sleep -- curl -svL -o /dev/null -D - https://edition.cnn.com/politics
    *   Trying 151.101.193.67:443...
    * Connected to edition.cnn.com (151.101.193.67) port 443 (#0)
    * ALPN, offering h2
    * ALPN, offering http/1.1
    * successfully set certificate verify locations:
    *   CAfile: /etc/ssl/certs/ca-certificates.crt
      CApath: none
    } [5 bytes data]
    * TLSv1.3 (OUT), TLS handshake, Client hello (1):
    } [512 bytes data]
    * OpenSSL SSL_connect: Connection reset by peer in connection to edition.cnn.com:443
    * Closing connection 0
    command terminated with exit code 35
    

What’s expected?
Expected the traffic to route out via the egress gateway, but there is no log entry in the egress gateway
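
For reference, a sketch of how the egress gateway log can be checked for a given host. The function name is made up for illustration and the --tail value is an arbitrary choice:

```shell
# Print egress gateway access-log lines that mention the given host.
# $1 = host to look for, e.g. edition.cnn.com
# With a label selector, kubectl logs returns only the last 10 lines by
# default, so --tail is set explicitly.
egress_log_for() {
  kubectl logs -l istio=egressgateway -c istio-proxy -n istio-system --tail=200 |
    grep -F "$1"
}
```

Running `egress_log_for edition.cnn.com` right after the curl test should print one access-log line per connection that passed through the gateway; no output would mean no such line is in the recent log.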

What is happening?
Seeing an error in the response (connection reset by peer during the TLS handshake)

What troubleshooting we did?

Troubleshoot step 1

istioctl proxy-config listeners $SOURCE_POD --address 0.0.0.0 --port 443 -o json

{
                "filterChainMatch": {
                    "serverNames": [
                        "edition.cnn.com"
                    ]
                },
                "filters": [
                    {
                        "name": "istio.stats",
                        "typedConfig": {
                            "@type": "type.googleapis.com/udpa.type.v1.TypedStruct",
                            "typeUrl": "type.googleapis.com/envoy.extensions.filters.network.wasm.v3.Wasm",
                            "value": {
                                "config": {
                                    "configuration": "{\n  \"debug\": \"false\",\n  \"stat_prefix\": \"istio\",\n  \"metrics\": [\n    {\n      \"dimensions\": {\n        \"source_cluster\": \"node.metadata['CLUSTER_ID']\",\n        \"destination_cluster\": \"upstream_peer.cluster_id\"\n      }\n    }\n  ]\n}\n",
                                    "root_id": "stats_outbound",
                                    "vm_config": {
                                        "code": {
                                            "local": {
                                                "inline_string": "envoy.wasm.stats"
                                            }
                                        },
                                        "runtime": "envoy.wasm.runtime.null",
                                        "vm_id": "tcp_stats_outbound"
                                    }
                                }
                            }
                        }
                    },
                    {
                        "name": "envoy.tcp_proxy",
                        "typedConfig": {
                            "@type": "type.googleapis.com/envoy.config.filter.network.tcp_proxy.v2.TcpProxy",
                            "statPrefix": "outbound|443|cnn|istio-egressgateway.istio-system.svc.cluster.local",
                            "cluster": "outbound|443|cnn|istio-egressgateway.istio-system.svc.cluster.local",
                            "accessLog": [
                                {
                                    "name": "envoy.file_access_log",
                                    "typedConfig": {
                                        "@type": "type.googleapis.com/envoy.config.accesslog.v2.FileAccessLog",
                                        "path": "/dev/stdout",
                                        "jsonFormat": {
                                            "authority": "%REQ(:AUTHORITY)%",
                                            "bytes_received": "%BYTES_RECEIVED%",
                                            "bytes_sent": "%BYTES_SENT%",
                                            "downstream_local_address": "%DOWNSTREAM_LOCAL_ADDRESS%",
                                            "downstream_remote_address": "%DOWNSTREAM_REMOTE_ADDRESS%",
                                            "duration": "%DURATION%",
                                            "istio_policy_status": "%DYNAMIC_METADATA(istio.mixer:status)%",
                                            "method": "%REQ(:METHOD)%",
                                            "path": "%REQ(X-ENVOY-ORIGINAL-PATH?:PATH)%",
                                            "protocol": "%PROTOCOL%",
                                            "request_id": "%REQ(X-REQUEST-ID)%",
                                            "requested_server_name": "%REQUESTED_SERVER_NAME%",
                                            "response_code": "%RESPONSE_CODE%",
                                            "response_flags": "%RESPONSE_FLAGS%",
                                            "route_name": "%ROUTE_NAME%",
                                            "start_time": "%START_TIME%",
                                            "upstream_cluster": "%UPSTREAM_CLUSTER%",
                                            "upstream_host": "%UPSTREAM_HOST%",
                                            "upstream_local_address": "%UPSTREAM_LOCAL_ADDRESS%",
                                            "upstream_service_time": "%RESP(X-ENVOY-UPSTREAM-SERVICE-TIME)%",
                                            "upstream_transport_failure_reason": "%UPSTREAM_TRANSPORT_FAILURE_REASON%",
                                            "user_agent": "%REQ(USER-AGENT)%",
                                            "x_forwarded_for": "%REQ(X-FORWARDED-FOR)%"
                                        }
                                    }
                                }
                            ]
                        }
                    }
                ],
                "metadata": {
                    "filterMetadata": {
                        "istio": {
                            "config": "/apis/networking.istio.io/v1alpha3/namespaces/default/virtual-service/direct-cnn-through-egress-gateway"
                        }
                    }
                }
            },

The “cluster” value in the tcp_proxy filter points to istio-egressgateway, as expected.

Troubleshoot step 2

istioctl proxy-config endpoints $SOURCE_POD --cluster "outbound|443|cnn|istio-egressgateway.istio-system.svc.cluster.local"
Output: the endpoints seem to be healthy
ENDPOINT             STATUS      OUTLIER CHECK     CLUSTER
10.174.32.7:8443     HEALTHY     OK                outbound|443|cnn|istio-egressgateway.istio-system.svc.cluster.local
10.174.33.6:8443     HEALTHY     OK                outbound|443|cnn|istio-egressgateway.istio-system.svc.cluster.local

Troubleshoot step 3

kubectl exec -i -n istio-system $(kubectl get pod -l istio=egressgateway -n istio-system -o jsonpath='{.items[0].metadata.name}')  -- cat /etc/certs/cert-chain.pem | openssl x509 -text -noout  | grep 'Subject Alternative Name' -A 1
cat: /etc/certs/cert-chain.pem: No such file or directory
command terminated with exit code 1
unable to load certificate
4421180864:error:09FFF06C:PEM routines:CRYPTO_internal:no start line:pem/pem_lib.c:694:Expecting: TRUSTED CERTIFICATE

Troubleshoot step 4

kubectl exec -it $SOURCE_POD -c sleep -- openssl s_client -connect edition.cnn.com:443 -servername edition.cnn.com
CONNECTED(00000003)
write:errno=104
---
no peer certificate available
---
No client certificate CA names sent
---
SSL handshake has read 0 bytes and written 317 bytes
Verification: OK
---
New, (NONE), Cipher is (NONE)
Secure Renegotiation IS NOT supported
No ALPN negotiated
Early data was not sent
Verify return code: 0 (ok)
---
command terminated with exit code 1

Troubleshoot step 5

kubectl exec $(kubectl get pod -l istio=egressgateway -n istio-system -o jsonpath='{.items[0].metadata.name}') -c istio-proxy -n istio-system -- pilot-agent request GET stats | grep edition.cnn.com.upstream_cx_total

Not sure where I went wrong, I have used the below documentation,


Can someone please shed some light on this issue?