Service health check exposed for AWS NLB

Hi,

We’re investigating using Istio on AWS EKS and have a question regarding exposing the health of a service exposed outside of the cluster using an Ingress Gateway with a Network Load Balancer (NLB). At the moment, the health of the NLB would reflect the health of the envoy proxies (ingress proxies). What we need is the health of the NLB to reflect the health of the service behind all the layers (e.g. via NLB -> node_IP:NodePort -> envoy ingress -> Istio config -> service X pods readinessProbe). Is this possible?

The reason we think this is important is that several other AWS services use the health of elastic load balancers to infer the health of the endpoint, e.g. Route53 or Global Accelerator.

We’ve found some old issues about something that sounds similar, but unfortunately they never include all the details, e.g. https://github.com/istio/istio/issues/9385 and https://github.com/istio/istio/issues/12503.

We currently use other ingress solutions (using either ALB Ingress Controller or NLB) but they get traffic straight into the service without Istio, meaning we won’t be able to do all the nice things the Istio way, like canary deploys or A/B testing.

FYI we’re on k8s 1.17 and Istio 1.7.1.

Any thoughts/help is appreciated.

What kind of health check are you using? If it’s HTTP, there is no reason why you can’t route a request (e.g. example.foo/healthz) to your pod to respond to. I wouldn’t advise the setup you are trying to achieve, though: an interruption in your application could cause you to lose access to the ingress, and it does not allow you to handle failures more gracefully.
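
As a rough sketch of that suggestion, a VirtualService could route a dedicated health path through the gateway to the application. The host example.foo and the path /healthz come from the post above; the gateway name my-gateway and the service my-app.default.svc.cluster.local are placeholders, not names from this thread.

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: my-app-health          # hypothetical name
spec:
  hosts:
    - example.foo
  gateways:
    - my-gateway               # hypothetical Gateway resource
  http:
    - match:
        - uri:
            exact: /healthz    # only the health path is matched by this rule
      route:
        - destination:
            host: my-app.default.svc.cluster.local   # hypothetical service
            port:
              number: 80
```

For this route to match, the load balancer health check has to send a Host header that matches the hosts list, or the hosts have to be relaxed to '*'.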

I thought about the same when I first saw the AWS ELB default health check created by Istio. But then I realized… we have several listeners and targets for that ELB (one per service app)… Which one do you decide to be the one for the health check of the Load Balancer?
I guess it is different with an NLB, since you can configure one HC per target group (per service app).

Does Istio provide these HC capabilities instead?

We’re using the health check that’s automatically set up by the k8s in-tree cloud service controller. If the service has externalTrafficPolicy: Cluster then it’s a TCP health check on the designated NodePort. If you change it to externalTrafficPolicy: Local then k8s will create an HTTP health check against a special port, which will be healthy if one of the pods is running on that node. These health checks, as far as I’m aware, are not configurable (especially when using NLB; the older CLB has a few more knobs but the NLB does not have feature parity).
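
To make the two modes concrete, here is a minimal sketch of a LoadBalancer Service for a gateway. The name my-ingressgateway and its selector label are placeholders; only the annotation and the externalTrafficPolicy field are taken from this discussion.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-ingressgateway      # hypothetical name
  namespace: istio-system
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: 'nlb'
spec:
  type: LoadBalancer
  # Local:   the NLB health check is HTTP against the auto-assigned
  #          .spec.healthCheckNodePort; a node passes only if a gateway
  #          pod is running on it.
  # Cluster: the NLB health check is a plain TCP check on the NodePort.
  externalTrafficPolicy: Local
  selector:
    app: my-ingressgateway     # hypothetical label
  ports:
    - name: http
      port: 80
      targetPort: 8080
```

Either way, the check only reaches the node or the gateway pod, never the service behind it.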

The problem here is, no matter which externalTrafficPolicy value you use, you’re only reflecting the health of the ingress gateway. Here’s a snippet from the Global Accelerator docs:

For Application Load Balancer or Network Load Balancer endpoints, you configure health checks for the resources by using Elastic Load Balancing configuration options. … Health check options that you choose in Global Accelerator do not affect Application Load Balancers or Network Load Balancers that you’ve added as endpoints.

On a side note, it looks like the ALB ingress controller 2.0 is going to add the ability to wire pod IPs straight into ELB target groups for NLBs (just as it does now with ALBs). So that’s nice: the NLB could go direct to the pods (discovered via service endpoints).

Yeah, we came to the conclusion that we’ll have to use a separate Istio ingress gateway + NLB for each traffic source. At first we thought we’d be able to use one ingress gateway as a DaemonSet, fronted by a single NLB, but then ran into the health check problem.

This is also from the Global Accelerator docs:

When you have an Application Load Balancer or Network Load Balancer that includes multiple target groups, Global Accelerator considers the load balancer endpoint to be healthy only if each target group behind the load balancer has at least one healthy target. If any single target group for the load balancer has only unhealthy targets, Global Accelerator considers the endpoint to be unhealthy.

So we definitely cannot share the same NLB between multiple traffic sources.

The nicest option would be a single ingress gateway (DaemonSet) + NLB, with custom health checks at the GA end that, say, pass an HTTP request through with Host: foo.example.com to test whether there’s ultimately a foo running in the cluster, behind all the layers.

It is a combination of layers that you should be using. Starting with the application pod (assuming Kubernetes), you should be using Kubernetes liveness and readiness probes. Any pod that fails its readiness probe will not receive traffic (it will be removed from the LB pool), so you are only left with “healthy” pods behind your ingress controller. For cases where pods are up but returning 5xx errors, you should use Istio outlier detection: https://istio.io/latest/docs/reference/config/networking/destination-rule/#OutlierDetection. And finally, you should be pointing your AWS ELB health checks at your ingress controller’s /healthz/ready.
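
For the outlier detection layer, a DestinationRule along these lines is one possibility. The host my-app.default.svc.cluster.local is a placeholder and the thresholds are illustrative values, not recommendations from this thread.

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: my-app-outlier         # hypothetical name
spec:
  host: my-app.default.svc.cluster.local   # hypothetical service
  trafficPolicy:
    outlierDetection:
      consecutive5xxErrors: 5  # eject a pod after 5 consecutive 5xx responses
      interval: 10s            # how often hosts are evaluated
      baseEjectionTime: 30s    # minimum time an ejected pod stays out
      maxEjectionPercent: 50   # never eject more than half of the pods
```

Note that this only removes misbehaving pods from the mesh's load-balancing pool; it does not change what the NLB health check reports.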

Interesting. Thanks for sharing.
We are currently using the AWS ALB ingress controller, and one of the reasons why we are evaluating Istio is to be able to have just one LB as the ingress controller, instead of having one LB per service.

Thanks for the info.
Are you recommending to change the externalTrafficPolicy to Local to have that HTTP HC instead of just the TCP port check?

Typically yes, because the application has more control over what makes it “healthy”. I don’t know if Istio gateways use it beyond just sending a 200 status, though.

If you use helm to deploy the ALB ingress controller, the chart is moving from the official (deprecated) incubating repo to the EKS repo. It’s also renamed (and I’m guessing the project will be too) to AWS Load Balancer Controller. Here’s the new chart. Note that it’ll deploy the 2.0.0 version, which is only RC at time of writing and not production ready.

The new version looks pretty promising: it’s going to support NLBs via IP targeting (just like it does with ALBs now), shared ALBs (so across namespaces), and also a new CRD to support adding things to existing target groups (created outside of k8s, e.g. CloudFormation, Terraform, CDK, etc).
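
For reference, in the controller's released documentation that CRD is called TargetGroupBinding. The sketch below uses the v1beta1 API group from the released docs, which is an assumption here since the release candidate may use a different version; the resource name and service name are placeholders, and the ARN is deliberately left unfilled.

```yaml
# Assumption: API version as in the released AWS Load Balancer Controller docs;
# check the CRDs installed by your controller version.
apiVersion: elbv2.k8s.aws/v1beta1
kind: TargetGroupBinding
metadata:
  name: my-tgb                 # hypothetical name
spec:
  serviceRef:
    name: my-service           # hypothetical Service whose endpoints are registered
    port: 80
  targetGroupARN: '<arn-of-existing-target-group>'
```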

The main reason for us to use externalTrafficPolicy: Local is source IP preservation. From the k8s service docs:

Unlike Classic Elastic Load Balancers, Network Load Balancers (NLBs) forward the client’s IP address through to the node. If a Service’s .spec.externalTrafficPolicy is set to Cluster, the client’s IP address is not propagated to the end Pods.

By setting .spec.externalTrafficPolicy to Local, the client IP addresses is propagated to the end Pods, but this could result in uneven distribution of traffic. Nodes without any Pods for a particular LoadBalancer Service will fail the NLB Target Group’s health check on the auto-assigned .spec.healthCheckNodePort and not receive any traffic.

In order to achieve even traffic, either use a DaemonSet or specify a pod anti-affinity to not locate on the same node.
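
As a sketch of the anti-affinity option mentioned in that quote, a Deployment can ask the scheduler to keep its replicas on different nodes. All names and the image below are placeholders.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-ingressgateway      # hypothetical name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-ingressgateway
  template:
    metadata:
      labels:
        app: my-ingressgateway
    spec:
      affinity:
        podAntiAffinity:
          # Refuse to schedule two pods with this label on the same node.
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: my-ingressgateway
              topologyKey: kubernetes.io/hostname
      containers:
        - name: proxy
          image: example/proxy:latest   # placeholder image
```

With the required form, replicas beyond the number of schedulable nodes stay Pending; preferredDuringSchedulingIgnoredDuringExecution is the softer alternative.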

Unfortunately, the built-in service controller for AWS that creates the NLB does not currently support tweaking the health check, and it is not very aggressive (I think 3 x 30s to become healthy/unhealthy). I think I saw a change coming in 1.18 (should be hitting EKS this month) that makes the hard-coded health check the most aggressive it can be (2 x 10s IIRC).

This. This is essentially the crux of our problem. If I have an Istio ingress gateway that represents a single traffic source (e.g. foo.example.com), and that gateway (either a Deployment or a DaemonSet) is fronted by an AWS NLB, how do I get the gateway to represent the health of foo, rather than just effectively saying “yes, there are ingress gateway pods running”?

Here is a full example config:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: foo
spec:
  selector:
    matchLabels:
      app: foo
  template:
    metadata:
      labels:
        app: foo
    spec:
      containers:
        - name: foo
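          # Note: the stock nginx image listens on port 80; this example
          # assumes nginx has been configured to listen on 8080.
          image: nginx:1.18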
          ports:
            - containerPort: 8080
              name: http
          readinessProbe:
            httpGet:
              path: /
              port: http
---
apiVersion: v1
kind: Service
metadata:
  name: foo
spec:
  selector:
    app: foo
  ports:
    - name: http
      port: 80
      targetPort: http
---
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  namespace: istio-system
  name: istiocontrolplane
spec:
  components:
    ingressGateways:
      - enabled: false
        name: istio-ingressgateway
      - enabled: true
        k8s:
          service:
            externalTrafficPolicy: Local
            ports:
              - name: http
                port: 80
                targetPort: 8080
              - name: https
                port: 443
                targetPort: 8080
          serviceAnnotations:
            service.beta.kubernetes.io/aws-load-balancer-type: 'nlb'
            service.beta.kubernetes.io/aws-load-balancer-ssl-cert: 'arn:aws:acm:<region>:<account_id>:certificate/<xyz>'
            service.beta.kubernetes.io/aws-load-balancer-ssl-ports: 'https'
        name: foo-ingressgateway
  profile: default
  values:
    global:
      proxy:
        holdApplicationUntilProxyStarts: true
---
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: foo
spec:
  selector:
    service.istio.io/canonical-name: foo-ingressgateway
  servers:
    - port:
        number: 80
        name: http
        protocol: HTTP
      hosts:
        - 'foo.example.com'
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: foo
spec:
  hosts:
    - foo.example.com
    - foo.default.svc.cluster.local
  gateways:
    - foo
    - mesh
  http:
    - route:
        - destination:
            host: foo.default.svc.cluster.local
            port:
              number: 80

When I deploy all this, I end up with an NLB that fronts the foo-ingressgateway pod(s), with an HTTP health check that is healthy if there’s a foo-ingressgateway pod running on the node. Note: the NLB is performing TLS termination.

If I scale the foo deployment down to 0, the NLB will still report healthy, despite there being nothing to service requests going through the “chain” to foo.example.com.

OK, I think I’ve found a solution. Using the same Deployment and Service from above but with the following altered Istio resources:

apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  namespace: istio-system
  name: istiocontrolplane
spec:
  components:
    ingressGateways:
      - enabled: false
        name: istio-ingressgateway
      - enabled: true
        k8s:
          service:
            ports:
              - name: http
                port: 80
                targetPort: 8080
              - name: https
                port: 443
                targetPort: 8080
            # We don't need a type LoadBalancer or even NodePort when using 
            # default networking in EKS. Pod IPs are reachable from NLBs.
            type: ClusterIP
          serviceAnnotations:
            # You must be very careful here, read the NLB docs about health checks for valid values!
            service.beta.kubernetes.io/aws-load-balancer-healthcheck-healthy-threshold: '2'
            service.beta.kubernetes.io/aws-load-balancer-healthcheck-interval: '10'
            service.beta.kubernetes.io/aws-load-balancer-healthcheck-path: '/some_static_endpoint_of_foo'
            service.beta.kubernetes.io/aws-load-balancer-healthcheck-protocol: 'HTTP'
            service.beta.kubernetes.io/aws-load-balancer-healthcheck-timeout: '6'
            service.beta.kubernetes.io/aws-load-balancer-healthcheck-unhealthy-threshold: '2'
            service.beta.kubernetes.io/aws-load-balancer-ssl-cert: 'arn:aws:acm:<region>:<account_id>:certificate/<xyz>'
            service.beta.kubernetes.io/aws-load-balancer-ssl-ports: 'https'
            service.beta.kubernetes.io/aws-load-balancer-type: 'nlb-ip'
        label:
          # This label is important, otherwise if you have more than one of these the traffic will get crossed!
          app: foo-ingressgateway
        name: foo-ingressgateway
  profile: default
  values:
    global:
      proxy:
        holdApplicationUntilProxyStarts: true
---
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: foo
spec:
  selector:
    service.istio.io/canonical-name: foo-ingressgateway
  servers:
    - port:
        number: 80
        name: http
        protocol: HTTP
      hosts:
        # This is key. Do not expect Host headers.
        - '*'
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: foo
spec:
  hosts:
    # If you need to get to foo from within the k8s cluster, then probably use a different virtual service name.
    - '*'
  gateways:
    - foo
  http:
    - route:
        - destination:
            host: foo.default.svc.cluster.local
            port:
              number: 80

Using this config along with the AWS Load Balancer Controller (aka ALB Ingress Controller) v2.0.0 (currently release candidate), you will get:

  • an NLB with one target group (NLB is performing TLS termination).
  • The target group will be populated by IPs of the foo-ingressgateway pods running in the istio-system namespace (the foo Service Endpoints).
  • The target group will be performing an HTTP health check which will get passed through to foo by the envoy proxy.
  • If there are no foos, then you’ll get a 503 and the NLB will become unhealthy.
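
The comment in the VirtualService above suggests using a separate resource if foo also needs to be reachable from inside the cluster. A minimal sketch of what that might look like, assuming the same foo Service in the default namespace; the name foo-mesh is made up for illustration.

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: foo-mesh               # hypothetical name, distinct from the gateway-facing "foo"
spec:
  hosts:
    - foo.default.svc.cluster.local
  gateways:
    - mesh                     # reserved name: applies to sidecars in the mesh only
  http:
    - route:
        - destination:
            host: foo.default.svc.cluster.local
            port:
              number: 80
```

Keeping the two apart means the wildcard host on the gateway-facing VirtualService never applies to sidecar traffic.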