Custom ingress gateway balanced with NLB

Hello,

I’m running an Istio 1.7.6 installation (with istio-operator) and recently needed to provision a custom ingress gateway, balanced by an NLB in AWS. I added a new “ingressGateways” block to my IstioOperator CR, which worked fine: a new NLB was provisioned in AWS.
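For context, this is roughly what the new block looks like (a sketch; the gateway name is a placeholder, and the NLB is requested through the usual in-tree service annotation):

    apiVersion: install.istio.io/v1alpha1
    kind: IstioOperator
    spec:
      components:
        ingressGateways:
        - name: custom-ingressgateway   # placeholder name
          enabled: true
          k8s:
            serviceAnnotations:
              # in-tree cloud provider annotation that provisions an NLB
              service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
            service:
              type: LoadBalancer
              ports:
              - name: status-port
                port: 15021
                targetPort: 15021
              - name: tcp
                port: 9000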

The issue is: if I need to add a new port (which in fact means a new listener on the NLB), there is downtime on every listener already configured there. So if I have this in my IstioOperator:

    service:
      ports:
      - name: status-port
        port: 15021
        targetPort: 15021
      - name: tcp
        port: 9000

and decide to add port 9001:

    service:
      ports:
      - name: status-port
        port: 15021
        targetPort: 15021
      - name: tcp
        port: 9000
      - name: tcp-testing
        port: 9001

The listener on port 9000 is affected whenever any other listener is added. Directly in the AWS console, I can verify that every listener registers a new target group at that moment, which is why I have downtime. But it doesn’t make sense that a new target group registration is triggered, since there were no changes to that listener. In fact, if I edit my “kind: Service” and manually add a new port, a new listener appears on the NLB without affecting any other listeners.
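To be clear about the manual edit, this is all I’m doing (a sketch; the Service name and namespace are placeholders for my setup):

    # kubectl edit svc <custom-gateway-service> -n istio-system
    spec:
      type: LoadBalancer
      ports:
      - name: status-port
        port: 15021
        targetPort: 15021
      - name: tcp
        port: 9000
      # added by hand: a new listener shows up on the NLB,
      # while the existing listeners keep serving traffic
      - name: tcp-testing
        port: 9001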

Any clues?

Meanwhile, I realized that the nodePort for every listener changes whenever the Service is updated (via istio-operator), and that explains why the NLB is registering new target groups. Example:

    test-tcp-server-stack-network LoadBalancer 172.20.133.5 a976c5f81987243f2b3f98b1562999a0c0-0080ed2c8b99a281.elb.eu-west-1.amazonaws.com 15021:32296/TCP,9000:32024/TCP,9001:32025/TCP 1h

After removing port 9001 from the IstioOperator:

    test-tcp-server-stack-network LoadBalancer 172.20.133.5 a976c5f81987243f2b3f98b1562999a0c0-0080ed2c8b99a281.elb.eu-west-1.amazonaws.com 15021:32661/TCP,9000:32474/TCP 1h

I’ve tested this with a Classic Load Balancer and could not reproduce the issue (in fact, the notion of a “Target Group” is not applicable to a Classic Load Balancer), so I’m not sure whether the problem lies in the ELB type.

Have you looked at using the out-of-tree aws-load-balancer-controller for NLB IP targets?
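If I remember right, once the controller is installed you opt in per Service with an annotation, something like this (annotation name is from the v2.x docs; the Service name is just a placeholder):

    apiVersion: v1
    kind: Service
    metadata:
      name: custom-ingressgateway   # your gateway Service
      annotations:
        # tells aws-load-balancer-controller (v2.x) to provision an NLB
        # with IP targets instead of the in-tree instance-target NLB
        service.beta.kubernetes.io/aws-load-balancer-type: "nlb-ip"
    spec:
      type: LoadBalancer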

Hi

No, I’m not using an IP-target-based NLB, just instance targets. I can’t test it right now since my EKS cluster is on version 1.16, which is not compatible with NLB IP targets.

So far my workaround is to manually set a nodePort in my custom ingress gateway, like:

    service:
      ports:
      - name: tcp
        port: 9000
        nodePort: 30000
      - name: tcp-testing
        port: 9001
        nodePort: 30001


Not exactly elegant or practical, but at least no target group re-registration is triggered, since the nodePorts remain the same.

@stevehipwell I’ve just tested the approach you mentioned. Enabling nlb-ip does indeed solve the nodePort problem. However, it seems a new issue is introduced: I configured the AWS Load Balancer Controller to use readiness gates, and when I add a new port to my NLB custom ingress gateway, istio-operator redeploys the Deployment and then the Service, in exactly that order. That’s a problem because the AWS LB Controller is not yet aware of the new readiness gate, so it never gets “injected” into the Deployment’s pods; in the end the Deployment shows readiness gates 1/1 when it should show 2/2.

Redeploying the Service before the Deployment would theoretically fix this, because the AWS LB Controller would learn about the change in time to reflect it in the Deployment.
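For anyone following along: as far as I understand it, readiness gate injection in the v2.x controller happens in a mutating webhook at pod creation time, keyed off a namespace label, which is why the ordering matters: pods rolled out before the controller knows about the new target group never get the extra gate. The opt-in label looks like this (a sketch; the namespace name is a placeholder):

    apiVersion: v1
    kind: Namespace
    metadata:
      name: istio-system   # or wherever the custom gateway runs
      labels:
        # opts this namespace in to pod readiness gate injection
        elbv2.k8s.aws/pod-readiness-gate-inject: enabled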