Istio Debug AuthorizationPolicy ejabberd - no client connection

Dear friends,

I run Istio v1.19.3, deployed with Helm charts in a Kubernetes cluster. istiod and istio-gateway are installed with their default configurations. At the global (mesh) level, the following settings are applied:

apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: allow-nothing
  namespace: istio-system
spec:
  action: ALLOW
  # To disable this allow-nothing behavior (i.e. allow all traffic again), uncomment the rules section below.
  # rules:
  # - {}
---
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: "default"
  namespace: istio-system
spec:
  mtls:
    mode: STRICT
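
For context, my understanding from the Istio docs is that an ALLOW policy with no rules matches nothing, so placing it in the root namespace istio-system denies all traffic mesh-wide unless another ALLOW policy matches. To double-check what istiod sees, I run (with istioctl matching the control-plane version):

kubectl get authorizationpolicies,peerauthentications -A
istioctl analyze --all-namespaces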

I have a Gateway (and matching VirtualService) pointing to ejabberd:

apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: ejabberdhelmgateway
  namespace: istio-system
spec:
  selector:
    istio: gateway # use istio default ingress gateway
  servers:
  - port:
      number: 5222
      name: xmpp-c2s
      protocol: TCP
    hosts:
    - "*"
  - port:
      number: 443
      name: xmpp-c2s-tls
      protocol: TLS
    tls:
      #mode: PASSTHROUGH
      mode: SIMPLE
      credentialName: wildcard-example-tls # must be the same as secret
    hosts:
      - "example.com"
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: ejabberd-helmgateway-virtual-service
  namespace: istio-system
spec:
  hosts:
    - "example.com"
  gateways:
  - ejabberdhelmgateway
  tcp:
  - match:
    - port: 5222
    route:
    - destination:
        host: ejabberd.ejabberd.svc.cluster.local
        port:
          number: 5222
  - match:
    - port: 443
    route:
    - destination:
        host: ejabberd.ejabberd.svc.cluster.local
        port:
          number: 5222
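
To verify that the gateway actually programs listeners for these ports, I inspect the Envoy config of the ingress pod (a sketch; the pod lookup assumes the default istio=gateway label):

GW_POD=$(kubectl -n istio-system get pod -l istio=gateway -o jsonpath='{.items[0].metadata.name}')
istioctl proxy-config listener "$GW_POD" -n istio-system --port 443
istioctl proxy-config listener "$GW_POD" -n istio-system --port 5222
istioctl proxy-config cluster "$GW_POD" -n istio-system --fqdn ejabberd.ejabberd.svc.cluster.local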

And the following authorization policies:

---
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: ejabberd-ingress-istio
  namespace: istio-system
spec:
  action: ALLOW
  rules:
  - to:
    - operation:
        hosts:
        - "example.com"
    - operation:
        ports:
        - "5222"
  selector:
    matchLabels:
      app: gateway
      istio: gateway
---
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: ejabberd-ingress-allow
  namespace: ejabberd
spec:
  action: ALLOW
  rules:
  - from:
    - source:
        principals:
        - cluster.local/ns/istio-system/sa/gateway
    to:
    - operation:
        hosts:
        - "example.com"
    - operation:
        ports:
        - "5222"
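
One variant I sketched but have not tested yet: according to the AuthorizationPolicy reference, operation.hosts is an HTTP-only field, while host matching for TLS traffic is done with a connection.sni condition, so a ports-plus-SNI version of the gateway policy would look roughly like this:

apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: ejabberd-ingress-istio-sni   # untested sketch, not applied
  namespace: istio-system
spec:
  selector:
    matchLabels:
      app: gateway
      istio: gateway
  action: ALLOW
  rules:
  # TLS listener: match on port and SNI
  - to:
    - operation:
        ports: ["443"]   # must match the port Envoy listens on in the gateway pod (may be remapped, e.g. to 8443)
    when:
    - key: connection.sni
      values: ["example.com"]
  # plain-TCP listener: match on port only (no SNI available here)
  - to:
    - operation:
        ports: ["5222"]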

Challenge 1: Enable pod-to-pod clustering.

Clustering happens through direct connections addressed by the ejabberd pod name plus the headless service name, e.g.:
ejabberdctl join_cluster ejabberd@ejabberd-0.ejabberd.ejabberd.svc.cluster.local

The feedback I got from the ejabberd developers is that only TCP port 5210 needs to be opened/allowed to enable clustering.

Therefore, I also tried it with the following podAnnotations:

traffic.sidecar.istio.io/excludeInboundPorts: "5210"
traffic.sidecar.istio.io/excludeOutboundPorts: "5210"
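
To double-check that the annotations were actually picked up by the injector, I looked at the pod metadata and the istio-init output (a sketch; container names come from the default injection template):

kubectl -n ejabberd get pod ejabberd-0 -o jsonpath='{.metadata.annotations}'
# istio-init logs the iptables configuration it applied, including excluded ports
kubectl -n ejabberd logs ejabberd-0 -c istio-init | grep -i exclude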

But I still got the following error when joining the cluster:

Error: error
Error: {no_ping,'ejabberd@ejabberd-0.ejabberd-headless.ejabberd.svc.cluster.local'}

So I ended up with this configuration:

apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: ejabberd-allow-ejabberd
  namespace: ejabberd
spec:
  action: ALLOW
  rules:
  - {}
  # rules:
  #   - when:
  #     - key: destination.port
  #       values: ["5210"]
      # - key: source.principal
      #   values: ["cluster.local/ns/ejabberd/sa/ejabberd"]
---
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: "ejabberd-permissive"
  namespace: ejabberd
spec:
  selector:
    matchLabels:
      app: ejabberd
  mtls:
    mode: PERMISSIVE
#  portLevelMtls:
#    "5210":
#      mode: DISABLE

For the PeerAuthentication I also tried portLevelMtls, but that configuration was rejected, so I ended up with PERMISSIVE for the entire workload.

The allow-all ALLOW rule for ejabberd is not optimal either, but it also seems to be required for clustering.
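
To see what actually applies to the ejabberd workload, I have mostly been checking with (ejabberd-0 being the first StatefulSet pod):

istioctl experimental describe pod ejabberd-0 -n ejabberd
kubectl get peerauthentication,authorizationpolicy -n ejabberd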

Challenge 2: External client connecting via public domain to XMPP server

However, my biggest problem is that I cannot connect to the running system with any client anymore once I create the global “allow-nothing” policy. When I disable it, everything works again.

What happens is that clients connect to the gateway and the traffic arrives at ejabberd:

# [info] (<0.795.0>) Accepted connection 127.0.0.6:57623 -> 100.64.25.20:5223
# [debug] Running hook c2s_closed: ejabberd_c2s:process_closed/2
# [debug] Running hook c2s_terminated: mod_pubsub:on_user_offline/2
# [debug] Running hook c2s_terminated: ejabberd_c2s:process_terminated/2

However, the connection is immediately closed. This is only a TCP connection established from the client side; ejabberd does not initiate any connections to the outside on its own, so I would not expect any authorization policy to block it. Still, it does not work.

And with the allow-all ejabberd/ejabberd-allow-ejabberd authorization policy in place, it should not be possible for the traffic to be blocked on the ejabberd side anymore.
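
What I did try in order to debug this was to raise the RBAC log level on the proxies and watch the logs while a client connects (a minimal sketch; the pod lookup assumes the default istio=gateway label):

GW_POD=$(kubectl -n istio-system get pod -l istio=gateway -o jsonpath='{.items[0].metadata.name}')
istioctl proxy-config log "$GW_POD" -n istio-system --level rbac:debug
istioctl proxy-config log ejabberd-0.ejabberd --level rbac:debug
kubectl -n istio-system logs -f "$GW_POD" | grep -i rbac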

So my questions:

I think I have tried every possible scenario; how can I debug this further?
ejabberd can be deployed through this Helm chart with the following values (I enabled sidecar injection at the namespace level; see the check right after the values):

hosts:
  - example.com
certFiles:
  secretName:
    - wildcard-custom-tls
statefulSet:
  replicas: 3
listen:
  c2s:
    enabled: true
    port: 5222
    expose: true
    exposedPort: 5222
    protocol: TCP
    options:
      ip: "::"
      module: ejabberd_c2s
service:
  type: ClusterIP
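
Sidecar injection at the namespace level just means the namespace carries the injection label; this is how I set and verify it:

kubectl label namespace ejabberd istio-injection=enabled
kubectl get namespace ejabberd --show-labels
kubectl get pods -n ejabberd   # pods should show 2/2 READY with the istio-proxy sidecar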

To create an account in ejabberd, just run:

kubectl exec ejabberd-0 -n ejabberd -c ejabberd -- ejabberdctl register user example.com pass
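
And to confirm the account exists:

kubectl exec ejabberd-0 -n ejabberd -c ejabberd -- ejabberdctl registered_users example.com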

Connect to the server with a client that supports, for example, direct TLS on port 443, e.g. Gajim (gajim.org):

JID: user@example.com
Pass: pass
Hostname: either example.com or IP address of the load-balancer
Port: 443 (for directTLS) or 5222 (TCP)
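
To rule out the client itself, the raw connection to the load balancer can also be tested with openssl (assuming example.com resolves to the gateway's public IP):

# direct TLS on 443
openssl s_client -connect example.com:443 -servername example.com
# STARTTLS on 5222
openssl s_client -connect example.com:5222 -starttls xmpp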

Here is another connection example using Beagle IM for macOS.

Any hint is appreciated! Thanks in advance!
Cheers,
Saarko

It looks like I am stuck here:

I encounter the following error/blocker on my gateway pod (public IP address: 51.1x8.1xx.x:443) when I connect with my XMPP client (IP address: 2x1.x06.2x2.2xx:54801):

[2023-11-03T03:38:01.312Z] "- - -" 0 - - rbac_access_denied_matched_policy[none] "-" 145 0 194 - "-" "-" "-" "-" "100.64.25.98:5223" outbound|5223||ejabberd.ejabberd.svc.cluster.local 100.64.23.221:43680 51.1x8.1xx.x:443 2x1.x06.2x2.2xx:54801 example.com -

I have an allow-nothing AuthorizationPolicy in place:

apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: allow-nothing
  namespace: istio-system
spec:
  action: ALLOW

The client connects to my service with plain TCP or TLS, which is routed to my application and also arrives at the istio-proxy:

[2023-11-03T03:38:01.312Z] "- - -" 0 - - - "-" 0 795 193 - "-" "-" "-" "-" "100.64.25.98:5223" inbound|5223|| 127.0.0.6:36119 100.64.25.98:5223 100.64.23.221:43680 outbound_.5223_._.ejabberd.ejabberd.svc.cluster.local -

However, the outgoing traffic that is sent in response to the incoming connection is blocked by the gateway. I tried to create another authorization policy to allow it, unfortunately without success:

apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: allow-ejabberd-to-gateway
  namespace: istio-system
spec:
  action: ALLOW
  rules:
  - when:
      - key: source.namespace
        values: ["ejabberd"]
      - key: source.principal
        values: ["cluster.local/ns/ejabberd/sa/ejabberd"]

I also tried specifying a “from” rule with the service account and/or namespace as the principal.

Both istiod and the gateway are in the istio-system namespace; the application is in the ejabberd namespace. If I remove the allow-nothing policy, connections work fine!
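
For reference, this is how I have been looking at the RBAC configuration Envoy actually programs on the gateway (the translated policies end up in an envoy.filters.network.rbac filter on the TCP listeners):

GW_POD=$(kubectl -n istio-system get pod -l istio=gateway -o jsonpath='{.items[0].metadata.name}')
# Note: with the default gateway chart the 443 service port may be remapped (e.g. to 8443 inside the pod)
istioctl proxy-config listener "$GW_POD" -n istio-system -o json | grep -B 2 -A 20 'envoy.filters.network.rbac'
# If this istioctl build still ships the experimental authz checker:
istioctl experimental authz check "$GW_POD" -n istio-system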

Any hints are very much appreciated!