I am trying to enable mTLS in a mesh that already works with Istio’s sidecars.
The problem is that connections only work up to a certain point in the chain, and then they fail.
This is how the services are set up right now with my failing implementation of mTLS (simplified):
Istio IngressGateway -> NGINX pod -> API Gateway -> Service A -> [ Database ] -> Service B
The first thing to note is that I was using an NGINX pod as a load balancer to proxy_pass my requests to either my API Gateway or my frontend page. I tried keeping that setup without the Istio IngressGateway, but I wasn’t able to make it work. Then I tried using the Istio IngressGateway to route directly to the API Gateway with a VirtualService, but that also failed. So I’m leaving it like this for the moment, because it is the only setup in which my requests reached the API Gateway successfully.
Another thing to note is that Service A first connects to a Database outside the mesh and then makes a request to Service B, which is inside the mesh with mTLS enabled.
NGINX, API Gateway, Service A and Service B are within the mesh with mTLS enabled, and “istioctl authn tls-check” shows that the status is OK.
NGINX and API Gateway are in a namespace called “gateway”, Database is in “auth” and Service A and Service B are in another one called “api”.
Istio IngressGateway is in namespace “istio-system” right now.
So the problem is that everything works if I set STRICT mode on the gateway namespace and PERMISSIVE on api, but once I set STRICT on api, I see the request reach Service A, which then fails to send its request to Service B, and the call returns a 500.
This is the output I see in the istio-proxy container of the Service A pod when it fails:
api/serviceA[istio-proxy]: [2019-09-02T12:59:55.366Z] "- - -" 0 - "-" "-" 1939 0 2 - "-" "-" "-" "-" "10.20.208.248:4567" outbound|4567||database.auth.svc.cluster.local 10.20.128.44:35366 10.20.208.248:4567 10.20.128.44:35364 -
api/serviceA[istio-proxy]: [2019-09-02T12:59:55.326Z] "POST /api/my-call HTTP/1.1" 500 - "-" "-" 74 90 60 24 "10.90.0.22, 127.0.0.1, 127.0.0.1" "PostmanRuntime/7.15.0" "14d93a85-192d-4aa7-aa45-1501a71d4924" "serviceA.api.svc.cluster.local:9090" "127.0.0.1:9090" inbound|9090|http-serviceA|serviceA.api.svc.cluster.local - 10.20.128.44:9090 127.0.0.1:0 outbound_.9090_._.serviceA.api.svc.cluster.local
There are no log messages in Service B, though.
Currently, I do not have a global MeshPolicy; I am setting a Policy and a DestinationRule per namespace.
Policy:
apiVersion: "authentication.istio.io/v1alpha1"
kind: "Policy"
metadata:
  name: "default"
  namespace: gateway
spec:
  peers:
  - mtls:
      mode: STRICT
---
apiVersion: "authentication.istio.io/v1alpha1"
kind: "Policy"
metadata:
  name: "default"
  namespace: auth
spec:
  peers:
  - mtls:
      mode: STRICT
---
apiVersion: "authentication.istio.io/v1alpha1"
kind: "Policy"
metadata:
  name: "default"
  namespace: api
spec:
  peers:
  - mtls:
      mode: STRICT
DestinationRule:
apiVersion: "networking.istio.io/v1alpha3"
kind: "DestinationRule"
metadata:
  name: "mutual-gateway"
  namespace: "gateway"
spec:
  host: "*.gateway.svc.cluster.local"
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL
---
apiVersion: "networking.istio.io/v1alpha3"
kind: "DestinationRule"
metadata:
  name: "mutual-api"
  namespace: "api"
spec:
  host: "*.api.svc.cluster.local"
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL
---
apiVersion: "networking.istio.io/v1alpha3"
kind: "DestinationRule"
metadata:
  name: "mutual-auth"
  namespace: "auth"
spec:
  host: "*.auth.svc.cluster.local"
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL
Then I have some DestinationRules to disable mTLS for the Database (there are other services in the same namespace for which I want mTLS enabled) and for the Kubernetes API:
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: "myDatabase"
  namespace: "auth"
spec:
  host: "database.auth.svc.cluster.local"
  trafficPolicy:
    tls:
      mode: DISABLE
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: "k8s-api-server"
  namespace: default
spec:
  host: "kubernetes.default.svc.cluster.local"
  trafficPolicy:
    tls:
      mode: DISABLE
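For reference, a DestinationRule only controls the client side of a connection. On the server side, the namespace-wide STRICT Policy in auth would still apply to the Database if its pod had a sidecar injected. Below is a minimal sketch of a service-specific Policy that would exempt it. It assumes the Kubernetes Service is named database (inferred from the host above), the Policy name database-no-mtls is hypothetical, and none of this is applied in the setup described here:

```yaml
# Hypothetical sketch, not part of the configuration shown in this post.
apiVersion: "authentication.istio.io/v1alpha1"
kind: "Policy"
metadata:
  name: "database-no-mtls"   # hypothetical name
  namespace: "auth"
spec:
  targets:
  - name: database            # assumed Service name, inferred from database.auth.svc.cluster.local
  # No peers section: peer authentication (mTLS) is not required for this target.
```

If the Database pod has no sidecar at all, a Policy like this has no effect on it and only the client-side DestinationRule matters.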
Then I have my IngressGateway like so:
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: ingress-gateway
  namespace: istio-system
spec:
  selector:
    istio: ingressgateway # use istio default ingress gateway
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - my-api.example.com
    tls:
      httpsRedirect: true # sends 301 redirect for http requests
  - port:
      number: 443
      name: https
      protocol: HTTPS
    tls:
      mode: SIMPLE
      serverCertificate: /etc/istio/ingressgateway-certs/tls.crt
      privateKey: /etc/istio/ingressgateway-certs/tls.key
    hosts:
    - my-api.example.com
And lastly, my VirtualServices:
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: ingress-nginx
  namespace: gateway
spec:
  hosts:
  - my-api.example.com
  gateways:
  - ingress-gateway.istio-system
  http:
  - match:
    - uri:
        prefix: /
    route:
    - destination:
        port:
          number: 80
        host: ingress.gateway.svc.cluster.local # this is NGINX pod
    corsPolicy:
      allowOrigin:
      - my-api.example.com
      allowMethods:
      - POST
      - GET
      - DELETE
      - PATCH
      - OPTIONS
      allowCredentials: true
      allowHeaders:
      - "*"
      maxAge: "24h"
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: api-gateway
  namespace: gateway
spec:
  hosts:
  - my-api.example.com
  - api-gateway.gateway.svc.cluster.local
  gateways:
  - mesh
  http:
  - match:
    - uri:
        prefix: /
    route:
    - destination:
        port:
          number: 80
        host: api-gateway.gateway.svc.cluster.local
    corsPolicy:
      allowOrigin:
      - my-api.example.com
      allowMethods:
      - POST
      - GET
      - DELETE
      - PATCH
      - OPTIONS
      allowCredentials: true
      allowHeaders:
      - "*"
      maxAge: "24h"
One thing I don’t understand is why I have to create a VirtualService for my API Gateway, and why I have to use “mesh” in its gateways block. If I remove this block, the request never reaches the API Gateway; if I keep it, it works and my requests even reach the next service (Service A), but not the one after that.
Thanks for the help. I am really stuck with this.