[Question] Enable in-mesh pod to communicate with another in-mesh pod using PodIP (mTLS enabled)

What resources do I need to set up to enable two in-mesh pods to communicate with each other over Pod IP?

Specifically, I am facing issues with redis-sentinel trying to talk to the redis-master pod using the master’s Pod IP.

Sentinel logs

Current master is 100.96.238.73
Error: Protocol error, got "\x15" as reply type byte
Connecting to master failed.  Waiting...
Current master is 100.96.238.73
Error: Protocol error, got "\x15" as reply type byte
Connecting to master failed.  Waiting...
Current master is 100.96.238.73
Error: Protocol error, got "\x15" as reply type byte
Connecting to master failed.  Waiting...
$ kubectl -n mynamespace get svc redis-ha-master-svc
NAME                  TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
redis-ha-master-svc   ClusterIP   100.70.52.175   <none>        6379/TCP   3d23h

$ kubectl -n mynamespace get pod -l redis-role=master -o wide
NAME                READY   STATUS    RESTARTS   AGE     IP              NODE                                       NOMINATED NODE
redis-ha-server-0   4/4     Running   0          2d22h   100.96.238.73   ip-10-117-6-0.eu-west-1.compute.internal   <none>

From within the sentinel pod, I see failures when using the Pod IP, but it succeeds when using the Service IP:

export SVC_IP=100.70.52.175
export POD_IP=100.96.238.73

$ redis-cli -h $POD_IP -p 6379 PING 
Error: Protocol error, got "\x15" as reply type byte

$ redis-cli -h 100.70.52.175 -p 6379 PING 
PONG

Istio version: 1.0.5


Fundamentally, Istio is designed to work with services, not direct Pod IPs. It is a Service Mesh, after all.

If your application is designed to use Pod IPs directly, you have a couple of options:

  1. Disable Istio for this communication
  2. Add a new ServiceEntry manually that identifies the IP address and associates it with a “service.” In this ServiceEntry, set addresses to the pod IP, and resolution to NONE.
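For example, a ServiceEntry along these lines (a rough sketch only, using the redis-master Pod IP from the output above; the name is illustrative and this is not a tested configuration):

apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: redis-master-pod         # illustrative name
spec:
  hosts:
  - 100.96.238.73                # for opaque TCP, traffic is matched on addresses, not hosts
  addresses:
  - 100.96.238.73                # the Pod IP of redis-master
  ports:
  - number: 6379
    name: redis-port
    protocol: TCP
  resolution: NONE               # forward the connection to the original destination IP as-is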

Using (2) for a Pod IP is likely to be brittle. If the pod goes down and is rescheduled with a new IP, connectivity to it will break until you update the ServiceEntry.

@spikecurtis Do I also need a DestinationRule set up for the redis-master to receive the request from the redis-sentinel pod?

I added the ServiceEntry as follows (redis-master = 100.96.249.145),

apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  creationTimestamp: null
  name: pod2pod-1
  namespace: here-olp-publish-dev
  resourceVersion: "493397013"
spec:
  addresses:
  - 100.96.249.145
  hosts:
  - 100.96.249.145
  location: MESH_INTERNAL
  ports:
  - name: redis-port
    number: 6379
    protocol: TCP
---

and I can see it in the listeners section,

"100.67.229.91:443"
"100.96.249.145:6379"
"100.70.218.29:6379"

and in the config_dump as well,

{
          "name": "envoy.tcp_proxy",
          "config": {
           "cluster": "outbound|6379||100.96.249.145",
           "access_log": [
            {
             "name": "envoy.file_access_log",
             "config": {
              "path": "/dev/stdout",
              "format": "[%START_TIME%] %RESPONSE_FLAGS% %BYTES_RECEIVED% %BYTES_SENT% %DURATION% \"%UPSTREAM_HOST%\" %UPSTREAM_CLUSTER% %UPSTREAM_LOCAL_ADDRESS% %DOWNSTREAM_LOCAL_ADDRESS% %DOWNSTREAM_REMOTE_ADDRESS%\n"
             }
            }
           ],
           "stat_prefix": "outbound|6379||100.96.249.145"
          }
         }

yet it fails to connect. redis-cli fails with the following error:

/ $ redis-cli -p 6379 -h 100.96.249.145 ping
Error: Protocol error, got "\x15" as reply type byte
/ $ command terminated with exit code 137

I opened this issue with the stable/redis-ha Helm chart deployed on the Istio service mesh.

You need to set resolution: NONE on your ServiceEntry.
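In other words, the same ServiceEntry as above with the missing field added (a sketch; the server-generated metadata fields are omitted):

apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: pod2pod-1
  namespace: here-olp-publish-dev
spec:
  addresses:
  - 100.96.249.145
  hosts:
  - 100.96.249.145
  location: MESH_INTERNAL
  ports:
  - name: redis-port
    number: 6379
    protocol: TCP
  resolution: NONE               # the missing field: use the original destination address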

@spikecurtis So I created the ServiceEntry (with resolution: NONE). I am certain the proxy knows how to route to the destination, but it seems it does not use mTLS, even though location is set to MESH_INTERNAL. The failure message,

/ $ redis-cli -p 6379 -h 100.96.249.145 ping
Error: Protocol error, got "\x15" as reply type byte

arises from a failed mTLS negotiation: the "\x15" reply byte is the content-type byte of a TLS alert record, i.e. the plaintext redis-cli request is being answered with TLS.

Any ideas why MESH_INTERNAL would not use mTLS?

So I made a breakthrough and solved my last question by creating a DestinationRule with mode: ISTIO_MUTUAL. But now the problem is: how do we create a DestinationRule that could be a catch-all?
My ServiceEntry and DestinationRule look as follows (to capture across the cluster CIDR):

apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: pod2pod-1
spec:
  hosts:
  - 100.96.20.34
  addresses:
  - 100.96.0.0/11
  ports:
  - number: 6379
    name: redis-port
    protocol: TCP
  location: MESH_INTERNAL
  resolution: NONE
---
apiVersion: "networking.istio.io/v1alpha3"
kind: "DestinationRule"
metadata:
  name: pod2pod
spec:
  host: "100.96.20.34"
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL

But as you can observe, this applies only to 100.96.20.34. I want a DestinationRule that would apply to all of 100.96.0.0/11. Is that possible?

I don’t know that it is possible to get DestinationRules to apply to an IP prefix.

I think you’re hitting a bug in the way we handle IP addresses vs hostnames. I believe your ServiceEntry would work with something like:

apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: pod2pod-1
spec:
  hosts:
  - redis-cidr-service
  addresses:
  - 100.96.0.0/11
  ports:
  - number: 6379
    name: redis-port
    protocol: TCP
  location: MESH_INTERNAL
  resolution: NONE
---
apiVersion: "networking.istio.io/v1alpha3"
kind: "DestinationRule"
metadata:
  name: pod2pod
spec:
  host: redis-cidr-service
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL

In other words, swapping out the IP address “hostname” for a string.

@ZackButcher
I just tried this out. It didn’t work.

I did exactly as you pasted with the hostname instead of the IP and it failed to route to the Pod IP.

I switched it back to the explicit Pod IP in the DestinationRule and it worked.

With the hostname in the DestinationRule and ServiceEntry:


/ # (printf "PING\r\n";) | nc 100.96.20.38 6379
/ # (printf "PING\r\n";) | nc 100.96.20.38 6379
/ # (printf "PING\r\n";) | nc 100.96.20.38 6379
/ # (printf "PING\r\n";) | nc 100.96.20.38 6379

With the explicit IP address of the master pod in the DestinationRule host and ServiceEntry hosts fields:

/ # (printf "PING\r\n";) | nc 100.96.20.38 6379
+PONG
/ # (printf "PING\r\n";) | nc 100.96.20.38 6379
+PONG

I think the problem is that this is a TCP (non-HTTP) service - https://istio.io/docs/reference/config/istio.networking.v1alpha3/#ServiceEntry

For non-HTTP protocols such as mongo/opaque TCP/even HTTPS, the hosts will be ignored. If one or more IP addresses are specified, the incoming traffic will be identified as belonging to this service if the destination IP matches the IP/CIDRs specified in the addresses field.

Why do you care about IP addresses? In normal circumstances you are interested in routing traffic to the pods that implement/serve a certain service. In your DestinationRule you could define a subset that would match only the pods implementing the given service and route traffic to them no matter what their IP addresses are.

@lbudai Could you possibly elaborate on that with an example? I am not sure I follow/understand what subsets are.
My understanding was that I need to specify a host address (in this case the Pod IP, at the very least) to create a DestinationRule.

As was already mentioned before, a service mesh like Istio cares about services. For routing traffic between clients (consumers of the service) and providers (the pods that implement the service), Istio uses the following logic: the VirtualService or ServiceEntry objects define how a request for a certain service, as named by the consumer, is mapped to a service in the service registry. The DestinationRule then allows us to specify how to reach the pods that implement that service.

Here you can read about subsets: https://istio.io/docs/reference/config/istio.networking.v1alpha3/#Subset. Basically, a subset identifies a subset of the endpoints of a service, but in your case there is no service defined.
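For illustration only, assuming the pods were fronted by a regular Kubernetes Service such as the redis-ha-master-svc shown earlier and carried the redis-role label (a sketch, not a configuration tested against your setup), a subset-based DestinationRule could look like this:

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: redis-ha
spec:
  host: redis-ha-master-svc      # a Kubernetes Service name, not a Pod IP
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL
  subsets:
  - name: master
    labels:
      redis-role: master         # matches only the pods labelled as the Redis master

Traffic would then be routed to whatever pods currently carry that label, regardless of their IP addresses.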

For your case, I think the situation is a bit delicate. I suppose you are trying to build an HA Redis setup with multiple Redis instances and Sentinel, as described here: https://redislabs.com/redis-features/high-availability

If you’re trying to run Redis on Kubernetes, then also have a look at this: https://redislabs.com/redis-features/kubernetes

I am trying to achieve the same thing following your comment, but I am getting a “no healthy upstream” error.

I was able to get mTLS working with our cluster, which uses direct pod<->pod communication, by using option (2) per @spikecurtis’s reply above ([Question] Enable in-mesh pod to communicate with another in-mesh pod using PodIP (mTLS enabled)).

Are there any scaling concerns with this approach/workaround, for example if there were many pods in the cluster? I’m digging into the Envoy internals, and it looks like the routes/endpoints all go through the InboundPassthroughClusterIpv4 cluster, which seems fine. But I just wanted to see whether anyone has found a practical limit to this approach with TLS enabled and pod-to-pod communication.

I posted this and want to understand what you think about my approach.