I’m trying to spin up a new set of K8S clusters on AWS (using kops, clusters at 1.14.1) with Istio 1.3.1, and I’m having a few issues getting pods to communicate.
The TLDR on this issue is the following:
I have two separate namespaces: in one I have a mongo cluster with replica sets across multiple shards, and in the second I have a simpler set of mongo replicas (not sharded).
It seems whichever one I set up first works properly, and the other one fails to connect (i.e. I cannot successfully run mongo’s rs.initiate() command).
I cannot find any supporting details on how to debug such an issue and would love some direction on this; I can provide additional logs as needed.
Setup
Without being overly verbose, here is what I’m doing:
- I follow the multi-cluster, replicated control plane setup b/c eventually this will be one of many clusters - link.
- I created the istio-system namespace and the CA certs (and confirmed that cluster-to-cluster communication via mTLS worked with the ServiceEntry + sleep/httpbin example, so I know Istio is working in some capacity), and then loaded in istio-init and the Istio helm-generated files:
i. I got istio via: curl -L https://git.io/getLatestIstio | ISTIO_VERSION=1.3.1 sh -
ii. I set up the init properly and confirmed the CRDs, etc…
iii. I used the following config for setup: helm template install/kubernetes/helm/istio --name istio --namespace istio-system -f install/kubernetes/helm/istio/example-values/values-istio-multicluster-gateways.yaml --set global.mtls.enabled=true --set grafana.enabled=true --set tracing.enabled=true --set tracing.provider=zipkin > istio-multicluster.yaml
iv. I wait for all pods in istio-system to be in a ready state.
- I set up the DNS stub domain for eventual ServiceEntry stubbing:
kubectl apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-dns
  namespace: kube-system
data:
  stubDomains: |
    {"global": ["$(kubectl get svc -n istio-system istiocoredns -o jsonpath={.spec.clusterIP})"]}
EOF
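For context, the cross-cluster sleep/httpbin check I mentioned used a ServiceEntry shaped like the one in the multicluster gateways docs; roughly this (the gateway address is a placeholder here, not my real value):
kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: httpbin-bar
spec:
  hosts:
  # must be <name>.<namespace>.global so the kube-dns stub above hands it to istiocoredns
  - httpbin.bar.global
  # remote clusters share the same root of trust, so treat the service as mesh-internal
  location: MESH_INTERNAL
  ports:
  - name: http1
    number: 8000
    protocol: http
  resolution: DNS
  addresses:
  # arbitrary, non-routable VIP; traffic to it is captured by the sidecar
  - 240.0.0.2
  endpoints:
  # placeholder: routable address of the remote cluster's istio-ingressgateway
  - address: ${CLUSTER2_GW_ADDR}
    ports:
      http1: 15443
EOF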
- I set up both namespaces with the istio-injection: enabled label set properly (see the kubectl label sketch further down).
- I launch a replica stateful set for mongo.
Config for the statefulset and service here:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongod-capture
  labels:
    app: mongod-capture
  namespace: capt-db
spec:
  serviceName: mongod-capture
  selector:
    matchLabels:
      app: mongod-capture
  replicas: 3
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: mongod-capture
        replicaset: rs0
        cluster: mendota
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: replicaset
                  operator: In
                  values:
                  - rs0
              topologyKey: kubernetes.io/hostname
      terminationGracePeriodSeconds: 10
      containers:
      - name: main
        image: mongo:4.0.10
        command:
        - "mongod"
        - "--port"
        - "27017"
        - "--bind_ip"
        - "0.0.0.0"
        - "--auth"
        - "--wiredTigerCacheSizeGB"
        - "0.5"
        - "--replSet"
        - "rs0"
        - "--keyFile"
        - "/etc/db-keys/keys"
        resources:
          requests:
            cpu: 50m
            memory: 100Mi
        ports:
        - containerPort: 27017
        volumeMounts:
        - name: mongod-capture-persistent-storage
          mountPath: /data/db
        - name: db-keys-internal-capt
          mountPath: "/etc/db-keys"
          readOnly: true
      nodeSelector:
        kops.k8s.io/instancegroup: capt-db
      priorityClassName: db-config
      volumes:
      - name: db-keys-internal-capt
        secret:
          secretName: db-keys-internal-capt
          defaultMode: 0400
  volumeClaimTemplates:
  - metadata:
      name: mongod-capture-persistent-storage
      annotations:
        volume.beta.kubernetes.io/storage-class: aws-hdd-db
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 50Gi
---
# Service def:
apiVersion: v1
kind: Service
metadata:
  name: mongod-capture
  labels:
    name: mongod-capture
    cluster: mendota
  namespace: capt-db
spec:
  ports:
  - name: "mongo"
    port: 27017
    targetPort: 27017
  clusterIP: None
  selector:
    app: mongod-capture
The config above is for the simpler replicated-but-not-sharded cluster. There is a separate set of configs I use for the sharded mongo cluster (mostly just more stateful sets, mongos routers, config server instances, etc…).
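For reference, I applied the istio-injection label to both namespaces with something like the following (namespace names as used in the configs here):
# enable automatic sidecar injection in both mongo namespaces
kubectl label namespace capt-db istio-injection=enabled
kubectl label namespace prod-db istio-injection=enabled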
Essentially, when I run the rs.initiate command, which looks like this for example:
rs.initiate({
  "configsvr": true,
  "_id": "ConfigDBRepSet",
  "members": [
    {"host": "mongod-configdb-0.mongod-configdb.prod-db.svc.cluster.local:27017", "_id": 0},
    {"host": "mongod-configdb-1.mongod-configdb.prod-db.svc.cluster.local:27017", "_id": 1},
    {"host": "mongod-configdb-2.mongod-configdb.prod-db.svc.cluster.local:27017", "_id": 2}
  ]
})
the first cluster I run it on works, and the second one fails with the following message:
{
    "ok" : 0,
    "errmsg" : "replSetInitiate quorum check failed because not all proposed set members responded affirmatively: mongod-configdb-2.mongod-configdb.prod-db.svc.cluster.local:27017 failed with Connection reset by peer, mongod-configdb-1.mongod-configdb.prod-db.svc.cluster.local:27017 failed with Connection reset by peer",
    "code" : 74,
    "codeName" : "NodeNotFound",
    "$gleStats" : {
        "lastOpTime" : Timestamp(0, 0),
        "electionId" : ObjectId("000000000000000000000000")
    },
    "lastCommittedOpTime" : Timestamp(0, 0)
}
If you look at the mongo logs on the failing instance after doing this, the following happens:
2019-10-13T19:24:41.480+0000 I NETWORK [listener] connection accepted from 127.0.0.1:54152 #1 (1 connection now open)
2019-10-13T19:24:41.483+0000 I NETWORK [conn1] end connection 127.0.0.1:54152 (0 connections now open)
2019-10-13T19:24:41.692+0000 I NETWORK [listener] connection accepted from 127.0.0.1:54158 #2 (1 connection now open)
2019-10-13T19:24:41.693+0000 I NETWORK [conn2] end connection 127.0.0.1:54158 (0 connections now open)
2019-10-13T19:24:41.814+0000 I NETWORK [listener] connection accepted from 127.0.0.1:54162 #3 (1 connection now open)
2019-10-13T19:24:41.817+0000 I NETWORK [conn3] end connection 127.0.0.1:54162 (0 connections now open)
2019-10-13T19:24:42.015+0000 I NETWORK [listener] connection accepted from 127.0.0.1:54164 #4 (1 connection now open)
2019-10-13T19:24:42.017+0000 I NETWORK [conn4] end connection 127.0.0.1:54164 (0 connections now open)
2019-10-13T19:24:42.027+0000 I NETWORK [listener] connection accepted from 127.0.0.1:54166 #5 (1 connection now open)
Whereas the one that works properly looks like this:
2019-10-13T19:20:19.268+0000 I REPL [conn3] replSetInitiate config object with 3 members parses ok
2019-10-13T19:20:19.268+0000 I ASIO [Replication] Connecting to mongod-capture-1.mongod-capture.capt-db.svc.cluster.local:27017
2019-10-13T19:20:19.269+0000 I ASIO [Replication] Connecting to mongod-capture-2.mongod-capture.capt-db.svc.cluster.local:27017
2019-10-13T19:20:19.280+0000 I NETWORK [listener] connection accepted from 127.0.0.1:58148 #8 (2 connections now open)
When I exec into the one that doesn’t work and try to access the other replicas via mongo, here is what it says:
mongo mongodb://mongod-configdb-0.mongod-configdb.prod-db.svc.cluster.local:27017
MongoDB shell version v4.0.10
connecting to: mongodb://mongod-configdb-0.mongod-configdb.prod-db.svc.cluster.local:27017/?gssapiServiceName=mongodb
2019-10-13T20:29:49.991+0000 E QUERY [js] Error: network error while attempting to run command 'isMaster' on host 'mongod-configdb-0.mongod-configdb.prod-db.svc.cluster.local:27017' :
connect@src/mongo/shell/mongo.js:344:17
@(connect):2:6
exception: connect failed
However, DNS between them is working properly; when I dig I get the proper IP:
dig mongod-configdb-0.mongod-configdb.prod-db.svc.cluster.local
; <<>> DiG 9.10.3-P4-Ubuntu <<>> mongod-configdb-0.mongod-configdb.prod-db.svc.cluster.local
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 30833
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;mongod-configdb-0.mongod-configdb.prod-db.svc.cluster.local. IN A
;; ANSWER SECTION:
mongod-configdb-0.mongod-configdb.prod-db.svc.cluster.local. 2 IN A 100.96.101.11
Also, I have confirmed that for the mongo cluster that doesn’t work, if I destroy the whole thing, remove Istio from that namespace, and start it back up, it works properly. (However, if I then try to add Istio back to the namespace and perform a rolling update to convert it into istio-enabled pods, that doesn’t work either - but that could be for other reasons I’m not thinking about around istio-to-non-istio pod communication.)
Also, I have tried this twice so far: the first time I set up the primary sharded cluster first and it worked perfectly while the replicated one failed; then I destroyed the whole cluster (Istio and all), did it in reverse, and found the replicated one worked and the sharded one failed. So I’m not sure if there is something odd going on there either.
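If it helps, I can also attach what the sidecars think is going on - for example something like the following (assuming I have the istioctl 1.3 syntax right; the pod/service names are the failing config servers from above):
# what the mesh thinks the mTLS client/server settings are for the failing service
istioctl authn tls-check mongod-configdb-0.prod-db mongod-configdb.prod-db.svc.cluster.local
# the Envoy listener config the failing pod's sidecar actually received
istioctl proxy-config listeners mongod-configdb-0.prod-db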