Expired root certificate

Hi,

We are currently running Istio v1.0.4 on our Kubernetes cluster and recently hit an issue whenever we tried to deploy new charts via Helm:

Error: release aged-bumblebee failed: Internal error occurred: failed calling admission webhook “pilot.validation.istio.io”: Post …/admitpilot?timeout=30s: x509: certificate has expired or is not yet valid

I assume(!) this happened because the certificate generated by Citadel had expired, since the istio-ca-secret was more than one year old:

NAME              TYPE               DATA   AGE
istio-ca-secret   istio.io/ca-root   2      1y
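
The secret's age only hints at expiry; the certificate's own end date is what decides it. As a rough sketch (not part of Istio), the Python below takes the `notAfter=` line that `openssl x509 -noout -enddate` prints for a decoded certificate and reports how many days are left. The helper names `parse_enddate` and `days_until_expiry` and the sample date are made up for illustration; extracting and decoding the certificate from istio-ca-secret is assumed to have been done beforehand.

```python
import ssl
from datetime import datetime, timezone


def parse_enddate(line: str) -> datetime:
    """Parse a line such as 'notAfter=Mar  5 12:00:00 2020 GMT'
    (the format printed by `openssl x509 -noout -enddate`)
    into a timezone-aware UTC datetime."""
    prefix = "notAfter="
    text = line.strip()
    if not text.startswith(prefix):
        raise ValueError(f"expected a line starting with {prefix!r}: {line!r}")
    # ssl.cert_time_to_seconds understands the 'Mon DD HH:MM:SS YYYY GMT' form.
    seconds = ssl.cert_time_to_seconds(text[len(prefix):])
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def days_until_expiry(not_after: datetime, now: datetime) -> int:
    """Days left before not_after, rounded down; negative once expired."""
    return (not_after - now).days


if __name__ == "__main__":
    # Hypothetical sample value; paste the real openssl output instead.
    sample = "notAfter=Mar  5 12:00:00 2020 GMT"
    end = parse_enddate(sample)
    print(days_until_expiry(end, datetime.now(timezone.utc)))
```

A negative result would be consistent with the "certificate has expired" half of the x509 error above; a small positive number means a rotation is due soon.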

To solve the problem I removed Istio and redeployed the chart (yeah - I was desperate :wink: ). But the error still occurred. I finally ended up removing the istio-ca-secret (which obviously doesn’t get removed when removing the chart), which caused Citadel to recreate it.

Now my questions are:

  1. Are my assumptions correct?
  2. If yes, shouldn’t there be something like an automatic certificate renewal process offered by Istio?

Thanks in advance.

Best regards
Sebastian

Interesting. The default self-signed root is valid for 90 days, I believe?

Root rotation is not a high priority for now, but it is something we need to support to reduce the kind of friction you mention here.

As a workaround, you can customize the installation with a longer root TTL; I believe a real production root is definitely valid for longer than 90 days.

@Oliver fyi

Hi Seb,

Sorry for the late reply. Have you been running Istio Citadel for more than 1 year? We had planned to enable an automatic root cert renewal process, but it’s not implemented yet due to our limited bandwidth. Currently, removing the istio-ca-secret and redeploying Citadel solves the problem (with downtime for the workloads).

Actually, we consider self-signed certs to be less common in production environments, where users usually maintain root certs offline and plug an intermediate cert into Citadel. In that case, users just inject a new intermediate cert in order to rotate.

Does this make sense?

Hi Oliver,

Sorry for my late response. Yes, this totally makes sense. Actually, I just wanted some kind of confirmation that you are at least aware of this. As you mentioned, removing the istio-ca-secret solved the issue. And yes, Citadel had been running for more than one year :wink:

Best regards,
Sebastian

Hi Seb,

Just FYI, we have published an announcement on how to renew the root certificate without downtime. The new root certificate is valid for 1 year:

Hi Oliver,

As I understand it, if I generate a new root certificate (before the old one expires) and restart Citadel, everything should stay fully available, without any downtime.
Can you please explain how the sidecar gets the updated root certificate? How does this happen without restarting it? I ask because I saw that the sidecar container in the pod stays up.

Best regards,
Hany

The Envoy sidecar will do a hot restart: https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/operations/hot_restart. You can read more here: https://istio.io/docs/ops/security/root-transition/#root-transition-procedure