TiDB operator cannot update nor contact the TiDB cluster after the TLS config is changed #5728
Comments
@kos-team In the above example, the root cause is that after changing from TLS to non-TLS, the operator is still using a TLS-enabled client to connect to PD, right? Can you work around it by restarting tidb-controller-manager?
@csuzhangxc The root cause is that the operator switched to a non-TLS client while PD is still running with TLS. This causes the calls to PD to fail. I believe restarting tidb-operator won't help.
In your case, after updating the TidbCluster CR from TLS to non-TLS, is PD still using TLS? Is the ConfigMap of PD updated? If the ConfigMap is updated, this may be related to #5708.
The ConfigMap is not updated.
which is before where the ConfigMap is updated:
And as mentioned in the issue report, the reason that
Got it! Let's think about how to resolve it with minimal changes.
Bug Report
What version of Kubernetes are you using?
1.28.1
What version of TiDB Operator are you using?
1.6.0
What storage classes exist in the Kubernetes cluster and what are used for PD/TiKV pods?
standard
What's the status of the TiDB cluster pods?
What did you do?
We tried to change the TLS configuration of the TiDB cluster, but found that the operator fails to contact the TiDB cluster after we change the tlsCluster.enabled option. This bug happens both when we change it from true to false and when we change it from false to true.
The root cause is a mismatch between the TLS configuration of the operator's PD client and the TLS configuration the server is running with.
To reproduce this bug:
1. Set tlsCluster.enabled to true.
2. Change tlsCluster.enabled to false.
What did you expect to see?
The operator should be able to reconfigure the TLS config for the TiDB cluster.
What did you see instead?
The operator fails with the error logs:
The root cause is the PD client's TLS configuration. After the tlsCluster configuration is changed, the operator switches to a client whose TLS configuration differs from the one the TiDB cluster is currently running with. The health check then fails with a client error, even though the cluster itself is healthy.
https://github.com/pingcap/tidb-operator/blob/ff467a6e7a0563b31a3ace2fe5d060774012780d/pkg/pdapi/pd_control.go#L241C1-L252C3