Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add support for for reconfiguration of ssl properties for metrics reporter #2206

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

countableSet
Copy link

Summary

  1. Why: To be able to reconfigure ssl truststore and keystore certificates when certificates need to be renewed without having to restart the broker to uptake new files.
  2. What: Adds reconfiguration support through the Reconfigurable interface, which is extended by MetricsReporter and implemented by CruiseControlMetricsReporter.

Expected Behavior

To clear up producer failure like

[2020-03-17 10:10:35,126] ERROR [Producer clientId=CruiseControlMetricsReporter] Connection to node 1026 (${broker}/${broker_ip}:9093) failed authentication due to: SSL handshake failed (org.apache.kafka.clients.NetworkClient)

Running kafka-configs.sh script with alter command should update the ssl properties for the metrics reporter producer.

kafka-configs.sh \
--bootstrap-server ${bootstrap-server} \
--command-config ssl.client.properties \
--entity-type brokers \
--entity-name 123 \
--alter \
--add-config listener.name.ssl.truststore.location=/opt/kafka/config/kafka-keystore.jks,listener.name.ssl.keystore.location=/opt/kafka/config/kafka-keystore.jks,cruise.control.metrics.reporter.ssl.keystore.location=/opt/kafka/config/kafka-keystore.jks,cruise.control.metrics.reporter.ssl.truststore.location=/opt/kafka/config/kafka-keystore.jks

Actual Behavior

Nothing happens and the producer keeps on failing with ssl handshake errors.

Steps to Reproduce

  1. Configure kafka with ssl listener and cruise control with ssl properties
  2. Rotate the listener certificates
  3. Producer starts failing as original certificates become invalid

Known Workarounds

Restart the broker to pick up any new changes.

Additional evidence

logs

[2020-03-17 10:10:35,126] ERROR [Producer clientId=CruiseControlMetricsReporter] Connection to node 1026 (${broker}/${broker_ip}:9093) failed authentication due to: SSL handshake failed (org.apache.kafka.clients.NetworkClient)

Categorization

  • documentation
  • bugfix
  • new feature (?)
  • refactor
  • security/CVE
  • other

This PR resolves #1148


some clarification on the changes

  1. why new property cruise.control.metrics.reporter.force.reconfigure=1728690883 is added

Even if support is added to cruise control to reload the configuration, the reload mechanism in kafka doesn't reload in place configurations (aka the key and value are the same as the current configuration)

the listener update flow only considers listener.name.whatever properties and doesn't apply anything else; https://github.com/apache/kafka/blob/2.6.1/core/src/main/scala/kafka/server/DynamicBrokerConfig.scala#L312-L331 and will always update even if there is no config difference

/**
* All config updates through ZooKeeper are triggered through actual changes in values stored in ZooKeeper.
* For some configs like SSL keystores and truststores, we also want to reload the store if it was modified
* in-place, even though the actual value of the file path and password haven't changed. This scenario alone
* is handled here when a config update request using admin client is processed by AdminManager. If any of
* the SSL configs have changed, then the update will not be done here, but will be handled later when ZK
* changes are processed. At the moment, only listener configs are considered for reloading.
*/
private[server] def reloadUpdatedFilesWithoutConfigChange(newProps: Properties): Unit = CoreUtils.inWriteLock(lock) {

thus, configuration can't be forced the update without the value of the config actually changing, too, so in place configuration wouldn't trigger the reconfigure since the location/filename or password didn't change, but that doesn't mean the contents of the truststore or keystore didn't change

kafka-configs \
--bootstrap-server localhost:29093 \
--command-config ./server.ssl.properties \
--entity-type brokers \
--entity-name 1 \
--alter \
--add-config listener.name.ssl.ssl.keystore.location=/etc/pki/server.keystore.jks,listener.name.ssl.ssl.keystore.password=password,listener.name.ssl.key.password=password,listener.name.ssl.ssl.truststore.location=/etc/pki/server.truststore.jks,listener.name.ssl.ssl.truststore.password=password,cruise.control.metrics.reporter.ssl.keystore.location=/etc/pki/server.keystore.jks,cruise.control.metrics.reporter.ssl.keystore.password=password,cruise.control.metrics.reporter.ssl.truststore.location=/etc/pki/server.truststore.jks,cruise.control.metrics.reporter.ssl.truststore.password=password

this logic is gated by updatedConfigs logic here: https://github.com/apache/kafka/blob/67df232bec296eaf76650b01088049c11aa2a168/core/src/main/scala/kafka/server/DynamicBrokerConfig.scala#L531

/**
* Returns the change in configurations between the new props and current props by returning a
* map of the changed configs, as well as the set of deleted keys
*/
private def updatedConfigs(newProps: java.util.Map[String, _], currentProps: java.util.Map[String, _]):

and doesn't run the update flow unless there is a config difference, therefore adding a new field that can change without consequences to trigged the reconfiguration to run if an in place change is done.

  1. why is only ssl properties implemented and not sasl too?

I don't know too much about configuration for sasl (or testing it), so I only implemented changes for ssl. But I don't see why sasl couldn't be added on top of these changes.

  1. no lock on the producer?

Went back and forth on this, I suppose can add something since closing and re-creating the producer can cause issues in the reporting thread (aka producer closed exception) but everything is wrapped in a try/catch so it should just continue on and try again. Up to you guys what how you feel about this.

  1. when running kafka-configs.sh alter cmd, the listener.name.etc is also required All sensitive broker config entries must be specified for --alter, missing entries

I don't think there is a way around this.

❯ ./alter-config
++ dirname ./alter-config
+ PKI=.
+ CRUISE_CONFIG=cruise.control.metrics.reporter.ssl.keystore.location=/etc/pki/server.keystore.jks,cruise.control.metrics.reporter.ssl.keystore.password=password,cruise.control.metrics.reporter.ssl.truststore.location=/etc/pki/server.truststore.jks,cruise.control.metrics.reporter.ssl.truststore.password=password
+ CONFIG=listener.name.ssl.ssl.keystore.location=/etc/pki/server.keystore.jks,listener.name.ssl.ssl.keystore.password=password,listener.name.ssl.key.password=password,listener.name.ssl.ssl.truststore.location=/etc/pki/server.truststore.jks,listener.name.ssl.ssl.truststore.password=password
+ kafka-configs --bootstrap-server localhost:29093 --command-config ./server.ssl.properties --entity-type brokers --entity-name 1 --alter --add-config cruise.control.metrics.reporter.ssl.keystore.location=/etc/pki/server.keystore.jks,cruise.control.metrics.reporter.ssl.keystore.password=password,cruise.control.metrics.reporter.ssl.truststore.location=/etc/pki/server.truststore.jks,cruise.control.metrics.reporter.ssl.truststore.password=password
All sensitive broker config entries must be specified for --alter, missing entries: Set(listener.name.ssl.ssl.keystore.password, listener.name.ssl.ssl.truststore.password, listener.name.ssl.key.password)

or the other way around too.

❯ ./alter-config
++ dirname ./alter-config
+ PKI=.
++ date +%s
+ CRUISE_CONFIG=cruise.control.metrics.reporter.force.reconfigure=1728691256,cruise.control.metrics.reporter.ssl.keystore.location=/etc/pki/server.keystore.jks,cruise.control.metrics.reporter.ssl.keystore.password=password,cruise.control.metrics.reporter.ssl.truststore.location=/etc/pki/server.truststore.jks,cruise.control.metrics.reporter.ssl.truststore.password=password
+ CONFIG=listener.name.ssl.ssl.keystore.location=/etc/pki/server.keystore.jks,listener.name.ssl.ssl.keystore.password=password,listener.name.ssl.key.password=password,listener.name.ssl.ssl.truststore.location=/etc/pki/server.truststore.jks,listener.name.ssl.ssl.truststore.password=password
+ kafka-configs --bootstrap-server localhost:29093 --command-config ./server.ssl.properties --entity-type brokers --entity-name 1 --alter --add-config listener.name.ssl.ssl.keystore.location=/etc/pki/server.keystore.jks,listener.name.ssl.ssl.keystore.password=password,listener.name.ssl.key.password=password,listener.name.ssl.ssl.truststore.location=/etc/pki/server.truststore.jks,listener.name.ssl.ssl.truststore.password=password
All sensitive broker config entries must be specified for --alter, missing entries: Set(cruise.control.metrics.reporter.ssl.keystore.location, cruise.control.metrics.reporter.ssl.truststore.password, cruise.control.metrics.reporter.ssl.keystore.password, cruise.control.metrics.reporter.force.reconfigure, cruise.control.metrics.reporter.ssl.truststore.location)

so this change could break users alter command, now needing to include cruise control ssl properties now too; https://github.com/apache/kafka/blob/dd71437de7675d92ad3e4ed01ac3ee11bf5da99d/core/src/main/scala/kafka/admin/ConfigCommand.scala#L343-L346

@jlisam
Copy link

jlisam commented Nov 14, 2024

@mhratson @viktorsomogyi 👋 would you able to take a look at this?

@akaltsikis
Copy link

Also interested in this fix. Is there anything we can do to make this happen ?

@akaltsikis
Copy link

@mhratson @viktorsomogyi bumping the above 🙏

@Ownercz
Copy link

Ownercz commented Jan 10, 2025

Hi, we would also appreciate this feature to be merged. 🙏

@akaltsikis
Copy link

In the future maybe the https://cwiki.apache.org/confluence/display/KAFKA/KIP-1119%3A+Add+support+for+SSL+hot+reload will useful for that (but that needs to be agreed and to bemerged in Kafka, before we even consider this useful) . I would prefer to have this solution merged till then and then we reconsider

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Dynamically reload metricsreporter keystore (and truststore)
4 participants