Description
After recently hitting both #1310 and #1306 in production, I want to find a way to test kafka-python
against a constantly cycling Kafka cluster.
Those problems are both race conditions that prevent kafka-python
from successfully recovering when a broker is restarted. Unfortunately, unit tests rarely catch this, so I want to create a Kafka cluster that is constantly killing a broker, moving partition leadership to another broker, then restarting the killed broker, etc.
Ideally I'd run every new release for a couple of days against this before deploying it.
I can cobble something together with shell scripts and Docker, but wondered if anyone had better ideas?
I am not aware of a CI service that currently supports this model of running N number of cycles of cluster failures.