Sentry pods intermittently lose connection with kafka/redis pods #1639

Open
1 task done
milosara78 opened this issue Dec 17, 2024 · 0 comments

Issue submitter TODO list

  • I've searched for already existing issues here

Describe the bug (actual behavior)

I run a self-hosted Sentry deployed through the Helm chart, using chart version sentry-21.6.3 and app version 24.2.0.
Intermittently, the Sentry web page shows no events. When I check the pods and their logs at those times, a few pods (such as the ingestion occurrences or replay pods) show connection errors, or sometimes suddenly report that no partitions are assigned.

%4|1734382758.710|CONFWARN|rdkafka#producer-1| [thrd:app]: Configuration property allow.auto.create.topics is a consumer property and will be ignored by this producer instance
20:59:20 [INFO] sentry.batching-kafka-consumer: New partitions assigned: []
21:01:41 [INFO] arroyo.processing.processor: Waiting for <arroyo.processing.strategies.run_task_with_multiprocessing.RunTaskWithMultiprocessing object at 0x7c5c05733650> to exit...
21:01:41 [INFO] arroyo.processing.processor: <arroyo.processing.strategies.run_task_with_multiprocessing.RunTaskWithMultiprocessing object at 0x7c5c05733650> exited successfully, releasing assignment.
21:01:41 [INFO] arroyo.processing.processor: Partition revocation complete.
21:01:42 [INFO] arroyo.processing.processor: New partitions assigned: {Partition(topic=Topic(name='metrics-subscription-results'), index=0): 0}
%6|1734297531.447|FAIL|rdkafka#consumer-1| [thrd:GroupCoordinator]: GroupCoordinator: sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092: Disconnected (after 1043223ms in state UP)
%3|1734297531.449|FAIL|rdkafka#consumer-1| [thrd:GroupCoordinator]: GroupCoordinator: sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092: Connect to ipv4#10.0.194.37:9092 failed: Connection refused (after 1ms in state CONNECT)
21:19:24 [INFO] arroyo.processing.processor: Partitions to revoke: [Partition(topic=Topic(name='metrics-subscription-results'), index=0)]
21:19:24 [INFO] arroyo.processing.processor: Closing <arroyo.processing.strategies.run_task_with_multiprocessing.RunTaskWithMultiprocessing object at 0x7c5c040abd50>...
21:19:24 [INFO] arroyo.processing.processor: Waiting for <arroyo.processing.strategies.run_task_with_multiprocessing.RunTaskWithMultiprocessing object at 0x7c5c040abd50> to exit...
21:19:24 [INFO] arroyo.processing.processor: <arroyo.processing.strategies.run_task_with_multiprocessing.RunTaskWithMultiprocessing object at 0x7c5c040abd50> exited successfully, releasing assignment.
21:19:24 [INFO] arroyo.processing.processor: Partition revocation complete.
21:19:24 [INFO] arroyo.processing.processor: New partitions assigned: {Partition(topic=Topic(name='metrics-subscription-results'), index=0): 0}
21:21:33 [INFO] arroyo.processing.processor: Partitions to revoke: [Partition(topic=Topic(name='metrics-subscription-results'), index=0)]
21:21:33 [INFO] arroyo.processing.processor: Closing <arroyo.processing.strategies.run_task_with_multiprocessing.RunTaskWithMultiprocessing object at 0x7c5c040ab390>...
21:21:33 [INFO] arroyo.processing.processor: Waiting for <arroyo.processing.strategies.run_task_with_multiprocessing.RunTaskWithMultiprocessing object at 0x7c5c040ab390> to exit...
21:21:33 [INFO] arroyo.processing.processor: <arroyo.processing.strategies.run_task_with_multiprocessing.RunTaskWithMultiprocessing object at 0x7c5c040ab390> exited successfully, releasing assignment.
21:21:33 [INFO] arroyo.processing.processor: Partition revocation complete.
21:21:33 [INFO] arroyo.processing.processor: New partitions assigned: {Partition(topic=Topic(name='metrics-subscription-results'), index=0): 0}
%6|1734377921.254|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092/0: Disconnected (after 81431473ms in state UP)
%3|1734377921.256|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092/0: Connect to ipv4#10.0.196.45:9092 failed: Connection refused (after 2ms in state CONNECT)
%3|1734377921.460|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092/0: Connect to ipv4#10.0.196.45:9092 failed: Connection refused (after 0ms in state CONNECT, 1 identical error(s) suppressed)
%3|1734377922.334|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092/0: Connect to ipv4#10.0.196.45:9092 failed: No route to host (after 3ms in state CONNECT)
%4|1734377974.420|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092/0: Connection setup timed out in state CONNECT (after 30031ms in state CONNECT)
%3|1734377974.429|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092/0: Failed to resolve 'sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 8ms in state CONNECT)
%3|1734377975.435|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092/0: Failed to resolve 'sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 5ms in state CONNECT, 1 identical error(s) suppressed)
%3|1734378011.472|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092/0: Failed to resolve 'sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 9ms in state CONNECT, 8 identical error(s) suppressed)
%3|1734378137.854|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092/0: Failed to resolve 'sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 9ms in state CONNECT, 2 identical error(s) suppressed)
%5|1734378141.394|PARTCNT|rdkafka#consumer-1| [thrd:main]: Topic metrics-subscription-results partition count changed from 1 to 0
20:57:29 [INFO] arroyo.processing.processor: Partitions to revoke: [Partition(topic=Topic(name='metrics-subscription-results'), index=0)]
20:57:29 [INFO] arroyo.processing.processor: Closing <arroyo.processing.strategies.run_task_with_multiprocessing.RunTaskWithMultiprocessing object at 0x7c5c040abd50>...
20:57:29 [INFO] arroyo.processing.processor: Waiting for <arroyo.processing.strategies.run_task_with_multiprocessing.RunTaskWithMultiprocessing object at 0x7c5c040abd50> to exit...
20:57:29 [INFO] arroyo.processing.processor: <arroyo.processing.strategies.run_task_with_multiprocessing.RunTaskWithMultiprocessing object at 0x7c5c040abd50> exited successfully, releasing assignment.
20:57:29 [INFO] arroyo.processing.processor: Partition revocation complete.
20:57:29 [INFO] arroyo.processing.processor: New partitions assigned: {Partition(topic=Topic(name='metrics-subscription-results'), index=0): 0}
%6|1734382662.221|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092/0: Disconnected (after 4521316ms in state UP)
%3|1734382662.224|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092/0: Connect to ipv4#10.0.198.45:9092 failed: Connection refused (after 2ms in state CONNECT)
%3|1734382662.419|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092/0: Connect to ipv4#10.0.198.45:9092 failed: Connection refused (after 0ms in state CONNECT, 1 identical error(s) suppressed)
%4|1734382662.739|FAIL|rdkafka#consumer-1| [thrd:GroupCoordinator]: GroupCoordinator: sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092: Disconnected (after 84822158ms in state UP)
%3|1734382662.741|FAIL|rdkafka#consumer-1| [thrd:GroupCoordinator]: GroupCoordinator: sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092: Connect to ipv4#10.0.198.33:9092 failed: Connection refused (after 1ms in state CONNECT)
%3|1734382662.741|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092/1: Connect to ipv4#10.0.198.33:9092 failed: Connection refused (after 1ms in state CONNECT)
%3|1734382662.983|FAIL|rdkafka#consumer-1| [thrd:GroupCoordinator]: GroupCoordinator: sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092: Connect to ipv4#10.0.198.33:9092 failed: Connection refused (after 0ms in state CONNECT, 1 identical error(s) suppressed)
%3|1734382662.984|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092/1: Connect to ipv4#10.0.198.33:9092 failed: Connection refused (after 0ms in state CONNECT, 1 identical error(s) suppressed)
%3|1734382664.144|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka:9092/bootstrap]: sentry-review-kafka:9092/bootstrap: Connect to ipv4#10.3.254.245:9092 failed: Connection refused (after 1ms in state CONNECT)
%3|1734382664.381|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka:9092/bootstrap]: sentry-review-kafka:9092/bootstrap: Connect to ipv4#10.3.254.245:9092 failed: Connection refused (after 0ms in state CONNECT, 1 identical error(s) suppressed)
%3|1734382669.648|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka:9092/bootstrap]: sentry-review-kafka:9092/bootstrap: Connect to ipv4#10.3.254.245:9092 failed: No route to host (after 3ms in state CONNECT)
%3|1734382672.490|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092/1: Connect to ipv4#10.0.198.33:9092 failed: No route to host (after 3108ms in state CONNECT)
%3|1734382676.395|FAIL|rdkafka#consumer-1| [thrd:GroupCoordinator]: GroupCoordinator: sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092: Failed to resolve 'sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 7ms in state CONNECT)
%4|1734382676.395|METADATA|rdkafka#consumer-1| [thrd:main]: GroupCoordinator/1: Metadata request failed: broker down: Local: Host resolution failure (0ms): Permanent
%4|1734382676.395|METADATA|rdkafka#consumer-1| [thrd:main]: GroupCoordinator/1: Metadata request failed: refresh unavailable topics: Local: Host resolution failure (0ms): Permanent
%3|1734382677.180|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092/0: Failed to resolve 'sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 7ms in state CONNECT)
%4|1734382688.651|SESSTMOUT|rdkafka#consumer-1| [thrd:main]: Consumer group session timed out (in join-state steady) after 30264 ms without a successful response from the group coordinator (broker 1, last error was Local: Broker transport failure): revoking assignment and rejoining group
20:58:08 [INFO] arroyo.processing.processor: Partitions to revoke: [Partition(topic=Topic(name='metrics-subscription-results'), index=0)]
20:58:08 [INFO] arroyo.processing.processor: Closing <arroyo.processing.strategies.run_task_with_multiprocessing.RunTaskWithMultiprocessing object at 0x7c5c040aa690>...
20:58:08 [INFO] arroyo.processing.processor: Waiting for <arroyo.processing.strategies.run_task_with_multiprocessing.RunTaskWithMultiprocessing object at 0x7c5c040aa690> to exit...
20:58:08 [INFO] arroyo.processing.processor: <arroyo.processing.strategies.run_task_with_multiprocessing.RunTaskWithMultiprocessing object at 0x7c5c040aa690> exited successfully, releasing assignment.
20:58:08 [INFO] arroyo.processing.processor: Partition revocation complete.
%3|1734382695.558|FAIL|rdkafka#consumer-1| [thrd:GroupCoordinator]: GroupCoordinator: sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092: Failed to resolve 'sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 10ms in state CONNECT, 2 identical error(s) suppressed)
%3|1734382697.178|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092/0: Failed to resolve 'sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 6ms in state CONNECT, 2 identical error(s) suppressed)
%4|1734382701.670|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka:9092/bootstrap]: sentry-review-kafka:9092/bootstrap: Connection setup timed out in state CONNECT (after 30026ms in state CONNECT)
%3|1734382702.151|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka:9092/bootstrap]: sentry-review-kafka:9092/bootstrap: Connect to ipv4#10.3.254.245:9092 failed: Connection refused (after 0ms in state CONNECT)
%3|1734382702.652|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka:9092/bootstrap]: sentry-review-kafka:9092/bootstrap: Connect to ipv4#10.3.254.245:9092 failed: Connection refused (after 0ms in state CONNECT, 1 identical error(s) suppressed)
%4|1734382704.174|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092/1: Connection setup timed out in state CONNECT (after 30030ms in state CONNECT)
%3|1734382704.399|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092/1: Failed to resolve 'sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 5ms in state CONNECT)
%3|1734382705.658|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092/1: Failed to resolve 'sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 6ms in state CONNECT, 1 identical error(s) suppressed)
%3|1734382728.157|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092/0: Failed to resolve 'sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 5ms in state CONNECT, 23 identical error(s) suppressed)
%3|1734382733.152|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka:9092/bootstrap]: sentry-review-kafka:9092/bootstrap: Connect to ipv4#10.3.254.245:9092 failed: Connection refused (after 0ms in state CONNECT, 34 identical error(s) suppressed)
%3|1734382735.551|FAIL|rdkafka#consumer-1| [thrd:GroupCoordinator]: GroupCoordinator: sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092: Failed to resolve 'sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 6ms in state CONNECT, 4 identical error(s) suppressed)
%3|1734382736.404|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092/1: Failed to resolve 'sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 5ms in state CONNECT, 22 identical error(s) suppressed)
%3|1734382758.158|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092/0: Failed to resolve 'sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 6ms in state CONNECT, 21 identical error(s) suppressed)
%3|1734382763.687|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka:9092/bootstrap]: sentry-review-kafka:9092/bootstrap: Connect to ipv4#10.3.254.245:9092 failed: Connection refused (after 1035ms in state CONNECT, 30 identical error(s) suppressed)
%3|1734382768.159|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092/1: Failed to resolve 'sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 7ms in state CONNECT, 22 identical error(s) suppressed)
%3|1734382772.560|FAIL|rdkafka#consumer-1| [thrd:GroupCoordinator]: GroupCoordinator: sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092: Failed to resolve 'sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 5ms in state CONNECT, 4 identical error(s) suppressed)
%3|1734382788.659|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092/0: Failed to resolve 'sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 6ms in state CONNECT, 22 identical error(s) suppressed)
%3|1734382793.703|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka:9092/bootstrap]: sentry-review-kafka:9092/bootstrap: Connect to ipv4#10.3.254.245:9092 failed: Connection refused (after 1049ms in state CONNECT, 30 identical error(s) suppressed)
%3|1734382799.168|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092/1: Failed to resolve 'sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 15ms in state CONNECT, 22 identical error(s) suppressed)
%3|1734382810.689|FAIL|rdkafka#consumer-1| [thrd:GroupCoordinator]: GroupCoordinator: sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092: Failed to resolve 'sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 8ms in state CONNECT, 4 identical error(s) suppressed)
%3|1734382819.163|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092/0: Failed to resolve 'sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 9ms in state CONNECT, 21 identical error(s) suppressed)
%3|1734382824.154|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka:9092/bootstrap]: sentry-review-kafka:9092/bootstrap: Connect to ipv4#10.3.254.245:9092 failed: Connection refused (after 0ms in state CONNECT, 31 identical error(s) suppressed)
%3|1734382829.416|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092/1: Failed to resolve 'sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 5ms in state CONNECT, 21 identical error(s) suppressed)
%3|1734382854.151|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092/0: Failed to resolve 'sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 10ms in state CONNECT, 10 identical error(s) suppressed)
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/arroyo/processing/processor.py", line 319, in run
    self._run_once()
  File "/usr/local/lib/python3.11/site-packages/arroyo/processing/processor.py", line 381, in _run_once
    self.__message = self.__consumer.poll(timeout=1.0)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/arroyo/backends/kafka/consumer.py", line 394, in poll
    message: Optional[ConfluentMessage] = self.__consumer.poll(
                                          ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/arroyo/backends/kafka/consumer.py", line 282, in assignment_callback
    self.__resolve_partition_starting_offset(partition)
  File "/usr/local/lib/python3.11/site-packages/arroyo/backends/kafka/consumer.py", line 233, in __resolve_partition_offset_latest
    low, high = self.__consumer.get_watermark_offsets(partition)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
cimpl.KafkaException: KafkaError{code=_BAD_MSG,val=-199,str="Failed to get watermark offsets: Local: Bad message format"}
21:00:54 [ERROR] arroyo.processing.processor: Caught exception, shutting down...
21:00:54 [INFO] arroyo.processing.processor: Closing <arroyo.backends.kafka.consumer.KafkaConsumer object at 0x7c5c04060790>...
21:00:54 [INFO] arroyo.processing.processor: Partitions to revoke: [Partition(topic=Topic(name='metrics-subscription-results'), index=0)]
21:00:54 [INFO] arroyo.processing.processor: Partition revocation complete.
21:00:54 [WARNING] arroyo.backends.kafka.consumer: failed to delete offset for unknown partition: Partition(topic=Topic(name='metrics-subscription-results'), index=0)
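
The librdkafka lines above carry epoch timestamps, while the arroyo lines use wall-clock times, which makes the two hard to line up by eye. A minimal sketch for converting them, assuming the pod clocks are in UTC and that the lines follow the `%level|epoch|facility|client| message` layout seen in this log; the `parse_rdkafka_line` helper is hypothetical and not part of Sentry or arroyo.

```python
import re
from datetime import datetime, timezone

# Layout as it appears in the log above:
#   %<syslog level>|<epoch seconds.millis>|<facility>|<client>| <message>
RDKAFKA_LINE = re.compile(
    r"^%(?P<level>\d)\|(?P<epoch>\d+\.\d+)\|(?P<facility>[^|]+)\|(?P<client>[^|]+)\| (?P<message>.*)$"
)


def parse_rdkafka_line(line):
    """Return the fields of a librdkafka log line, or None if the line has another format."""
    match = RDKAFKA_LINE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["level"] = int(fields["level"])
    # Convert the epoch timestamp to an aware UTC datetime for comparison with arroyo lines.
    fields["time_utc"] = datetime.fromtimestamp(float(fields["epoch"]), tz=timezone.utc)
    return fields


sample = (
    "%6|1734382662.221|FAIL|rdkafka#consumer-1| "
    "[thrd:sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.]: "
    "sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092/0: "
    "Disconnected (after 4521316ms in state UP)"
)
parsed = parse_rdkafka_line(sample)
print(parsed["time_utc"].strftime("%Y-%m-%d %H:%M:%S"), parsed["facility"], parsed["level"])
# prints: 2024-12-16 20:57:42 FAIL 6
```

Under that UTC assumption, the disconnect at 1734382662 falls between the arroyo revocations logged at 20:57:29 and 20:58:08.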

Similarly, for Redis we see:

2024-12-16 20:58:16,207 Initializing Snuba...
2024-12-16 20:58:21,177 Snuba initialization took 4.9323599310000645s
%3|1734382701.228|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka:9092/bootstrap]: sentry-review-kafka:9092/bootstrap: Connect to ipv4#10.3.254.245:9092 failed: Connection refused (after 0ms in state CONNECT)
%3|1734382702.227|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka:9092/bootstrap]: sentry-review-kafka:9092/bootstrap: Connect to ipv4#10.3.254.245:9092 failed: Connection refused (after 0ms in state CONNECT, 1 identical error(s) suppressed)
%3|1734382732.229|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka:9092/bootstrap]: sentry-review-kafka:9092/bootstrap: Connect to ipv4#10.3.254.245:9092 failed: Connection refused (after 0ms in state CONNECT, 30 identical error(s) suppressed)
%3|1734382762.384|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka:9092/bootstrap]: sentry-review-kafka:9092/bootstrap: Connect to ipv4#10.3.254.245:9092 failed: Connection refused (after 0ms in state CONNECT, 30 identical error(s) suppressed)
%3|1734382792.492|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka:9092/bootstrap]: sentry-review-kafka:9092/bootstrap: Connect to ipv4#10.3.254.245:9092 failed: Connection refused (after 66ms in state CONNECT, 30 identical error(s) suppressed)
%3|1734382822.501|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka:9092/bootstrap]: sentry-review-kafka:9092/bootstrap: Connect to ipv4#10.3.254.245:9092 failed: Connection refused (after 0ms in state CONNECT, 31 identical error(s) suppressed)
2024-12-16 21:00:38,582 New partitions assigned: {Partition(topic=Topic(name='event-replacements'), index=0): 64}
2024-12-16 21:21:37,443 Error 111 connecting to sentry-review-sentry-redis-master:6379. Connection refused.
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/redis/connection.py", line 611, in connect
    sock = self.retry.call_with_retry(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/redis/retry.py", line 46, in call_with_retry
    return do()
           ^^^^
  File "/usr/local/lib/python3.11/site-packages/redis/connection.py", line 612, in <lambda>
    lambda: self._connect(), lambda error: self.disconnect(error)
            ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/redis/connection.py", line 677, in _connect
    raise err
  File "/usr/local/lib/python3.11/site-packages/redis/connection.py", line 665, in _connect
    sock.connect(socket_address)
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/src/snuba/snuba/state/__init__.py", line 216, in get_raw_configs
    all_configs = rds.hgetall(config_key)
                  ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/redis/commands/core.py", line 4776, in hgetall
    return self.execute_command("HGETALL", name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/sentry_redis_tools/failover_redis.py", line 28, in wrapper
    return get_wrapped_fn()(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/sentry_sdk/integrations/redis/__init__.py", line 235, in sentry_patched_execute_command
    return old_execute_command(self, name, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/redis/client.py", line 1235, in execute_command
    conn = self.connection or pool.get_connection(command_name, **options)
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/redis/connection.py", line 1387, in get_connection
    connection.connect()
  File "/usr/local/lib/python3.11/site-packages/redis/connection.py", line 617, in connect
    raise ConnectionError(self._error_message(e))
redis.exceptions.ConnectionError: Error 111 connecting to sentry-review-sentry-redis-master:6379. Connection refused.
2024-12-16 21:22:23,268 Error 111 connecting to sentry-review-sentry-redis-master:6379. Connection refused.
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/redis/connection.py", line 611, in connect
    sock = self.retry.call_with_retry(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/redis/retry.py", line 46, in call_with_retry
    return do()
           ^^^^
  File "/usr/local/lib/python3.11/site-packages/redis/connection.py", line 612, in <lambda>
    lambda: self._connect(), lambda error: self.disconnect(error)
            ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/redis/connection.py", line 677, in _connect
    raise err
  File "/usr/local/lib/python3.11/site-packages/redis/connection.py", line 665, in _connect
    sock.connect(socket_address)
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/src/snuba/snuba/state/__init__.py", line 216, in get_raw_configs
    all_configs = rds.hgetall(config_key)
                  ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/redis/commands/core.py", line 4776, in hgetall
    return self.execute_command("HGETALL", name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/sentry_redis_tools/failover_redis.py", line 28, in wrapper
    return get_wrapped_fn()(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/sentry_sdk/integrations/redis/__init__.py", line 235, in sentry_patched_execute_command
    return old_execute_command(self, name, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/redis/client.py", line 1235, in execute_command
    conn = self.connection or pool.get_connection(command_name, **options)
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/redis/connection.py", line 1387, in get_connection
    connection.connect()
  File "/usr/local/lib/python3.11/site-packages/redis/connection.py", line 617, in connect
    raise ConnectionError(self._error_message(e))
redis.exceptions.ConnectionError: Error 111 connecting to sentry-review-sentry-redis-master:6379. Connection refused.
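
The logs above show three distinct failure classes: connection refused, no route to host, and name resolution failure. A minimal sketch, assuming Python is available inside an affected pod, that classifies a TCP connection attempt in the same terms. The `probe` and `report` helpers are hypothetical, and the `ENDPOINTS` list is built only from hostnames that appear in the logs.

```python
import errno
import socket

# Hostnames and ports copied from the log lines above.
ENDPOINTS = [
    ("sentry-review-kafka", 9092),
    ("sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local", 9092),
    ("sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local", 9092),
    ("sentry-review-sentry-redis-master", 6379),
]


def probe(host, port, timeout=3.0):
    """Try one TCP connection and return a short label describing the outcome."""
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        # Corresponds to "Name or service not known" in the log.
        return f"dns-failure: {exc}"
    family, socktype, proto, _canonname, sockaddr = infos[0]
    with socket.socket(family, socktype, proto) as sock:
        sock.settimeout(timeout)
        try:
            sock.connect(sockaddr)
        except socket.timeout:
            return "timeout"
        except ConnectionRefusedError:
            # Errno 111: the address is reachable but nothing is listening on the port.
            return "refused"
        except OSError as exc:
            if exc.errno == errno.EHOSTUNREACH:
                return "no-route"
            return f"error: {exc}"
    return "ok"


def report(endpoints=ENDPOINTS):
    """Print one line per endpoint with the probe result."""
    for host, port in endpoints:
        print(f"{host}:{port} -> {probe(host, port)}")
```

Calling `report()` from a pod while the errors are occurring would indicate whether the broker and Redis pods are restarting (refused, then DNS failure for the headless-service names) or whether the network path itself is affected.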

Expected behavior

Sentry pods should automatically retry the connection and reconnect once Kafka and Redis are reachable again.
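
The retry behavior described here can be illustrated with a generic sketch of capped exponential backoff with jitter. This is an assumption about what "automatically retry" could look like, not how Sentry, arroyo, or redis-py implement it; `call_with_backoff` and `flaky_connect` are hypothetical names.

```python
import random
import time


def call_with_backoff(connect, attempts=5, base_delay=0.5, max_delay=30.0,
                      retry_on=(OSError,), sleep=time.sleep):
    """Call connect() until it succeeds, backing off exponentially between failures."""
    for attempt in range(attempts):
        try:
            return connect()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            # Double the delay each attempt, cap it, and add jitter so many
            # pods do not all reconnect at the same instant.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay * random.uniform(0.5, 1.0))


# Simulated dependency that refuses the first two connections.
calls = {"n": 0}


def flaky_connect():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionRefusedError(111, "Connection refused")
    return "connected"


delays = []
result = call_with_backoff(flaky_connect, sleep=delays.append)
print(result, calls["n"], len(delays))
# prints: connected 3 2
```

The `sleep` parameter is injected only so the example runs instantly; in real use the default `time.sleep` applies.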

values.yaml

hooks:
  activeDeadlineSeconds: 600
  dbCheck:
    nodeSelector:
      env: production
  dbInit:
    nodeSelector:
      env: production
  snubaInit:
    nodeSelector:
      env: production
  snubaMigrate:
    nodeSelector:
      env: production

snuba:
  api:
    nodeSelector:
      env: production
    resources:
      limits:
        memory: 512Mi

  consumer:
    nodeSelector:
      env: production
    resources:
      limits:
        memory: 512Mi

  outcomesConsumer:
    nodeSelector:
      env: production
    resources:
      limits:
        memory: 512Mi

  replacer:
    nodeSelector:
      env: production
    resources:
      limits:
        memory: 512Mi

  subscriptionConsumerEvents:
    nodeSelector:
      env: production
    resources:
      limits:
        memory: 512Mi

  subscriptionConsumerTransactions:
    nodeSelector:
      env: production
    resources:
      limits:
        memory: 512Mi

  sessionsConsumer:
    nodeSelector:
      env: production
    resources:
      limits:
        memory: 512Mi

  transactionsConsumer:
    nodeSelector:
      env: production
    resources:
      limits:
        memory: 512Mi
        cpu: 500m

  replaysConsumer:
    nodeSelector:
      env: production
    resources:
      limits:
        memory: 512Mi

  subscriptionConsumerMetrics:
    nodeSelector:
      env: production
    resources:
      limits:
        memory: 800Mi
        cpu: 500m

  subscriptionConsumerSessions:
    nodeSelector:
      env: production
    resources:
      limits:
        memory: 800Mi
        cpu: 400m

  dbInitJob:
    env: []

  migrateJob:
    env: []

  cleanupErrors:
    activeDeadlineSeconds: 600

  cleanupTransactions:
    activeDeadlineSeconds: 600

relay:
  resources:
    limits:
      memory: 2Gi
      cpu: 500m

sentry:
  web:
    nodeSelector:
      env: production
    resources:
      limits:
        memory: 2Gi

  cron:
    nodeSelector:
      env: production

  worker:
    nodeSelector:
      env: production
    resources:
      limits:
        memory: 1800Mi

  postProcessForward:
    nodeSelector:
      env: production
    resources:
      limits:
        memory: 512Mi

  ingestConsumer:
    nodeSelector:
      env: production
    resources:
      limits:
        memory: 612Mi
        cpu: 400m

  ingestMetricsConsumerPerf:
    nodeSelector:
      env: production
    resources:
      limits:
        memory: 800Mi

  ingestMetricsConsumerRh:
    nodeSelector:
      env: production
    resources:
      limits:
        memory: 800Mi

  billingMetricsConsumer:
    nodeSelector:
      env: production

  ingestReplayRecordings:
    nodeSelector:
      env: production
    resources:
      limits:
        memory: 512Mi

  metricsConsumer:
    nodeSelector:
      env: production
    resources:
      limits:
        memory: 512Mi

  genericMetricsConsumer:
    nodeSelector:
      env: production
    resources:
      limits:
        memory: 800Mi

postgresql:
  master:
    nodeSelector:
      env: production
    persistence:
      size: 48Gi
    resources:
      requests:
        memory: 7680Mi
        cpu: 250m

redis:
  master:
    nodeSelector:
      env: production
    resources:
      requests:
        memory: 8Gi
      limits:
        memory: 10Gi
    persistence:
      size: 150Gi
    extraFlags:
      - "--maxmemory-policy volatile-ttl"
      - "--maxmemory 10G"
  replica:
    nodeSelector:
      env: production
    resources:
      requests:
        memory: 8Gi
      limits:
        memory: 10Gi
    persistence:
      size: 20Gi

ingress:
  ingressClassName: ingress-production
  annotations:
    kci/setupdns: "true"
    nginx.ingress.kubernetes.io/client-body-buffer-size: 10M
    nginx.ingress.kubernetes.io/proxy-body-size: 10M
    nginx.ingress.kubernetes.io/use-regex: "true"
    nginx.ingress.kubernetes.io/proxy-buffer-size: "32k"
    nginx.ingress.kubernetes.io/proxy-buffers-number: "16 16k"
  hostname: sentry.devops.kci.rocks
  tls:
    - secretName: wildcard.*****
      hosts:
        - sentry.********

metrics:
  enabled: true
  nodeSelector:
    env: production

clickhouse:
  clickhouse:
    nodeSelector:
      env: production
    persistentVolumeClaim:
      enabled: true
      dataPersistentVolume:
        enabled: true
        accessModes:
          - "ReadWriteOnce"
        storage: "80Gi"

filestore:
  gcs:
    secretName: sentry-production-gcs
    bucketName: devops-sentry-production
    credentialsFile: credentials.json

# storage class for redis/postgres (no mount issues, related statefulsets)

global:
  nodeSelector:
    env: production

kafka:
  provisioning:
    enabled: true
    topics:
      - name: ingest-replay-recordings
  persistence:
    size: 12Gi
  nodeSelector:
    env: production
  zookeeper:
    nodeSelector:
      env: production
  replicaCount: 3
  defaultReplicationFactor: 3
  offsetsTopicReplicationFactor: 3
  transactionStateLogReplicationFactor: 3
  transactionStateLogMinIsr: 3

zookeeper:
  persistence:
    size: 12Gi

memcached:
  extraEnvVarsCM: "sentry-production-memcached"
sourcemaps:
  enabled: false
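One detail in the kafka block above may be worth checking, although it would not by itself explain the "Connection refused" errors: `transactionStateLogReplicationFactor` and `transactionStateLogMinIsr` are both 3. Assuming these chart values map to Kafka's `transaction.state.log.replication.factor` and `transaction.state.log.min.isr` broker settings (not verified for this chart version), the number of brokers that can be down while such a topic still accepts `acks=all` writes is the replication factor minus the minimum in-sync replica count. The helper name below is hypothetical.

```python
def tolerated_broker_failures(replication_factor: int, min_isr: int) -> int:
    """Brokers that may be offline while acks=all writes are still accepted."""
    if min_isr > replication_factor:
        raise ValueError("min ISR cannot exceed the replication factor")
    return replication_factor - min_isr


# Values from the kafka block above: no broker may be down.
print(tolerated_broker_failures(3, 3))
# A min ISR of 2 would tolerate one broker restart at a time.
print(tolerated_broker_failures(3, 2))
```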

Helm chart version

21.6.3

Steps to reproduce

Deploy the app with this chart version; the connection is intermittently lost. I am not an expert with Sentry and would be glad if someone could help me here.

Screenshots

No response

Logs

No response

Additional context

2024-12-16 20:58:16,207 Initializing Snuba...
2024-12-16 20:58:21,177 Snuba initialization took 4.9323599310000645s
%3|1734382701.228|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka:9092/bootstrap]: sentry-review-kafka:9092/bootstrap: Connect to ipv4#10.3.254.245:9092 failed: Connection refused (after 0ms in state CONNECT)
%3|1734382702.227|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka:9092/bootstrap]: sentry-review-kafka:9092/bootstrap: Connect to ipv4#10.3.254.245:9092 failed: Connection refused (after 0ms in state CONNECT, 1 identical error(s) suppressed)
%3|1734382732.229|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka:9092/bootstrap]: sentry-review-kafka:9092/bootstrap: Connect to ipv4#10.3.254.245:9092 failed: Connection refused (after 0ms in state CONNECT, 30 identical error(s) suppressed)
%3|1734382762.384|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka:9092/bootstrap]: sentry-review-kafka:9092/bootstrap: Connect to ipv4#10.3.254.245:9092 failed: Connection refused (after 0ms in state CONNECT, 30 identical error(s) suppressed)
%3|1734382792.492|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka:9092/bootstrap]: sentry-review-kafka:9092/bootstrap: Connect to ipv4#10.3.254.245:9092 failed: Connection refused (after 66ms in state CONNECT, 30 identical error(s) suppressed)
%3|1734382822.501|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka:9092/bootstrap]: sentry-review-kafka:9092/bootstrap: Connect to ipv4#10.3.254.245:9092 failed: Connection refused (after 0ms in state CONNECT, 31 identical error(s) suppressed)
2024-12-16 21:00:38,582 New partitions assigned: {Partition(topic=Topic(name='event-replacements'), index=0): 64}
2024-12-16 21:21:37,443 Error 111 connecting to sentry-review-sentry-redis-master:6379. Connection refused.
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/redis/connection.py", line 611, in connect
    sock = self.retry.call_with_retry(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/redis/retry.py", line 46, in call_with_retry
    return do()
           ^^^^
  File "/usr/local/lib/python3.11/site-packages/redis/connection.py", line 612, in <lambda>
    lambda: self._connect(), lambda error: self.disconnect(error)
            ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/redis/connection.py", line 677, in _connect
    raise err
  File "/usr/local/lib/python3.11/site-packages/redis/connection.py", line 665, in _connect
    sock.connect(socket_address)
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/src/snuba/snuba/state/__init__.py", line 216, in get_raw_configs
    all_configs = rds.hgetall(config_key)
                  ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/redis/commands/core.py", line 4776, in hgetall
    return self.execute_command("HGETALL", name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/sentry_redis_tools/failover_redis.py", line 28, in wrapper
    return get_wrapped_fn()(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/sentry_sdk/integrations/redis/__init__.py", line 235, in sentry_patched_execute_command
    return old_execute_command(self, name, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/redis/client.py", line 1235, in execute_command
    conn = self.connection or pool.get_connection(command_name, **options)
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/redis/connection.py", line 1387, in get_connection
    connection.connect()
  File "/usr/local/lib/python3.11/site-packages/redis/connection.py", line 617, in connect
    raise ConnectionError(self._error_message(e))
redis.exceptions.ConnectionError: Error 111 connecting to sentry-review-sentry-redis-master:6379. Connection refused.
2024-12-16 21:22:23,268 Error 111 connecting to sentry-review-sentry-redis-master:6379. Connection refused.
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/redis/connection.py", line 611, in connect
    sock = self.retry.call_with_retry(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/redis/retry.py", line 46, in call_with_retry
    return do()
           ^^^^
  File "/usr/local/lib/python3.11/site-packages/redis/connection.py", line 612, in <lambda>
    lambda: self._connect(), lambda error: self.disconnect(error)
            ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/redis/connection.py", line 677, in _connect
    raise err
  File "/usr/local/lib/python3.11/site-packages/redis/connection.py", line 665, in _connect
    sock.connect(socket_address)
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/src/snuba/snuba/state/__init__.py", line 216, in get_raw_configs
    all_configs = rds.hgetall(config_key)
                  ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/redis/commands/core.py", line 4776, in hgetall
    return self.execute_command("HGETALL", name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/sentry_redis_tools/failover_redis.py", line 28, in wrapper
    return get_wrapped_fn()(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/sentry_sdk/integrations/redis/__init__.py", line 235, in sentry_patched_execute_command
    return old_execute_command(self, name, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/redis/client.py", line 1235, in execute_command
    conn = self.connection or pool.get_connection(command_name, **options)
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/redis/connection.py", line 1387, in get_connection
    connection.connect()
  File "/usr/local/lib/python3.11/site-packages/redis/connection.py", line 617, in connect
    raise ConnectionError(self._error_message(e))
redis.exceptions.ConnectionError: Error 111 connecting to sentry-review-sentry-redis-master:6379. Connection refused.

========

21:01:41 [INFO] arroyo.processing.processor: Waiting for <arroyo.processing.strategies.run_task_with_multiprocessing.RunTaskWithMultiprocessing object at 0x7c5c05733650> to exit...
21:01:41 [INFO] arroyo.processing.processor: <arroyo.processing.strategies.run_task_with_multiprocessing.RunTaskWithMultiprocessing object at 0x7c5c05733650> exited successfully, releasing assignment.
21:01:41 [INFO] arroyo.processing.processor: Partition revocation complete.
21:01:42 [INFO] arroyo.processing.processor: New partitions assigned: {Partition(topic=Topic(name='metrics-subscription-results'), index=0): 0}
%6|1734297531.447|FAIL|rdkafka#consumer-1| [thrd:GroupCoordinator]: GroupCoordinator: sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092: Disconnected (after 1043223ms in state UP)
%3|1734297531.449|FAIL|rdkafka#consumer-1| [thrd:GroupCoordinator]: GroupCoordinator: sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092: Connect to ipv4#10.0.194.37:9092 failed: Connection refused (after 1ms in state CONNECT)
21:19:24 [INFO] arroyo.processing.processor: Partitions to revoke: [Partition(topic=Topic(name='metrics-subscription-results'), index=0)]
21:19:24 [INFO] arroyo.processing.processor: Closing <arroyo.processing.strategies.run_task_with_multiprocessing.RunTaskWithMultiprocessing object at 0x7c5c040abd50>...
21:19:24 [INFO] arroyo.processing.processor: Waiting for <arroyo.processing.strategies.run_task_with_multiprocessing.RunTaskWithMultiprocessing object at 0x7c5c040abd50> to exit...
21:19:24 [INFO] arroyo.processing.processor: <arroyo.processing.strategies.run_task_with_multiprocessing.RunTaskWithMultiprocessing object at 0x7c5c040abd50> exited successfully, releasing assignment.
21:19:24 [INFO] arroyo.processing.processor: Partition revocation complete.
21:19:24 [INFO] arroyo.processing.processor: New partitions assigned: {Partition(topic=Topic(name='metrics-subscription-results'), index=0): 0}
21:21:33 [INFO] arroyo.processing.processor: Partitions to revoke: [Partition(topic=Topic(name='metrics-subscription-results'), index=0)]
21:21:33 [INFO] arroyo.processing.processor: Closing <arroyo.processing.strategies.run_task_with_multiprocessing.RunTaskWithMultiprocessing object at 0x7c5c040ab390>...
21:21:33 [INFO] arroyo.processing.processor: Waiting for <arroyo.processing.strategies.run_task_with_multiprocessing.RunTaskWithMultiprocessing object at 0x7c5c040ab390> to exit...
21:21:33 [INFO] arroyo.processing.processor: <arroyo.processing.strategies.run_task_with_multiprocessing.RunTaskWithMultiprocessing object at 0x7c5c040ab390> exited successfully, releasing assignment.
21:21:33 [INFO] arroyo.processing.processor: Partition revocation complete.
21:21:33 [INFO] arroyo.processing.processor: New partitions assigned: {Partition(topic=Topic(name='metrics-subscription-results'), index=0): 0}
%6|1734377921.254|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092/0: Disconnected (after 81431473ms in state UP)
%3|1734377921.256|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092/0: Connect to ipv4#10.0.196.45:9092 failed: Connection refused (after 2ms in state CONNECT)
%3|1734377921.460|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092/0: Connect to ipv4#10.0.196.45:9092 failed: Connection refused (after 0ms in state CONNECT, 1 identical error(s) suppressed)
%3|1734377922.334|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092/0: Connect to ipv4#10.0.196.45:9092 failed: No route to host (after 3ms in state CONNECT)
%4|1734377974.420|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092/0: Connection setup timed out in state CONNECT (after 30031ms in state CONNECT)
%3|1734377974.429|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092/0: Failed to resolve 'sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 8ms in state CONNECT)
%3|1734377975.435|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092/0: Failed to resolve 'sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 5ms in state CONNECT, 1 identical error(s) suppressed)
%3|1734378011.472|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092/0: Failed to resolve 'sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 9ms in state CONNECT, 8 identical error(s) suppressed)
%3|1734378137.854|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092/0: Failed to resolve 'sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 9ms in state CONNECT, 2 identical error(s) suppressed)
%5|1734378141.394|PARTCNT|rdkafka#consumer-1| [thrd:main]: Topic metrics-subscription-results partition count changed from 1 to 0
20:57:29 [INFO] arroyo.processing.processor: Partitions to revoke: [Partition(topic=Topic(name='metrics-subscription-results'), index=0)]
20:57:29 [INFO] arroyo.processing.processor: Closing <arroyo.processing.strategies.run_task_with_multiprocessing.RunTaskWithMultiprocessing object at 0x7c5c040abd50>...
20:57:29 [INFO] arroyo.processing.processor: Waiting for <arroyo.processing.strategies.run_task_with_multiprocessing.RunTaskWithMultiprocessing object at 0x7c5c040abd50> to exit...
20:57:29 [INFO] arroyo.processing.processor: <arroyo.processing.strategies.run_task_with_multiprocessing.RunTaskWithMultiprocessing object at 0x7c5c040abd50> exited successfully, releasing assignment.
20:57:29 [INFO] arroyo.processing.processor: Partition revocation complete.
20:57:29 [INFO] arroyo.processing.processor: New partitions assigned: {Partition(topic=Topic(name='metrics-subscription-results'), index=0): 0}
%6|1734382662.221|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092/0: Disconnected (after 4521316ms in state UP)
%3|1734382662.224|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092/0: Connect to ipv4#10.0.198.45:9092 failed: Connection refused (after 2ms in state CONNECT)
%3|1734382662.419|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092/0: Connect to ipv4#10.0.198.45:9092 failed: Connection refused (after 0ms in state CONNECT, 1 identical error(s) suppressed)
%4|1734382662.739|FAIL|rdkafka#consumer-1| [thrd:GroupCoordinator]: GroupCoordinator: sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092: Disconnected (after 84822158ms in state UP)
%3|1734382662.741|FAIL|rdkafka#consumer-1| [thrd:GroupCoordinator]: GroupCoordinator: sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092: Connect to ipv4#10.0.198.33:9092 failed: Connection refused (after 1ms in state CONNECT)
%3|1734382662.741|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092/1: Connect to ipv4#10.0.198.33:9092 failed: Connection refused (after 1ms in state CONNECT)
%3|1734382662.983|FAIL|rdkafka#consumer-1| [thrd:GroupCoordinator]: GroupCoordinator: sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092: Connect to ipv4#10.0.198.33:9092 failed: Connection refused (after 0ms in state CONNECT, 1 identical error(s) suppressed)
%3|1734382662.984|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092/1: Connect to ipv4#10.0.198.33:9092 failed: Connection refused (after 0ms in state CONNECT, 1 identical error(s) suppressed)
%3|1734382664.144|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka:9092/bootstrap]: sentry-review-kafka:9092/bootstrap: Connect to ipv4#10.3.254.245:9092 failed: Connection refused (after 1ms in state CONNECT)
%3|1734382664.381|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka:9092/bootstrap]: sentry-review-kafka:9092/bootstrap: Connect to ipv4#10.3.254.245:9092 failed: Connection refused (after 0ms in state CONNECT, 1 identical error(s) suppressed)
%3|1734382669.648|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka:9092/bootstrap]: sentry-review-kafka:9092/bootstrap: Connect to ipv4#10.3.254.245:9092 failed: No route to host (after 3ms in state CONNECT)
%3|1734382672.490|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092/1: Connect to ipv4#10.0.198.33:9092 failed: No route to host (after 3108ms in state CONNECT)
%3|1734382676.395|FAIL|rdkafka#consumer-1| [thrd:GroupCoordinator]: GroupCoordinator: sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092: Failed to resolve 'sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 7ms in state CONNECT)
%4|1734382676.395|METADATA|rdkafka#consumer-1| [thrd:main]: GroupCoordinator/1: Metadata request failed: broker down: Local: Host resolution failure (0ms): Permanent
%4|1734382676.395|METADATA|rdkafka#consumer-1| [thrd:main]: GroupCoordinator/1: Metadata request failed: refresh unavailable topics: Local: Host resolution failure (0ms): Permanent
%3|1734382677.180|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092/0: Failed to resolve 'sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 7ms in state CONNECT)
%4|1734382688.651|SESSTMOUT|rdkafka#consumer-1| [thrd:main]: Consumer group session timed out (in join-state steady) after 30264 ms without a successful response from the group coordinator (broker 1, last error was Local: Broker transport failure): revoking assignment and rejoining group
20:58:08 [INFO] arroyo.processing.processor: Partitions to revoke: [Partition(topic=Topic(name='metrics-subscription-results'), index=0)]
20:58:08 [INFO] arroyo.processing.processor: Closing <arroyo.processing.strategies.run_task_with_multiprocessing.RunTaskWithMultiprocessing object at 0x7c5c040aa690>...
20:58:08 [INFO] arroyo.processing.processor: Waiting for <arroyo.processing.strategies.run_task_with_multiprocessing.RunTaskWithMultiprocessing object at 0x7c5c040aa690> to exit...
20:58:08 [INFO] arroyo.processing.processor: <arroyo.processing.strategies.run_task_with_multiprocessing.RunTaskWithMultiprocessing object at 0x7c5c040aa690> exited successfully, releasing assignment.
20:58:08 [INFO] arroyo.processing.processor: Partition revocation complete.
%3|1734382695.558|FAIL|rdkafka#consumer-1| [thrd:GroupCoordinator]: GroupCoordinator: sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092: Failed to resolve 'sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 10ms in state CONNECT, 2 identical error(s) suppressed)
%3|1734382697.178|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092/0: Failed to resolve 'sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 6ms in state CONNECT, 2 identical error(s) suppressed)
%4|1734382701.670|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka:9092/bootstrap]: sentry-review-kafka:9092/bootstrap: Connection setup timed out in state CONNECT (after 30026ms in state CONNECT)
%3|1734382702.151|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka:9092/bootstrap]: sentry-review-kafka:9092/bootstrap: Connect to ipv4#10.3.254.245:9092 failed: Connection refused (after 0ms in state CONNECT)
%3|1734382702.652|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka:9092/bootstrap]: sentry-review-kafka:9092/bootstrap: Connect to ipv4#10.3.254.245:9092 failed: Connection refused (after 0ms in state CONNECT, 1 identical error(s) suppressed)
%4|1734382704.174|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092/1: Connection setup timed out in state CONNECT (after 30030ms in state CONNECT)
%3|1734382704.399|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092/1: Failed to resolve 'sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 5ms in state CONNECT)
%3|1734382705.658|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092/1: Failed to resolve 'sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 6ms in state CONNECT, 1 identical error(s) suppressed)
%3|1734382728.157|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092/0: Failed to resolve 'sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 5ms in state CONNECT, 23 identical error(s) suppressed)
%3|1734382733.152|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka:9092/bootstrap]: sentry-review-kafka:9092/bootstrap: Connect to ipv4#10.3.254.245:9092 failed: Connection refused (after 0ms in state CONNECT, 34 identical error(s) suppressed)
%3|1734382735.551|FAIL|rdkafka#consumer-1| [thrd:GroupCoordinator]: GroupCoordinator: sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092: Failed to resolve 'sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 6ms in state CONNECT, 4 identical error(s) suppressed)
%3|1734382736.404|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092/1: Failed to resolve 'sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 5ms in state CONNECT, 22 identical error(s) suppressed)
%3|1734382758.158|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092/0: Failed to resolve 'sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 6ms in state CONNECT, 21 identical error(s) suppressed)
%3|1734382763.687|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka:9092/bootstrap]: sentry-review-kafka:9092/bootstrap: Connect to ipv4#10.3.254.245:9092 failed: Connection refused (after 1035ms in state CONNECT, 30 identical error(s) suppressed)
%3|1734382768.159|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092/1: Failed to resolve 'sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 7ms in state CONNECT, 22 identical error(s) suppressed)
%3|1734382772.560|FAIL|rdkafka#consumer-1| [thrd:GroupCoordinator]: GroupCoordinator: sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092: Failed to resolve 'sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 5ms in state CONNECT, 4 identical error(s) suppressed)
%3|1734382788.659|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092/0: Failed to resolve 'sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 6ms in state CONNECT, 22 identical error(s) suppressed)
%3|1734382793.703|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka:9092/bootstrap]: sentry-review-kafka:9092/bootstrap: Connect to ipv4#10.3.254.245:9092 failed: Connection refused (after 1049ms in state CONNECT, 30 identical error(s) suppressed)
%3|1734382799.168|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092/1: Failed to resolve 'sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 15ms in state CONNECT, 22 identical error(s) suppressed)
%3|1734382810.689|FAIL|rdkafka#consumer-1| [thrd:GroupCoordinator]: GroupCoordinator: sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092: Failed to resolve 'sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 8ms in state CONNECT, 4 identical error(s) suppressed)
%3|1734382819.163|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092/0: Failed to resolve 'sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 9ms in state CONNECT, 21 identical error(s) suppressed)
%3|1734382824.154|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka:9092/bootstrap]: sentry-review-kafka:9092/bootstrap: Connect to ipv4#10.3.254.245:9092 failed: Connection refused (after 0ms in state CONNECT, 31 identical error(s) suppressed)
%3|1734382829.416|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092/1: Failed to resolve 'sentry-review-kafka-1.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 5ms in state CONNECT, 21 identical error(s) suppressed)
%3|1734382854.151|FAIL|rdkafka#consumer-1| [thrd:sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.]: sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092/0: Failed to resolve 'sentry-review-kafka-0.sentry-review-kafka-headless.staging.svc.cluster.local:9092': Name or service not known (after 10ms in state CONNECT, 10 identical error(s) suppressed)
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/arroyo/processing/processor.py", line 319, in run
    self._run_once()
  File "/usr/local/lib/python3.11/site-packages/arroyo/processing/processor.py", line 381, in _run_once
    self.__message = self.__consumer.poll(timeout=1.0)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/arroyo/backends/kafka/consumer.py", line 394, in poll
    message: Optional[ConfluentMessage] = self.__consumer.poll(
                                          ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/arroyo/backends/kafka/consumer.py", line 282, in assignment_callback
    self.__resolve_partition_starting_offset(partition)
  File "/usr/local/lib/python3.11/site-packages/arroyo/backends/kafka/consumer.py", line 233, in __resolve_partition_offset_latest
    low, high = self.__consumer.get_watermark_offsets(partition)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
cimpl.KafkaException: KafkaError{code=_BAD_MSG,val=-199,str="Failed to get watermark offsets: Local: Bad message format"}
21:00:54 [ERROR] arroyo.processing.processor: Caught exception, shutting down...
21:00:54 [INFO] arroyo.processing.processor: Closing <arroyo.backends.kafka.consumer.KafkaConsumer object at 0x7c5c04060790>...
21:00:54 [INFO] arroyo.processing.processor: Partitions to revoke: [Partition(topic=Topic(name='metrics-subscription-results'), index=0)]
21:00:54 [INFO] arroyo.processing.processor: Partition revocation complete.
21:00:54 [WARNING] arroyo.backends.kafka.consumer: failed to delete offset for unknown partition: Partition(topic=Topic(name='metrics-subscription-results'), index=0)
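To make the logs above easier to triage, the following Python sketch tallies the librdkafka `FAIL` lines per thread and failure reason, and converts the epoch timestamps to UTC so they can be lined up with the arroyo and Snuba timestamps. The regular expression is an assumption based only on the line layout seen in these logs, and `parse_fail_line` and `tally` are hypothetical helper names.

```python
import re
from collections import Counter
from datetime import datetime, timezone

# Layout assumed from the log lines above:
# %<level>|<epoch>|FAIL|<client>| [thrd:<thread>]: <message>
LINE_RE = re.compile(
    r"^%\d\|(?P<ts>\d+\.\d+)\|FAIL\|[^|]*\| \[thrd:(?P<thread>[^\]]*)\]: (?P<msg>.*)$"
)

# Failure reasons that appear in the logs above.
REASONS = (
    "Connection refused",
    "No route to host",
    "Failed to resolve",
    "Connection setup timed out",
    "Disconnected",
)


def parse_fail_line(line: str):
    """Return (utc_time, thread, reason) for a FAIL line, or None for other lines."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    msg = match.group("msg")
    reason = next((r for r in REASONS if r in msg), "other")
    when = datetime.fromtimestamp(float(match.group("ts")), tz=timezone.utc)
    return when, match.group("thread"), reason


def tally(lines):
    """Count FAIL lines per (thread, reason) pair, ignoring non-FAIL lines."""
    counts = Counter()
    for line in lines:
        parsed = parse_fail_line(line)
        if parsed is not None:
            _, thread, reason = parsed
            counts[(thread, reason)] += 1
    return counts
```

Feeding a pod's log lines to `tally` gives a quick view of whether the failures are mostly refused connections (broker process down) or name resolution failures (broker pod missing from the headless service).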
