Skip to content
This repository has been archived by the owner on Jan 24, 2024. It is now read-only.

Fix topic lookup failed exceptionally with ServiceUnitNotReady #1995

Conversation

BewareMyPower
Copy link
Collaborator

@BewareMyPower BewareMyPower commented Aug 9, 2023

Motivation

testMultiBrokerUnloadReload is flaky with the Pulsar dependency upgraded to 3.1.0-SNAPSHOT. Here are the related logs:

02:37:33.275 [pulsar-ph-kafka-644-3:io.streamnative.pulsar.handlers.kop.KafkaTopicLookupService@105] ERROR io.streamnative.pulsar.handlers.kop.KafkaTopicLookupService - [io.streamnative.pulsar.handlers.kop.storage.PartitionLog@131e450c] Failed t
o getTopic persistent://public/default/kopMultiBrokerUnloadReload10-partition-3. exception:
java.util.concurrent.CompletionException: org.apache.pulsar.broker.service.BrokerServiceException$ServiceUnitNotReadyException:

It's because the exception passed from BrokerService#getTopic is CompletionException, not ServiceUnitNotReadyException, while KoP does not get the cause of a CompletionException in handleGetTopicException. And then an UNKNOWN_SERVER_ERROR is returned to the client.

02:37:33.277 [pulsar-ph-kafka-644-3:io.streamnative.pulsar.handlers.kop.storage.ReplicaManager$PendingProduceCallback@121] DEBUG io.streamnative.pulsar.handlers.kop.storage.ReplicaManager - Complete handle appendRecords. {kopMultiBrokerUnloadReload10-3={error: UNKNOWN_SERVER_ERROR,offset: -1,logAppendTime: -1, logStartOffset: -1, recordErrors: [], errorMessage: null}}

Modifications

Unwrap the CompletionException in handleGetTopicException.

Documentation

Check the box below.

Need to update docs?

  • doc-required

    (If you need help on updating docs, create a doc issue)

  • no-need-doc

    (Please explain why)

  • doc

    (If this PR contains doc changes)

### Motivation

`testMultiBrokerUnloadReload` is flaky with the Pulsar dependency
upgraded to 3.1.0-SNAPSHOT. Here are the related logs:

```
02:37:33.275 [pulsar-ph-kafka-644-3:io.streamnative.pulsar.handlers.kop.KafkaTopicLookupService@105] ERROR io.streamnative.pulsar.handlers.kop.KafkaTopicLookupService - [io.streamnative.pulsar.handlers.kop.storage.PartitionLog@131e450c] Failed t
o getTopic persistent://public/default/kopMultiBrokerUnloadReload10-partition-3. exception:
java.util.concurrent.CompletionException: org.apache.pulsar.broker.service.BrokerServiceException$ServiceUnitNotReadyException:
```

It's because the exception passed from `BrokerService#getTopic` is
`CompletionException`, not `ServiceUnitNotReadyException`, while KoP
does not get the cause of a `CompletionException` in
`handleGetTopicException`. And then a `UNKNOWN_SERVER_ERROR` is returned
to the client.

```
02:37:33.277 [pulsar-ph-kafka-644-3:io.streamnative.pulsar.handlers.kop.storage.ReplicaManager$PendingProduceCallback@121] DEBUG io.streamnative.pulsar.handlers.kop.storage.ReplicaManager - Complete handle appendRecords. {kopMultiBrokerUnloadReload10-3={error: UNKNOWN_SERVER_ERROR,offset: -1,logAppendTime: -1, logStartOffset: -1, recordErrors: [], errorMessage: null}}
```

### Modifications

Unwrap the `CompletionException` in `handleGetTopicException`.
@BewareMyPower BewareMyPower self-assigned this Aug 9, 2023
@github-actions github-actions bot added the no-need-doc This pr does not need any document label Aug 9, 2023
@codecov
Copy link

codecov bot commented Aug 9, 2023

Codecov Report

Merging #1995 (300a8d5) into master (26980a3) will decrease coverage by 0.01%.
The diff coverage is 0.00%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master    #1995      +/-   ##
============================================
- Coverage     17.05%   17.05%   -0.01%     
  Complexity      728      728              
============================================
  Files           191      191              
  Lines         14235    14236       +1     
  Branches       1332     1333       +1     
============================================
  Hits           2428     2428              
- Misses        11631    11632       +1     
  Partials        176      176              
Files Changed Coverage Δ
...e/pulsar/handlers/kop/KafkaTopicLookupService.java 1.92% <0.00%> (-0.04%) ⬇️

@BewareMyPower BewareMyPower merged commit 844adb4 into streamnative:master Aug 9, 2023
20 of 22 checks passed
@BewareMyPower BewareMyPower deleted the bewaremypower/fix-lookup-exception branch August 9, 2023 13:37
Demogorgon314 pushed a commit to Demogorgon314/kop that referenced this pull request Oct 30, 2023
…mnative#1995)

### Motivation

`testMultiBrokerUnloadReload` is flaky with the Pulsar dependency
upgraded to 3.1.0-SNAPSHOT. Here are the related logs:

```
02:37:33.275 [pulsar-ph-kafka-644-3:io.streamnative.pulsar.handlers.kop.KafkaTopicLookupService@105] ERROR io.streamnative.pulsar.handlers.kop.KafkaTopicLookupService - [io.streamnative.pulsar.handlers.kop.storage.PartitionLog@131e450c] Failed t
o getTopic persistent://public/default/kopMultiBrokerUnloadReload10-partition-3. exception:
java.util.concurrent.CompletionException: org.apache.pulsar.broker.service.BrokerServiceException$ServiceUnitNotReadyException:
```

It's because the exception passed from `BrokerService#getTopic` is
`CompletionException`, not `ServiceUnitNotReadyException`, while KoP
does not get the cause of a `CompletionException` in
`handleGetTopicException`. And then an `UNKNOWN_SERVER_ERROR` is
returned to the client.

```
02:37:33.277 [pulsar-ph-kafka-644-3:io.streamnative.pulsar.handlers.kop.storage.ReplicaManager$PendingProduceCallback@121] DEBUG io.streamnative.pulsar.handlers.kop.storage.ReplicaManager - Complete handle appendRecords. {kopMultiBrokerUnloadReload10-3={error: UNKNOWN_SERVER_ERROR,offset: -1,logAppendTime: -1, logStartOffset: -1, recordErrors: [], errorMessage: null}}
```

### Modifications

Unwrap the `CompletionException` in `handleGetTopicException`.

(cherry picked from commit 844adb4)
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
no-need-doc This pr does not need any document type/bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants