Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-36296][Connectors/DynamoDB] Add support for incremental shard discovery #166

Closed
wants to merge 1 commit into from

Conversation

gguptp
Copy link
Contributor

@gguptp gguptp commented Sep 17, 2024

Purpose of the change

Adds incremental shard discovery for DDB Streams source. Since full describestream discovery can take a longer time, having incremental shard discovery ensures we have children shards earlier than completing the full discovery

Verifying this change

Please make sure both new and modified tests in this PR follows the conventions defined in our code quality guide: https://flink.apache.org/contributing/code-style-and-quality-common.html#testing

(Please pick either of the following options)

This change is a trivial rework / code cleanup without any test coverage.

(or)

This change is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

  • Added unit tests
  • Manually verified by running the Kinesis connector on a local Flink cluster.

Significant changes

(Please check any boxes [x] if the answer is "yes". You can first publish the PR and check them afterwards, for convenience.)

  • Dependencies have been added or upgraded
  • Public API has been changed (Public API is any class annotated with @Public(Evolving))
  • Serializers have been changed
  • New feature has been introduced
    • If yes, how is this documented? (not applicable / docs / JavaDocs / not documented)

Comment on lines +189 to +190
+ "This indicates that either there might be some data loss in the "
+ "application or the child shard hasn't been discovered yet",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we see this very often? If so, this is worrying, and we should create a JIRA to track an improvement. We know our setup is resilient against this, because the child will get picked up eventually, but it would be good to not worry people unnecessarily!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeh, we see this often if we are always at the tip of the log and child shards haven't been discovered yet :)

Comment on lines +268 to +269
.sequenceNumberRange(SequenceNumberRange.builder().build())
.build())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this work with the inconsistency tracker? Don't we use the sequence number to detect open/closed shards?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, you're right, here we are under assumption that there's no inconsistent shard returend from splitTracker so all leaf nodes are open. The shard returned from incremental shard discovery will remove the open leaf node and add itself as a leaf node

@@ -250,6 +252,24 @@ public Set<String> getKnownSplitIds() {
return knownSplits.keySet();
}

public List<Shard> getKnownShards() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not unit tested!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will add UT.

@@ -101,6 +105,13 @@ public String getEarliestClosedLeafNode() {
return closedLeafNodes.first();
}

public String getLatestLeafNode() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not unit tested!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will add UT.

Copy link
Contributor

@hlteoh37 hlteoh37 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution @gguptp ! Left some comments

@gguptp gguptp closed this Sep 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants