[Feature][Connector-Paimon] Support dynamic bucket splitting to improve Paimon writing efficiency #7335
Conversation
...ink-13/src/main/java/org/apache/seatunnel/translation/flink/sink/FlinkSinkWriterContext.java
...common/src/main/java/org/apache/seatunnel/translation/flink/sink/FlinkSinkWriterContext.java
Force-pushed from 1b445f0 to a5d18ee
Please retrigger the CI.
.../java/org/apache/seatunnel/connectors/seatunnel/paimon/sink/bucket/PaimonBucketAssigner.java
Force-pushed from 50764df to c93f7b8
Thanks @hawk9821, good job. I think your e2e cases need to cover multi-parallelism; the current cases are all single-parallelism. That way we can effectively verify whether dynamic bucketing changes depending on the job's degree of parallelism. Also, I think you should check the bucket count in every case instead of in a separate case. In addition, each of your cases should verify that the dynamic-bucket.target-row-num option works as expected.
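For context, a sketch of the sink configuration the reviewer is referring to. This is an illustrative example, not taken from this PR: it assumes the SeaTunnel Paimon sink forwards Paimon table options through `paimon.table.write-props`, and that setting `bucket = -1` together with `dynamic-bucket.target-row-num` enables Paimon's dynamic bucket mode. The warehouse path and table names here are hypothetical.

```hocon
sink {
  Paimon {
    # Hypothetical paths/names for illustration only
    warehouse = "hdfs:///tmp/paimon"
    database = "default"
    table = "st_test"
    # Paimon table options (assumed to be passed through write-props):
    # bucket = -1 selects dynamic bucket mode; target-row-num caps
    # how many rows land in one bucket before a new bucket is opened.
    paimon.table.write-props = {
      bucket = -1
      dynamic-bucket.target-row-num = 50000
    }
  }
}
```

An e2e case along the lines the reviewer suggests would run the same job at several parallelism settings and assert the resulting bucket count matches `total rows / target-row-num` (rounded up).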
...n-e2e/src/test/java/org/apache/seatunnel/e2e/connector/paimon/PaimonSinkDynamicBucketIT.java
Force-pushed from 974e481 to af318d5
seatunnel-api/src/main/java/org/apache/seatunnel/api/sink/SinkWriter.java
Force-pushed from af318d5 to 3874aae
Force-pushed from ad8281b to e0cd7d8
Force-pushed from dc92c7b to a1b351d
* [Improve] Update snapshot version to 2.3.8
…mon writing efficiency
[Feature][CONNECTORS-V2-Paimon] spark task parallelism
[Feature][CONNECTORS-V2-Paimon] update doc
[Feature][CONNECTORS-V2-Paimon] write to dynamic bucket table, spark flink e2e
…mon writing efficiency
Force-pushed from b34a78a to 9a7dab1
LGTM if CI passes. Thanks @hawk9821
Purpose of this pull request
Support dynamic bucket splitting improves Paimon writing efficiency
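To illustrate the idea behind dynamic bucket splitting, here is a minimal, self-contained Java sketch. It is not the actual `PaimonBucketAssigner` from this PR: it ignores key hashing and cross-writer coordination, and simply shows the core rule that rows fill an existing bucket until it reaches the target row count, after which a new bucket is opened. The class name and `assign` method are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of dynamic bucket assignment. The real Paimon
// assigner hashes the primary key and coordinates bucket state across
// parallel writers; this only demonstrates the target-row-num rule.
public class DynamicBucketSketch {
    private final long targetRowNum;
    private final Map<Integer, Long> bucketRowCounts = new HashMap<>();
    private int nextBucket = 0;

    public DynamicBucketSketch(long targetRowNum) {
        this.targetRowNum = targetRowNum;
    }

    /** Assign a bucket for one incoming row key. */
    public int assign(Object key) {
        // Reuse any bucket that still has capacity.
        for (Map.Entry<Integer, Long> e : bucketRowCounts.entrySet()) {
            if (e.getValue() < targetRowNum) {
                e.setValue(e.getValue() + 1);
                return e.getKey();
            }
        }
        // All existing buckets are full: open a new one.
        int bucket = nextBucket++;
        bucketRowCounts.put(bucket, 1L);
        return bucket;
    }

    public int bucketCount() {
        return bucketRowCounts.size();
    }

    public static void main(String[] args) {
        DynamicBucketSketch assigner = new DynamicBucketSketch(2);
        for (int i = 0; i < 5; i++) {
            assigner.assign("key-" + i);
        }
        // 5 rows with a target of 2 rows per bucket need 3 buckets.
        System.out.println(assigner.bucketCount()); // prints 3
    }
}
```

Because the bucket count grows with data volume rather than being fixed at table creation, writers avoid oversized buckets, which is the efficiency gain this PR targets.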
Does this PR introduce any user-facing change?
no
How was this patch tested?
e2e: PaimonSinkDynamicBucketIT
UT: PaimonBucketAssignerTest#bucketAssigner
e2e case: PaimonSinkDynamicBucketIT#testPaimonBucketCountOnSparkAndFlink. Because the Spark and Flink engines cannot auto-create the Paimon table on a worker node with local files, this e2e case runs against a local HDFS environment.
Check list
New License Guide
release-note