[Feature][Connector-Paimon] Support dynamic bucket splitting improves Paimon writing efficiency #7335

Merged: 5 commits, Sep 20, 2024

hawk9821
Contributor

@hawk9821 hawk9821 commented Aug 7, 2024

Purpose of this pull request

Support dynamic bucket splitting to improve Paimon write efficiency.

Does this PR introduce any user-facing change?

no

How was this patch tested?

e2e: PaimonSinkDynamicBucketIT
UT: PaimonBucketAssignerTest#bucketAssigner
e2e case: PaimonSinkDynamicBucketIT#testPaimonBucketCountOnSparkAndFlink. Because the Spark and Flink engines cannot auto-create a Paimon table in the local file system on worker nodes, this e2e case runs against a local HDFS environment.

Check list

@Hisoka-X Hisoka-X changed the title from "[Feature][CONNECTORS-V2-Paimon] Support dynamic bucket splitting improves Paimon writing efficiency" to "[Feature][Connector-Paimon] Support dynamic bucket splitting improves Paimon writing efficiency" Aug 7, 2024
@Hisoka-X
Member

Hisoka-X commented Aug 7, 2024

cc @dailai and @TaoZex

@github-actions github-actions bot removed the flink label Aug 8, 2024
@hawk9821 hawk9821 force-pushed the paimon_dynamic_bucket branch 2 times, most recently from 1b445f0 to a5d18ee Compare August 21, 2024 00:44
@github-actions github-actions bot added the dependencies (Pull requests that update a dependency file) label Aug 21, 2024
@dailai
Contributor

dailai commented Aug 21, 2024

Please retrigger the ci.

@hawk9821 hawk9821 force-pushed the paimon_dynamic_bucket branch 5 times, most recently from 50764df to c93f7b8 Compare August 23, 2024 01:11
@github-actions github-actions bot removed the dependencies (Pull requests that update a dependency file) label Aug 23, 2024
@dailai
Contributor

dailai commented Aug 26, 2024

Thanks @hawk9821. Good job. I think your e2e cases need to cover multi-parallelism; the current cases all run with single parallelism. That way, we can effectively verify whether dynamic bucketing changes with the job's degree of parallelism. Also, I think you should check the bucket count in every case instead of making a separate case. In addition, each of your cases should verify that the dynamic-bucket.target-row-num argument works as expected.
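The kind of check requested here can be illustrated with a minimal sketch. It assumes a simplified, single-writer model in which a new bucket is opened once the current one has received `targetRowNum` distinct keys, so N distinct keys should yield ceil(N / targetRowNum) buckets. The class `SimpleDynamicBucketAssigner` and its methods are hypothetical names for illustration only; this is not the connector's `PaimonBucketAssigner`, and it does not model multi-parallelism.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical, simplified model of dynamic bucket assignment.
 * Illustration only; not the code added by this pull request.
 */
public class SimpleDynamicBucketAssigner {
    // Stands in for the dynamic-bucket.target-row-num option.
    private final long targetRowNum;
    // Remembers which bucket each primary key was first assigned to.
    private final Map<String, Integer> keyToBucket = new HashMap<>();
    private int currentBucket = 0;
    private long rowsInCurrentBucket = 0;

    public SimpleDynamicBucketAssigner(long targetRowNum) {
        if (targetRowNum <= 0) {
            throw new IllegalArgumentException("targetRowNum must be positive");
        }
        this.targetRowNum = targetRowNum;
    }

    /** Returns the bucket for a key; a known key always keeps its bucket. */
    public int assign(String primaryKey) {
        Integer existing = keyToBucket.get(primaryKey);
        if (existing != null) {
            return existing;
        }
        // Open a new bucket once the current one has reached the target.
        if (rowsInCurrentBucket >= targetRowNum) {
            currentBucket++;
            rowsInCurrentBucket = 0;
        }
        keyToBucket.put(primaryKey, currentBucket);
        rowsInCurrentBucket++;
        return currentBucket;
    }

    /** Number of buckets that have received at least one key. */
    public int bucketCount() {
        return keyToBucket.isEmpty() ? 0 : currentBucket + 1;
    }

    public static void main(String[] args) {
        SimpleDynamicBucketAssigner assigner = new SimpleDynamicBucketAssigner(3);
        for (int i = 0; i < 10; i++) {
            assigner.assign("pk-" + i);
        }
        // Re-sending an existing key must not create a new bucket.
        assigner.assign("pk-0");
        System.out.println("buckets = " + assigner.bucketCount());
    }
}
```

Under this model, a test for each case could compare the observed bucket count with the value expected from the number of distinct keys and the configured target, rather than only asserting that the write succeeded.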

@github-actions github-actions bot added the dependencies (Pull requests that update a dependency file), CI&CD, and core (SeaTunnel core module) labels and removed the paimon label Aug 29, 2024
docs/en/connector-v2/sink/Paimon.md (5 review threads, outdated and resolved)
@hawk9821 hawk9821 force-pushed the paimon_dynamic_bucket branch 2 times, most recently from 974e481 to af318d5 Compare September 3, 2024 05:51
@Hisoka-X Hisoka-X self-assigned this Sep 4, 2024
@github-actions github-actions bot added the core (SeaTunnel core module) and flink labels and removed the paimon label Sep 12, 2024
@hawk9821 hawk9821 force-pushed the paimon_dynamic_bucket branch 3 times, most recently from ad8281b to e0cd7d8 Compare September 12, 2024 17:11
@hawk9821 hawk9821 force-pushed the paimon_dynamic_bucket branch 4 times, most recently from dc92c7b to a1b351d Compare September 18, 2024 08:39
wuchunfu and others added 5 commits September 20, 2024 10:08
* [Improve] Update snapshot version to 2.3.8

* [Improve] Update snapshot version to 2.3.8
[Feature][CONNECTORS-V2-Paimon] spark task parallelism
[Feature][CONNECTORS-V2-Paimon] update doc

[Feature][CONNECTORS-V2-Paimon] write to dynamic bucket table , spark flink e2e
Member

@Hisoka-X Hisoka-X left a comment

LGTM if ci passes. Thanks @hawk9821

@hailin0 hailin0 merged commit bc0326c into apache:dev Sep 20, 2024
7 checks passed
5 participants