-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add Option to Prevent Overlapping Replications of the Same Artifact in Harbor #21347
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: bupd <[email protected]> Co-authored-by: Maksym Trofimenko <[email protected]>
Signed-off-by: bupd <[email protected]> Co-authored-by: Maksym Trofimenko <[email protected]>
* Adds Skip If Runnning to Replication Policy Signed-off-by: bupd <[email protected]>
* Adds Checkbox in replication policy UI * Updates swagger & sql schema Signed-off-by: bupd <[email protected]>
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #21347 +/- ##
==========================================
+ Coverage 45.36% 46.12% +0.75%
==========================================
Files 244 247 +3
Lines 13333 13883 +550
Branches 2719 2875 +156
==========================================
+ Hits 6049 6403 +354
- Misses 6983 7141 +158
- Partials 301 339 +38
Flags with carried forward coverage won't be shown. Click here to find out more. |
Signed-off-by: bupd <[email protected]>
6763a2c
to
a997836
Compare
Thanks, @bupd, for your contribution! I believe this is a valuable new feature for Harbor that should follow the proposal process. There are several questions we need to briefly discuss:
|
@@ -7488,6 +7488,10 @@ definitions: | |||
type: boolean | |||
description: Whether to enable copy by chunk. | |||
x-isnullable: true | |||
skip_if_running: | |||
type: boolean | |||
description: Whether to enable skip, if replication already running. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This text is misleading, clearly explain what this option does, providing enough context, background, and result of enabling this future: e.g "Don't start a new replication until the previous one is terminated"
@@ -76,6 +76,7 @@ | |||
"PULL_BASED": "Pull the resources from the remote registry to the local Harbor.", | |||
"DESTINATION_NAMESPACE": "Specify the destination namespace. If empty, the resources will be put under the same namespace as the source.", | |||
"OVERRIDE": "Specify whether to override the resources at the destination if a resource with the same name exists.", | |||
"SKIP_IF_RUNNING": "Specify whether to skip the execution when replication already running.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bad wording, intention and outcome not clear
@@ -560,6 +561,7 @@ | |||
"ALLOWED_CHARACTERS": "Allowed special characters", | |||
"TOTAL": "Total", | |||
"OVERRIDE": "Override", | |||
"SKIP_IF_RUNNING": "Skip if running", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bad wording
We were planning to use
Indeed PolicyID was never needed before. We can add it anyway to the job arguments so we can scan observations, parse job arguments and match policyID with the current.
I believe even checking an artefact for existence produces an additonal HTTP request. So, we still adding some extra efficiency. |
Regarding that statement. there is a fundamental deficit with harbor replications, when the layer is in the copy process, new replication will try to copy the same blob again. If you have a slow connection like DSL the same blob will be copied over and over again. This results in the domino effect. Each new replication halves the throughput, resulting in slower transfers, and because all replications run in parallel, the same effect happens with the next blob/layer. We have made an experiment with:
|
Summary
This PR addresses a long-standing issue where overlapping replications of the same artifact can occur in Harbor, leading to unnecessary resource consumption and poor performance. By introducing a "Disable parallel replication" checkbox in the replication policy, it ensures that replication tasks for the same policy do not run in parallel, preventing bandwidth overload and queue backups, especially for large artifacts.
Similar Issues
Related Issues
Why do we need this
Changes Made
Disable parallel replication
Checkbox in Replication UI.skip_if_running
column in sql.Screenshots
Todo
Please indicate you've done the following: