Wrong package download after on-demand sync #5725
Comments
We can adjust the `order_by_acs()` function or add a new sorting: 1. the ACS wins if available, 2. then the remote associated with the repo/distribution, 3. the rest in random order (randomizing all remaining remotes). The discussion around copying content will come later.
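The three-bucket ordering described above could be sketched roughly like this (a hypothetical standalone sketch, not Pulp's actual `order_by_acs()` implementation; `RA` and the id-set parameters are illustrative stand-ins for the real model and queryset logic):

```python
import random
from dataclasses import dataclass


@dataclass
class RA:
    """Stand-in for a RemoteArtifact row (illustrative only)."""
    remote_id: str


def order_remote_artifacts(remote_artifacts, acs_remote_ids, repo_remote_ids):
    """Sort RemoteArtifacts: ACS-backed remotes first, then remotes
    associated with the repo/distribution being served, then the rest
    in random order."""
    def bucket(ra):
        if ra.remote_id in acs_remote_ids:
            return 0  # 1. ACS wins if available
        if ra.remote_id in repo_remote_ids:
            return 1  # 2. remote associated with the repo/distribution
        return 2      # 3. everything else

    shuffled = list(remote_artifacts)
    random.shuffle(shuffled)             # randomize ties first...
    return sorted(shuffled, key=bucket)  # ...then a stable sort by bucket
```

Shuffling before the stable sort keeps the order inside each bucket random while the buckets themselves stay deterministic.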
Hey @dralley,
Currently I'm working on (1), by looking into HTTP/1.1 and our stream implementation details and writing unit tests using the content `Handler`. Edit: in the end, (1) is about fixing #5012 first.
An alternative approach to randomization is to add a cache to the ContentArtifact about the health of its remotes. I guess that was mentioned in the f2f but was not very well received. Anyway, I'll share my naive idea here:
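As one possible shape for such a health cache (a minimal hypothetical sketch, not a proposed Pulp model; the class, field, and cooldown are all assumptions), it could remember recent failures per remote and let the handler deprioritize unhealthy ones:

```python
import time


class RemoteHealthCache:
    """Hypothetical per-remote failure memory: a remote that recently
    failed is considered unhealthy until a cooldown expires, so the
    content handler can try other remotes first."""

    def __init__(self, cooldown_seconds=300):
        self.cooldown = cooldown_seconds
        self._failed_at = {}  # remote_id -> timestamp of last failure

    def record_failure(self, remote_id):
        self._failed_at[remote_id] = time.time()

    def is_healthy(self, remote_id):
        ts = self._failed_at.get(remote_id)
        return ts is None or (time.time() - ts) > self.cooldown
```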
@dralley has this idea for a similar case, but I think it fits even better here.
IMHO this is definitely better than randomization. The idea of streaming is optimizing speed/transfer for the best case, when we can get the right package. If we fail at that, I believe we should aim to minimize the worst case, which currently is the client not getting the package (even if it's somehow reachable). I believe that with that approach and some improvements we can make the worst case be: (a) the client gets the package on the second request (at most), if the package is reachable; (b) a meaningful error is reported if it's unreachable.

**Worst Case: A Pessimistic Scenario**

This still doesn't cover the worst case, where an ACS is corrupted and the client gives up before the task completes (this is a very pessimistic case, but I've learned those usually happen to us).

**Improvement idea**

An idea to make it more robust is:
I definitely agree with what's alluded to in your diagram - that we do need to handle the unreachable case in such a way that it won't constantly consume needlessly large amounts of resources if the RemoteArtifacts are unreachable.
Originally reported here.
I'm bringing it here for visibility in the RemoteArtifact bucket of issues.
Version
Confirmed to happen with (at least).
Describe the bug
When:
Then:
To Reproduce
This is a flaky problem. Most likely it depends on RemoteArtifacts being sorted by their uuid4 primary key, which is random.
I'll write a Pulp test and share it here to make it easier to re-run.
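A toy illustration of why the flakiness is plausible (not a real reproducer, just a demonstration that uuid4-based ordering picks an arbitrary winner): when two RemoteArtifact rows point at the same content, ordering them by a random uuid4 primary key selects either remote with roughly equal probability.

```python
import uuid

# Two remotes offering the same artifact; each row gets a random uuid4
# primary key, so "order by pk" picks an arbitrary winner each time the
# rows are created.
remote_artifacts = {name: uuid.uuid4() for name in ("remote_a", "remote_b")}
winner = min(remote_artifacts, key=lambda name: remote_artifacts[name].hex)
```

Over many independent syncs, both remotes end up "first" some of the time, which matches a bug that only reproduces sometimes.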
Expected behavior
The right package should be delivered to the client, as it's present and reachable in Remote A.
Additional context