Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-31824: Add experimental mtransfer class method #107

Open
wants to merge 13 commits into
base: main
Choose a base branch
from

Conversation

timj
Copy link
Member

@timj timj commented Mar 4, 2025

Checklist

  • ran Jenkins
  • added a release note for user-visible changes to doc/changes

Copy link

codecov bot commented Mar 4, 2025

Codecov Report

Attention: Patch coverage is 94.81481% with 7 lines in your changes missing coverage. Please review.

Project coverage is 86.96%. Comparing base (2cf0f8f) to head (aa90387).

✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
python/lsst/resources/_resourcePath.py 89.58% 2 Missing and 3 partials ⚠️
python/lsst/resources/gs.py 50.00% 1 Missing ⚠️
python/lsst/resources/tests.py 97.82% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #107      +/-   ##
==========================================
+ Coverage   86.80%   86.96%   +0.15%     
==========================================
  Files          27       27              
  Lines        4690     4793     +103     
  Branches      566      578      +12     
==========================================
+ Hits         4071     4168      +97     
- Misses        474      477       +3     
- Partials      145      148       +3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@timj timj force-pushed the tickets/DM-31824 branch from b1d09d8 to b821655 Compare March 4, 2025 19:37
@timj timj force-pushed the tickets/DM-31824 branch from b821655 to edee89d Compare March 4, 2025 20:29
@timj timj force-pushed the tickets/DM-31824 branch from f1f6b31 to 2eb4cef Compare March 5, 2025 21:21
@timj timj force-pushed the tickets/DM-31824 branch from 2eb4cef to de1ecf0 Compare March 5, 2025 21:32
timj added 4 commits March 6, 2025 12:44
Instead of using tmpdir and moving to the new location which
might then involve a copy (and in some cases fill up tmpdir)
Only go over the loop once.
Do not download directly to destination directory if the
destination directory does not exist.
@timj timj force-pushed the tickets/DM-31824 branch from 66bad34 to 1ceaecb Compare March 6, 2025 19:44
@timj timj force-pushed the tickets/DM-31824 branch from c781ae9 to 79f29fa Compare March 6, 2025 19:52

Returns
-------
copy_status : `dict` [ `ResourcePath`, `bool` ]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dhirving once sticking point for mtransfer over transfer_from in a loop is that the latter raises an exception and tells you the problem immediately and you stop doing the transfers. mtransfer as written lets everything complete and then tells you a simple yes/no when the caller would like to know some reasons for failures (such as FileExistsError). I could store the Exceptions in the returned dict. Or I could let the first failure raise (is that allowed in concurrent futures?)? Or at the end of the transfers all the number of failures could be counted and the final exception encountered could be raised (with a note saying how many other failures there were). Should there be a parameter to indicate whether the caller wants a raise vs dict returned? I would be interested in your opinion on this. (butler does need to have some idea as to why a failure happened to try to help in its error message)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One idea might be to raise an ExceptionGroup with every failure.

I think your existing dict thing would work too, I wonder if it could be like

class TransferResult(NamedTuple):
    success: bool
    exception: Exception | None

dict[ResourcePath, TransferResult]

or something like that.

Bailing immediately on the first failure is potentially problematic because you will still have concurrent transfers going in the background... I think you can cancel the unscheduled ones but you would want the already-executing ones to finish before throwing. I think your idea of adding a parameter to choose "bail on first failure" vs "continue as far as possible" is decent.

@timj timj force-pushed the tickets/DM-31824 branch from 9f8e277 to 6977e01 Compare March 6, 2025 23:21
@timj timj force-pushed the tickets/DM-31824 branch from 6977e01 to aa90387 Compare March 6, 2025 23:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants