-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: download_directory local path incorrectly stored #54
Conversation
Warning Rate limit exceeded@jjjermiah has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 20 minutes and 41 seconds before requesting another review. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. WalkthroughThe pull request introduces significant enhancements to the Google Cloud Storage (GCS) storage provider within the Snakemake framework. Key additions include the Changes
Possibly related PRs
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
- snakemake_storage_plugin_gcs/init.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
snakemake_storage_plugin_gcs/__init__.py (1)
Pattern
**/*.py
: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theself
argument of methods.
Do not suggest type annotation of thecls
argument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturn
statement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (1)
snakemake_storage_plugin_gcs/__init__.py (1)
528-529
: LGTM! Path construction fixed as intended.The change correctly addresses the issue of downloading directories to the wrong location. By using
self.provider.local_prefix
, the files will now be stored in the.snakemake/storage/gcs
directory as intended.For improved clarity, consider adding a comment explaining the purpose of this path construction:
# Construct the local path ensuring files are stored in .snakemake/storage/gcs local_name = self.provider.local_prefix / self.bucket.name / blob.nameThis will help future maintainers understand the importance of this line.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
- snakemake_storage_plugin_gcs/init.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
snakemake_storage_plugin_gcs/__init__.py (1)
Pattern
**/*.py
: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theself
argument of methods.
Do not suggest type annotation of thecls
argument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturn
statement.
🔇 Additional comments (1)
snakemake_storage_plugin_gcs/__init__.py (1)
528-529
: Summary: PR objective achieved, with room for further improvementThe changes successfully address the main issue of incorrect directory downloads. The implementation looks good and should resolve the problem described in the PR objectives.
To further improve the code:
- Consider adding a clarifying comment for the new path construction.
- Explore the possibility of unifying path construction logic across download and upload methods.
These suggestions aim to enhance code clarity and maintainability for future development.
…y in StorageObject class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
- snakemake_storage_plugin_gcs/init.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
snakemake_storage_plugin_gcs/__init__.py (1)
Pattern
**/*.py
: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theself
argument of methods.
Do not suggest type annotation of thecls
argument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturn
statement.
🔇 Additional comments (1)
snakemake_storage_plugin_gcs/__init__.py (1)
33-33
: Importingtransfer_manager
is appropriateThe import statement correctly includes
transfer_manager
fromgoogle.cloud.storage
, enabling batch download functionality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (1)
snakemake_storage_plugin_gcs/__init__.py (1)
525-536
: Approve changes with suggestions for improvementThe refactoring of
_download_directory
to usetransfer_manager.download_many_to_path
is a good improvement that should enhance performance through batch downloads. The error handling for failed downloads is also a positive addition.However, there are two areas for potential improvement:
Error Logging: As per a previous review comment, using
self.logger.error
is not recommended. Consider using a more appropriate logging method or raising an exception for failed downloads.Path Construction: The path construction (
self.provider.local_prefix / self.bucket.name
) might not be consistent with other parts of the code. Consider unifying the path construction logic across methods, as suggested in a previous review comment.Consider implementing these improvements:
- Replace the error logging with a more appropriate method or exception handling.
- Create a helper function for path construction to ensure consistency across the codebase.
Would you like assistance in implementing these suggestions?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
- snakemake_storage_plugin_gcs/init.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
snakemake_storage_plugin_gcs/__init__.py (1)
Pattern
**/*.py
: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theself
argument of methods.
Do not suggest type annotation of thecls
argument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturn
statement.
🔇 Additional comments (1)
snakemake_storage_plugin_gcs/__init__.py (1)
33-33
: LGTM: New import for transfer_managerThe addition of
transfer_manager
import fromgoogle.cloud.storage
is appropriate and necessary for the changes in the_download_directory
method.
After some minor benchmarking, iterating over blobs to download in When running on GKE for ~10k jobs a minute of downloading could add up against our costs. the in a sample case the time taken is now ~6 seconds (92% improvement) if this change works and is approved I can also update the |
🤖 I have created a release *beep* *boop* --- ## [1.1.2](v1.1.1...v1.1.2) (2024-10-07) ### Bug Fixes * download_directory local path incorrectly stored ([#54](#54)) ([bc3d739](bc3d739)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
when specifying a directory for input, or using the output from a previous rule that is marked with
directory()
the directory downloads directory into theworkdir
instead of into.snakemake/storage/gcs
This fix should hopefully fix that.
Summary by CodeRabbit
New Features
Bug Fixes