Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CZID-8699] Multipart uploads #87

Merged
merged 6 commits into from
Oct 9, 2023

Conversation

robertaboukhalil
Copy link
Contributor

@robertaboukhalil robertaboukhalil commented Oct 9, 2023

Currently, the entities service supports file uploads through 1 signed URL upload, which limits file sizes to 5GB.

As we discussed in our syncs recently, we need to support larger file uploads (the web app and CLI rely on multipart uploads). This should hopefully make it easy to switch to the NextGen upload token generation 😀

2023-10-09-nextgen-file-upload-mutation

expiration: int = 3600,
sts_client: STSClient = Depends(get_sts_client),
) -> MultipartUploadCredentials:
policy = {
Copy link
Contributor Author

@robertaboukhalil robertaboukhalil Oct 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any dis/advantage to supporting both multipart file uploads and single-file uploads limited to 5gb? I think the main advantage to supporting both would be if there are some clients that have a hard time with the more complicated flow (ex: does the JS AWS client support multipart uploads?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the JS client supports multipart uploads according to this, but I remember Jerry patched that library but not sure for what exactly. But we could totally support both, I don't think it hurts

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a huge fan of supporting One Right Way to do something w/this project. So I'm fine with merging this for now, and adding support back in for single-part-uploads if/when we run into problems with clients. 🤷‍♀️

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good!

@robertaboukhalil robertaboukhalil marked this pull request as ready for review October 9, 2023 17:51
@@ -89,7 +89,7 @@ async def test_invalid_fastq(
assert fileinfo["status"] == "FAILED"


# Test generating signed URLs for file upload
Copy link
Collaborator

@jgadling jgadling Oct 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO at some point - it would be nice to have a test that does the whole flow:

  1. create a sequencing read
  2. generate a signed upload url to upload a sequencing read file
  3. Upload the file
  4. Run the "upload complete endpoint"
  5. Download the file

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I love that!

@robertaboukhalil robertaboukhalil merged commit 984d1e9 into main Oct 9, 2023
2 checks passed
@robertaboukhalil robertaboukhalil deleted the raboukhalil/multipart-uploads branch October 9, 2023 18:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants