Add multi-file write support to the js and python sdks #451

Open
wants to merge 35 commits into
base: main

0div
Contributor

@0div 0div commented Oct 4, 2024

Description

  • Allow the Filesystem.write method to accept multiple files
  • Add tests to assert that envd supports multipart with multiple files out of the box

Test

# Test js-sdk
cd packages/js-sdk
pnpm test

# Test python-sdk
cd packages/python-sdk
pnpm test

@0div 0div requested a review from ValentaTomas October 4, 2024 18:01
@0div 0div self-assigned this Oct 4, 2024

changeset-bot bot commented Oct 4, 2024

🦋 Changeset detected

Latest commit: 724fbc6

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 2 packages
Name Type
@e2b/python-sdk Minor
e2b Minor


@0div 0div changed the title from "Added multi-file write support" to "Add multi-file write support" Oct 4, 2024
@0div 0div added the "Improvement" label (Improvement for current functionality) Oct 4, 2024
@0div 0div changed the title from "Add multi-file write support" to "Add multi-file write support to the js-sdk" Oct 4, 2024
@ValentaTomas
Member

ValentaTomas commented Oct 5, 2024

I pushed an edit that should sketch a type cleanup.

There are two unfinished parts:

  1. It should be possible to typeguard this so there is no type assumption when dealing with the parameters.
    const { path, writeOpts, writeFiles } = typeof pathOrFiles === 'string'
      ? { path: pathOrFiles, writeFiles: [{ data: dataOrOpts }], writeOpts: opts }
      : { path: undefined, writeFiles: pathOrFiles, writeOpts: dataOrOpts }

    const blobs = await Promise.all(writeFiles.map(f => new Response(f.data).blob()))
  2. The problem here was actually there even in the original SDK code: the EntryInfo type assumption works because the returned types are exactly of the shape of EntryInfo, but I think we should properly type these by creating a mapping. Also, this will not correctly return an array if I pass an array with just one file.
    return files.length === 1 ? files[0] : files
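
One way the type guard from point 1 and the return-shape issue from point 2 could be approached is sketched below. This is only an illustration under assumptions: `FileData`, `FileToWrite`, `WriteOpts`, `isFileData`, `normalizeWriteArgs`, and `shapeResult` are hypothetical names, not the SDK's actual API. The idea is to narrow the arguments with a runtime check instead of assuming their types, and to remember which call shape was used so that an array input always yields an array, even when it holds a single file.

```typescript
// Hypothetical types for illustration; the SDK's real names and fields may differ.
type FileData = string | ArrayBuffer | Uint8Array

interface FileToWrite {
  path: string
  data: FileData
}

interface WriteOpts {
  user?: string
}

interface NormalizedWrite {
  files: FileToWrite[]
  opts: WriteOpts | undefined
  // Records which call shape was used so the result shape can follow it.
  single: boolean
}

// Runtime type guard: lets the compiler narrow without a type assertion.
function isFileData(value: unknown): value is FileData {
  return (
    typeof value === 'string' ||
    value instanceof ArrayBuffer ||
    value instanceof Uint8Array
  )
}

function normalizeWriteArgs(
  pathOrFiles: string | FileToWrite[],
  dataOrOpts?: FileData | WriteOpts,
  opts?: WriteOpts
): NormalizedWrite {
  if (typeof pathOrFiles === 'string') {
    // Single-file form: write(path, data, opts)
    if (!isFileData(dataOrOpts)) {
      throw new TypeError('write(path, data): data is required')
    }
    return { files: [{ path: pathOrFiles, data: dataOrOpts }], opts, single: true }
  }
  // Multi-file form: write(files, opts)
  if (isFileData(dataOrOpts)) {
    throw new TypeError('write(files, opts): second argument must be options')
  }
  return { files: pathOrFiles, opts: dataOrOpts, single: false }
}

// Array in, array out: the result shape depends on the call shape,
// not on how many files happened to be written.
function shapeResult<T>(results: T[], single: boolean): T | T[] {
  return single ? results[0] : results
}
```

With this shape, `shapeResult(results, normalized.single)` returns a one-element array when the caller passed a one-element array, which is the case the `files.length === 1` check gets wrong.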

Some edits I did:

  • Changed filename to path in the list of files to make it the same as the argument name in a method that takes only one
  • The method returned an array for more than one file, but if you passed an empty array as the files, it would throw an error. There is still the problem I mentioned in 2.

Also, your previous code was mostly correct; this is more of an improvement.

@ValentaTomas
Member

ValentaTomas commented Oct 5, 2024

I also think the path might be redundant as an API argument here because we can just send the file with the filename as usual. (I read the response in the envd infra PR, so not that sure here.)

Let's keep it in the envd API though, because one of its uses is ensuring that the path people upload files to is fixed.

@0div
Contributor Author

0div commented Oct 7, 2024

@ValentaTomas I addressed your comments and added extra tests for some edge cases.

@0div
Contributor Author

0div commented Oct 7, 2024

One important thing to note, and I've added it as a code comment, is that we can't expect directories specified in the path of a multipart filename to be taken into consideration. I've tested it, and sure enough only the file name is used as the path; the rest of the path is stripped by the standard library's "mime/multipart" when calling

pathToResolve = part.FileName()

@ValentaTomas
Member

> One important thing to note, and I've added it as a code comment, is that we can't expect directories specified in the path of a multipart filename to be taken into consideration. I've tested it, and sure enough only the file name is used as the path; the rest of the path is stripped by the standard library's "mime/multipart" when calling
>
> pathToResolve = part.FileName()

Ok, this is a very good find — how do you think we should handle this? We want to be able to upload files with any path, but the stripping of paths might make sense to preserve, because it allows people to upload by link easily. This might require some changes to the envd.

@0div
Contributor Author

0div commented Oct 7, 2024

> > One important thing to note, and I've added it as a code comment, is that we can't expect directories specified in the path of a multipart filename to be taken into consideration. I've tested it, and sure enough only the file name is used as the path; the rest of the path is stripped by the standard library's "mime/multipart" when calling
> >
> > pathToResolve = part.FileName()
>
> Ok, this is a very good find. How do you think we should handle this? We want to be able to upload files with any path, but the stripping of paths might make sense to preserve, because it allows people to upload by link easily. This might require some changes to the envd.

If it's really important not to break the current API spec, we could add a custom field to the multipart form data and have envd look for it, which would need some changes to envd.
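
A rough client-side sketch of the custom-field idea follows. It rests on assumptions: the field name `full_path`, the part name `file`, and the helper `buildUploadForm` are all hypothetical, and the thread does not say what envd would actually read. The sketch pairs each file part with an extra text field carrying the full destination path, so that a server which only sees the base name of the multipart filename could still recover the directories.

```typescript
// Hypothetical field name; envd's real field (if any) is not specified in this thread.
const FULL_PATH_FIELD = 'full_path'

interface UploadEntry {
  path: string
  data: string
}

function buildUploadForm(entries: UploadEntry[]): FormData {
  const form = new FormData()
  for (const entry of entries) {
    // File part: per the comment above, the server ends up with only the
    // base name of this filename, so directories here would be lost.
    form.append('file', new Blob([entry.data]), entry.path)
    // Extra text field with the full path, paired with the file part by order.
    form.append(FULL_PATH_FIELD, entry.path)
  }
  return form
}
```

Pairing by order is a design choice made only for this sketch; a server would have to read the parts sequentially and match each path field to the file part that precedes it.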

@0div 0div changed the title from "Add multi-file write support to the js-sdk" to "Add multi-file write support to the js and python sdks" Oct 7, 2024
@0div
Contributor Author

0div commented Oct 7, 2024

@ValentaTomas I started by adding multi file write support for sandbox_sync to get your opinion on this approach before moving forward with _async.

Considerations

I went the OR route when it comes to typing, as function overloading is not natively supported in Python. That being said, it is kind of achievable with some non-negligible overhead, but I'm not enthralled by the idea, especially when thinking about the previous work on generating the api-ref automatically: I have a feeling it wouldn't play well with it.

@ValentaTomas
Member

ValentaTomas commented Oct 7, 2024

For the overload I think you can use the same system as we already have with https://github.com/e2b-dev/E2B/blob/beta/packages/python-sdk/e2b/sandbox_sync/process/main.py#L106

It should be the same thing, right?

@ValentaTomas
Member

ValentaTomas commented Oct 7, 2024

I also suggest naming and exporting all the types (from both SDKs). What do you say about having:

  • WriteData: the union type for the data (string, bytes, etc.)
  • WriteEntry: the type that contains both the path and the data: WriteData fields.
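
On the TypeScript side, the two proposed types might look roughly like the sketch below. The exact members of the union are an assumption (the comment only says "string, bytes, etc"), and `describeWriteData` is a hypothetical helper added just to show the union being narrowed. In the SDK the two types would additionally be exported from the package entry point.

```typescript
// Assumed union members; the real SDK may accept other data types.
type WriteData = string | ArrayBuffer | Uint8Array

// One file to write: its destination path plus its contents.
interface WriteEntry {
  path: string
  data: WriteData
}

// Hypothetical helper: narrows the union to tell text from binary data.
function describeWriteData(data: WriteData): 'text' | 'bytes' {
  return typeof data === 'string' ? 'text' : 'bytes'
}

const entries: WriteEntry[] = [
  { path: 'notes/todo.txt', data: 'buy milk' },
  { path: 'bin/blob.dat', data: new Uint8Array([1, 2, 3]) },
]

const kinds = entries.map((entry) => describeWriteData(entry.data))
```

Naming the union separately keeps the single-file signature (`path`, `data: WriteData`) and the multi-file signature (`WriteEntry[]`) visibly built from the same pieces.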

@ValentaTomas
Member

I'm thinking about what to do when you try to invoke write for multiple files and provide an empty array.

Logically, you might want to notify the user that nothing was written, but throwing an error might not be optimal. If you are generating the files to write, you need to explicitly check whether the array is empty; otherwise, you will get an error.

In contrast to this, isn't writing 0 files a valid operation and you will also get an array with 0 results so everything is ok?

@ValentaTomas
Member

> > > One important thing to note, and I've added it as a code comment, is that we can't expect directories specified in the path of a multipart filename to be taken into consideration. I've tested it, and sure enough only the file name is used as the path; the rest of the path is stripped by the standard library's "mime/multipart" when calling
> > >
> > > pathToResolve = part.FileName()
> >
> > Ok, this is a very good find. How do you think we should handle this? We want to be able to upload files with any path, but the stripping of paths might make sense to preserve, because it allows people to upload by link easily. This might require some changes to the envd.
>
> If it's really important not to break the current API spec, we could add a custom field to the multipart form data and have envd look for it, which would need some changes to envd.

Yeah, I'm thinking that we should probably do this, because people are already using the Beta SDK.

@0div
Contributor Author

0div commented Oct 8, 2024

> I'm thinking about what to do when you try to invoke write for multiple files and provide an empty array.
>
> Logically, you might want to notify the user that nothing was written, but throwing an error might not be optimal. If you are generating the files to write, you need to explicitly check whether the array is empty; otherwise, you will get an error.
>
> In contrast to this, isn't writing 0 files a valid operation and you will also get an array with 0 results so everything is ok?

From the perspective that there will likely be less control over which files, if any, are generated, your point makes sense. I will allow empty arrays.
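
The agreed behaviour for empty input can be sketched as an early return. The names here are assumptions made for illustration: `writeMany`, the minimal `WriteEntry` and `EntryInfo` shapes, and the `Upload` callback standing in for the real request to envd are all hypothetical.

```typescript
interface WriteEntry {
  path: string
  data: string
}

// Minimal stand-in for the SDK's EntryInfo; the real type may have more fields.
interface EntryInfo {
  name: string
  path: string
}

// Hypothetical uploader signature standing in for the actual request to envd.
type Upload = (entries: WriteEntry[]) => Promise<EntryInfo[]>

async function writeMany(entries: WriteEntry[], upload: Upload): Promise<EntryInfo[]> {
  // Writing zero files is treated as a valid no-op:
  // no request is sent and the caller gets an empty result array.
  if (entries.length === 0) return []
  return upload(entries)
}
```

Callers that generate the list of files dynamically can then pass it straight through without checking for emptiness first.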

@mlejva
Member

mlejva commented Dec 13, 2024

Yes, I left a comment there. We would merge the envd change, then rebuild base and code interpreter templates, then release the SDK changes.

I think you might have forgotten the Desktop Sandbox template

These client changes require changes in the envd which is in this PR. This will require us to rebuild our sandbox templates so they have the new envd.

  1. Have we rebuilt all the required templates? Base, code interpreter, and desktop
  2. Users need to rebuild their custom templates after they update to a newer version, right?

@0div 0div requested review from ValentaTomas and jakubno December 13, 2024 01:16
@mlejva
Member

mlejva commented Dec 17, 2024

Is this waiting for review and ready otherwise?

@0div
Contributor Author

0div commented Dec 17, 2024

> Is this waiting for review and ready otherwise?

It's ready for release, and it will be released once the envd template is built, if I understood the order of things correctly. cc @ValentaTomas

Labels
feature: New feature or request
sdk: Improvements or additions to SDKs

5 participants