Dealing with directories #37

Open
DenisaCG opened this issue Dec 6, 2024 · 1 comment
Labels
bug Something isn't working

DenisaCG commented Dec 6, 2024

Problem description

The backend contents manager uses the obstore package to list the contents of a path inside a drive, to retrieve the contents of an object, and to perform every other content operation (create, save, rename, copy, delete, download).

Unfortunately, the package has no concept of a directory. As a result, when listing the contents of a path, it is impossible to tell an empty directory from an empty file.

Note that even though S3 has no true directories, the provider creates a zero-length object with a key ending in / to mimic one (see the red box marked "Important" here).

The obstore package strips these trailing slashes from object keys, so it cannot interact with a directory properly. For example, creating a directory actually creates a broken file. The directory only comes into existence once other objects are placed under that path, and the initial broken file is left behind.

As such, it becomes problematic to:

  • identify empty folders (compared to empty files)
  • rename, delete or copy the directories created outside of the DriveBrowser (the zero-length object itself, not the objects inside of it)
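
The ambiguity described above can be reproduced without touching S3. The snippet below is only a simulation: `strip_trailing_slash`, `looks_like_directory` and the sample listing are hypothetical names and data, not obstore's actual code. It shows that once the trailing slash is dropped, a zero-length directory marker and an empty file look the same.

```python
def strip_trailing_slash(key: str) -> str:
    """Mimic the normalization described in the issue: drop trailing '/'."""
    return key.rstrip("/")


def looks_like_directory(key: str, size: int) -> bool:
    """S3-style convention: a zero-length object whose key ends in '/'."""
    return size == 0 and key.endswith("/")


# Hypothetical listing as (key, size in bytes).
raw_listing = [
    ("reports/", 0),  # zero-length directory marker
    ("notes", 0),     # genuinely empty file
]

# What the listing looks like after the trailing slashes are removed.
normalized = [(strip_trailing_slash(key), size) for key, size in raw_listing]

before = [looks_like_directory(key, size) for key, size in raw_listing]
after = [looks_like_directory(key, size) for key, size in normalized]
```

With the raw keys the marker is recognizable; after normalization both entries are just zero-length objects with plain names.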

Current state of extension v0.0.1

The current logic identifies a directory by its name: a name without a . is treated as a directory, since files normally carry an extension (.txt, .ipynb, etc.). If the name does include a ., we check the string after that character; if it is not one of the file extensions registered in JupyterLab, we consider the entry a directory.

This logic is faulty, because a directory name can contain the . character anywhere. It also does not solve the problem of interacting with directories that already exist and whose keys carry the trailing slash.
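
To make the failure mode concrete, here is a minimal sketch of a name-based heuristic like the one described above. The function name `is_directory_v001` and the `REGISTERED_EXTENSIONS` set are assumptions made for illustration; the real extension list comes from JupyterLab and the extension's actual code may differ.

```python
# Hypothetical subset of the file extensions registered in JupyterLab.
REGISTERED_EXTENSIONS = {".txt", ".ipynb", ".py", ".csv"}


def is_directory_v001(name: str) -> bool:
    """Guess whether `name` is a directory from its name alone."""
    if "." not in name:
        # No dot at all: assumed to be a directory.
        return True
    # Take the text after the last dot and compare it to known extensions.
    extension = "." + name.rsplit(".", 1)[1]
    return extension not in REGISTERED_EXTENSIONS


# Cases the heuristic gets right.
plain_dir = is_directory_v001("data")
text_file = is_directory_v001("notes.txt")

# Cases it gets wrong.
dir_named_like_file = is_directory_v001("backup.txt")  # a directory named "backup.txt"
file_without_extension = is_directory_v001("Makefile")  # a file with no extension
```

A directory called `backup.txt` is reported as a file, and an extensionless file such as `Makefile` is reported as a directory, which is why the name alone cannot be trusted.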

Possible solution

Use another package for the backend content manager to perform content manipulation operations, but keep the obstore package for its paginated listing abilities.

@DenisaCG DenisaCG added the bug Something isn't working label Dec 6, 2024
DenisaCG commented

Update for v0.1.0

We now use the s3fs package for all content manipulation, and keep the obstore package for listing the contents of a path and retrieving the contents of a file, since it supports pagination.

The s3fs package has the concept of a directory, but operations such as create, rename, copy and delete still fail on directories: each of them strips the trailing slash when formatting the path, and that slash is essential for handling directories the way a file browser is expected to.

Fix for manipulating directories

We use the isdir function to identify a directory object regardless of its name, but the other operations need a different solution. For this, we attach a suffix to directory objects, namely /.jupyter_drives_fix_dir. This artificially creates the directory by placing a hidden file inside it, and that file is not shown in the file browser.
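
The suffix approach can be sketched with two small helpers. Only the suffix `.jupyter_drives_fix_dir` comes from this issue; the helper names `marker_key` and `visible_entries` are hypothetical, and the real implementation goes through s3fs rather than plain string handling.

```python
DIR_MARKER = ".jupyter_drives_fix_dir"  # suffix named in this issue


def marker_key(directory: str) -> str:
    """Key of the hidden file that keeps `directory` alive in the bucket."""
    return directory.rstrip("/") + "/" + DIR_MARKER


def visible_entries(keys: list[str]) -> list[str]:
    """Drop the hidden marker files before showing a listing to the user."""
    return [key for key in keys if key.rsplit("/", 1)[-1] != DIR_MARKER]


created = marker_key("reports/")
listing = visible_entries([created, "reports/q1.csv"])
```

Because the marker is a regular object with a non-empty name, it survives the trailing-slash stripping that breaks the plain `reports/` key.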

Moreover, when we encounter a folder created using the AWS console (which does not carry this suffix), we delete that object using aiobotocore, recreate it with s3fs using the required suffix, and then proceed with the operation.
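
The two migration steps can be illustrated on an in-memory stand-in for a bucket. This is a sketch only: the dict replaces real S3 calls (the extension uses aiobotocore for the delete and s3fs for the create), and `migrate_console_directory` is a hypothetical name.

```python
DIR_MARKER = ".jupyter_drives_fix_dir"  # suffix named in this issue


def migrate_console_directory(bucket: dict[str, bytes], directory: str) -> None:
    """Replace a console-style 'dir/' marker with the suffixed hidden file."""
    prefix = directory.rstrip("/") + "/"
    # Step 1: remove the zero-length 'dir/' object created by the AWS console.
    bucket.pop(prefix, None)
    # Step 2: create the hidden marker file so the directory still exists.
    bucket.setdefault(prefix + DIR_MARKER, b"")


# Hypothetical bucket contents: a console-created folder with one file in it.
bucket = {"reports/": b"", "reports/q1.csv": b"a,b\n"}
migrate_console_directory(bucket, "reports")
```

The objects inside the folder are left untouched; only the marker changes, and running the function a second time leaves the bucket as it is.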
