Name external store files with primary key when downloaded #1099

Open
@ghost

Description

Feature Request

Problem

In many pipelines, files kept in an external store share the same local name. For example, an experimenter may name all their raw electrophysiology data 'data.bin'. When this data is fetched, each external file is downloaded into the working directory under its original name, so files with identical names overwrite one another.

Note that I am referring to the name of the file as it exists on the local system before inserting and after fetching, not the name of the file in the store itself, which is always unique because it is a hash.

Requirements

Provide an option in fetch to include the primary key of the entry in the downloaded file name. Instead of 'data.bin', the file downloaded from the store would be named 'PRIMARY-KEY-data.bin'.
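To make the requested naming concrete, here is a minimal sketch of how a key-prefixed file name could be built. `key_prefixed_name` is a hypothetical helper written for illustration, not part of the DataJoint API, and the `attribute=value` layout and `-` separator are assumptions rather than a proposed final format.

```python
import re
from pathlib import Path


def key_prefixed_name(key: dict, filename: str, sep: str = "-") -> str:
    """Return `filename` prefixed with the primary key values (hypothetical helper)."""
    parts = []
    for attr, value in key.items():  # keeps the order of the key dict
        # Replace characters that are awkward in file names with underscores
        safe = re.sub(r"[^A-Za-z0-9._]+", "_", str(value))
        parts.append(f"{attr}={safe}")
    # Path(...).name drops any directory component from the original name
    return sep.join(parts + [Path(filename).name])


print(key_prefixed_name({"subject_id": 12, "session": "2023-07-14"}, "data.bin"))
```

Because the primary key is unique per entry, two entries that both store 'data.bin' would end up with different local names, as long as the sanitizing step does not collapse two distinct key values into the same string.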

Justification

This will allow users to fetch data and download files from the external store even when those files have identical names.

Alternative Considerations

The alternative would be to force users to name all files uniquely. This is not helpful for some use cases. For example, in an electrophysiology pipeline, raw output may always be named 'data.bin' by the equipment, and the user may then upload it directly to their DataJoint pipeline. It would be inconvenient to have to rename these files first.
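Another possible workaround, sketched here only for comparison, is to fetch entries one at a time into a separate directory per primary key. The helper `per_key_dir` is hypothetical, and the download itself is replaced by a stand-in file write; with DataJoint one would presumably point the fetch at that directory (I am assuming a `download_path` argument on fetch, which should be checked against the installed version).

```python
import tempfile
from pathlib import Path


def per_key_dir(root: Path, key: dict) -> Path:
    """Create and return a download directory unique to one primary key."""
    # Assumes key values contain no path separators
    name = "_".join(f"{attr}-{value}" for attr, value in key.items())
    target = root / name
    target.mkdir(parents=True, exist_ok=True)
    return target


root = Path(tempfile.mkdtemp())
for key in ({"subject_id": 1}, {"subject_id": 2}):
    target = per_key_dir(root, key)
    # Stand-in for the real download of this entry's file into `target`
    (target / "data.bin").write_bytes(b"")

# Both files survive because they live in different directories
print(sorted(p.relative_to(root).as_posix() for p in root.rglob("data.bin")))
```

This keeps the original file names intact but requires a loop over keys on the user side, which is the kind of boilerplate the requested option would avoid.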

Screenshots

In this screenshot, I show the result of a fetch on my database. Note that our equipment always names our raw electrophysiology data 'data.bin', but these are different files that were stored in different directories before being uploaded to our DataJoint pipeline. Here, they overwrite one another, and my downloads folder ends up with only one 'data.bin' file.

Screenshot from 2023-07-14 13-48-57

Additional Research and Context

Note: I have only tested this with an S3 external store, not a file store.

Metadata

Labels

enhancement (indicates new improvements), stale (indicates issues, pull requests, or discussions are inactive)