Using IDs instead of names to identify resources #45

Open
pablo-de-andres opened this issue Sep 26, 2022 · 10 comments

Comments

@pablo-de-andres
Member

Names are used in the path parameters for the operations with datasets and collections (dataset_name, collection_name). However, these might not be unique.

Raised by @Kirankumaraswamy.

@pablo-de-andres
Member Author

@csadorf thoughts?

@csadorf
Collaborator

csadorf commented Sep 26, 2022

Could you provide an example of the issue? It is not immediately clear to me what the problem is.

@pablo-de-andres
Member Author

pablo-de-andres commented Sep 26, 2022

For instance here, to get a dataset, the parameters are collection_name and dataset_name.
But I see in the data model that this is actually an id.

The confusion came from the use of "name", which does not necessarily convey that the values must be unique.
I would say the question is then whether it should be renamed or left as is.

@csadorf
Collaborator

csadorf commented Sep 26, 2022

Why don't we just add a note to the description clarifying that the collection name and data set name are unique? There is no way to accidentally use the API in a wrong way here, but maybe it helps to clear up any confusion?

@Kirankumaraswamy
Contributor

Kirankumaraswamy commented Sep 27, 2022

Here the collection names and dataset names are folder names and file names in a directory. Hence they don't need to be unique. We can have a file with the same name in some other directory.

@csadorf
Collaborator

csadorf commented Sep 27, 2022

> Here the collection names and dataset names are folder names and file names in a directory. Hence they don't need to be unique. We can have a file with the same name in some other directory.

Yes, but that would be ok? Different collections can have data sets with the same name. They just have to be unique within the same collection.
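
To illustrate this scoping, here is a minimal sketch. It assumes a hypothetical in-memory `DatasetStore`; the class, its methods, and the sample names are made up for illustration and are not taken from the actual API, DataSink, or python-sdk. Datasets are keyed by the pair (collection name, dataset name), so the same dataset name can appear in two collections but not twice in one.

```python
# Hypothetical in-memory store, for illustration only (not the real API or SDK).
class DatasetStore:
    def __init__(self):
        # Keyed by (collection_name, dataset_name): uniqueness is per collection.
        self._datasets = {}

    def add(self, collection_name, dataset_name, payload):
        key = (collection_name, dataset_name)
        if key in self._datasets:
            raise ValueError(
                f"{dataset_name!r} already exists in collection {collection_name!r}"
            )
        self._datasets[key] = payload

    def get(self, collection_name, dataset_name):
        # Both parts are needed, like a folder plus a file name.
        return self._datasets[(collection_name, dataset_name)]


store = DatasetStore()
store.add("experiments", "results.csv", "A")
store.add("archive", "results.csv", "B")  # same name, different collection: allowed
```

With this layout, a lookup that supplies both names is never ambiguous, which seems to be the behaviour the path parameters already imply.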

@Kirankumaraswamy
Contributor

You are right, but when we retrieve a single dataset, we retrieve it using either dataset_id or dataset_title. We are not using collection_id to retrieve it. When a single dataset is requested by title, we have to send all the datasets with matching titles across all collections.
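
The ambiguity described here can be sketched as follows. This is only an illustration under assumed data: the record layout, the sample values, and the helper names `find_by_title` and `find_in_collection` are hypothetical and do not reflect the real DataSink or python-sdk code.

```python
# Hypothetical dataset records, for illustration only.
datasets = [
    {"collection_id": "c1", "dataset_id": "d1", "title": "results"},
    {"collection_id": "c2", "dataset_id": "d2", "title": "results"},
    {"collection_id": "c2", "dataset_id": "d3", "title": "notes"},
]


def find_by_title(title):
    # Title alone: every collection is searched, so several datasets may match.
    return [d for d in datasets if d["title"] == title]


def find_in_collection(collection_id, title):
    # Collection plus title: the search is limited to one collection.
    return [
        d
        for d in datasets
        if d["collection_id"] == collection_id and d["title"] == title
    ]


all_matches = find_by_title("results")
scoped_matches = find_in_collection("c1", "results")
```

In this sample data, the title-only lookup matches datasets in two collections, while the scoped lookup matches exactly one.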

@pablo-de-andres
Member Author

@csadorf is not working on this anymore.

Who is "we" in "when we retrieve"? To me it makes sense that the collection is needed, similar to how the path/folder is needed to access a file.

@Kirankumaraswamy
Contributor

"We" in the sense of DataSink. Currently the python-sdk doesn't support using both collection ID and dataset ID to retrieve a dataset. The same is true on the DataSink side. We would need to change the python-sdk code again.

@pablo-de-andres
Member Author

The python SDK does support it, does it not?
