Revisit the use of "dataset" #38

Open
danielballan opened this issue Oct 6, 2022 · 2 comments
Comments

@danielballan
Contributor

We use "dataset" to group all the NMC data, but elsewhere we use it to distinguish batches of data (iss, iss-raw). It seems we need two mechanisms: one for grouping data by its {source, origin, contributor, batch?} and another for grouping data that we are interested in analyzing together (NMC).

@danielballan
Contributor Author

Decision (discussion with @jmaruland and @danielballan):

  • Keep using "dataset" to mean "batch of data we were given to ingest", as it was originally used for heald, newville, aimm_core, and so on.
  • Introduce the key "projects", whose value is a list that may contain "nmc", for example.

At some point, take the nodes that currently have dataset="nmc" and edit their dataset (probably directly in a mongo shell at this point) to refer to a batch and add {"projects": ["nmc"]} to everything that relates to NMC.
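
A rough sketch of that migration, written against plain Python dicts rather than the real database. The thread does not show how node metadata is actually stored, so the flat layout, the function names `retag_node` and `migrate`, and the `batch_for` lookup are all hypothetical; the batch each NMC node really came from would have to be supplied by whoever knows its origin.

```python
from copy import deepcopy


def retag_node(metadata, project, batch):
    """Return a copy of `metadata` whose "dataset" names a batch and
    whose "projects" list includes `project`. (Hypothetical helper.)"""
    updated = deepcopy(metadata)
    updated["dataset"] = batch
    projects = list(updated.get("projects", []))
    if project not in projects:  # avoid duplicates if run twice
        projects.append(project)
    updated["projects"] = projects
    return updated


def migrate(nodes, project, batch_for):
    """Retag every node whose "dataset" equals `project`.

    `nodes` is a list of metadata dicts (assumed layout).
    `batch_for` is a caller-supplied function mapping a node to the
    name of the batch it was ingested from.
    Nodes with any other "dataset" value are copied unchanged.
    """
    return [
        retag_node(node, project, batch_for(node))
        if node.get("dataset") == project
        else deepcopy(node)
        for node in nodes
    ]


# Example with made-up nodes; "iss" as the batch is only an illustration.
nodes = [
    {"dataset": "nmc", "sample": "a"},
    {"dataset": "newville", "sample": "b"},
]
migrated = migrate(nodes, "nmc", lambda node: "iss")
```

The sketch returns new dicts instead of editing in place, so the result can be inspected before anything is written back. Doing the same thing directly in a mongo shell would be an update over the matching documents.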

@danielballan
Contributor Author

We also want to capture measurement type, such as hard XAS, soft XAS, or a simulated version of either...
