Support data objects belong to a collection #396

Draft
wants to merge 6 commits into
base: main

Conversation

perolavsvendsen
Member

Solve #394. See issue for details.

In summary, this PR:

  • Enables the collection_name argument to ExportData
  • Adds the collection to relations.collections in outgoing metadata when collection_name is given
  • Adds a definition of relations.collections to the schema

This allows multiple objects to be exported with a common tag defining the collection they all belong to, so that clients can quickly identify other data objects related to a given data object. The produced relations.collections entry will be uuid4-formatted and identical for every object within a case that uses the same collection name.

This is done by taking the provided collection_name argument and combining it with the current fmu.case.uuid.
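
As a rough illustration of how a collection name and a case UUID could be combined into a repeatable ID, here is a minimal sketch. The helper name `collection_uuid`, the hashing scheme, and the example UUIDs are all assumptions made for illustration; the PR description does not say how the combination is actually done.

```python
import hashlib
import uuid


def collection_uuid(case_uuid: str, collection_name: str) -> str:
    """Derive a repeatable, uuid4-formatted ID from a case UUID and a name.

    Hypothetical helper: the hashing scheme is an assumption, not the PR's code.
    """
    seed = f"{case_uuid}:{collection_name}".encode("utf-8")
    digest = hashlib.sha256(seed).digest()
    # version=4 overwrites the version and variant bits, so the result
    # parses as a uuid4 even though it is derived from a hash.
    return str(uuid.UUID(bytes=digest[:16], version=4))


case_a = "11111111-2222-4333-8444-555555555555"  # made-up case UUID
case_b = "aaaaaaaa-bbbb-4ccc-8ddd-eeeeeeeeeeee"  # made-up case UUID

id_1 = collection_uuid(case_a, "volumes")
id_2 = collection_uuid(case_a, "volumes")  # same case, same name
id_3 = collection_uuid(case_b, "volumes")  # different case, same name
```

With this approach, `id_1` and `id_2` are identical, while `id_3` differs because the case differs. Separate export scripts within one case would therefore agree on the ID without coordinating.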

Use case: When looking at a volume table, identify if the same volumes are represented as a 3D parameter or a surface.

Clients would do this through the following workflow: Given a data object, if "relations.collections" is present, find other data objects with the same reference.
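
The client workflow described above might look roughly like the sketch below. It assumes metadata is available as plain dictionaries and uses a simplified top-level `name` key and short made-up collection IDs; the function `find_related` is hypothetical and not part of any existing client.

```python
from collections import defaultdict
from typing import Any

Metadata = dict[str, Any]


def find_related(target: Metadata, candidates: list[Metadata]) -> dict[str, list[Metadata]]:
    """Group other objects by each collection ID they share with `target`."""
    wanted = target.get("relations", {}).get("collections", [])
    related: dict[str, list[Metadata]] = defaultdict(list)
    for other in candidates:
        if other is target:
            continue  # do not report the object as related to itself
        others = other.get("relations", {}).get("collections", [])
        for coll in wanted:
            if coll in others:
                related[coll].append(other)
    return dict(related)


# Illustrative metadata only; real metadata has many more fields.
table = {"name": "volumes_table", "relations": {"collections": ["c1", "c2"]}}
surface = {"name": "volumes_surface", "relations": {"collections": ["c1"]}}
grid_prop = {"name": "volumes_3d", "relations": {"collections": ["c1", "c2"]}}
unrelated = {"name": "fault_lines"}  # no relations.collections at all

result = find_related(table, [table, surface, grid_prop, unrelated])
names = {coll: [obj["name"] for obj in objs] for coll, objs in result.items()}
```

Returning one group per collection ID keeps the multi-collection case visible to the consumer, who still has to decide which group is relevant.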

There are some unanswered questions:

  • A requirement is that a data object can belong to several collections; hence, relations.collections is a list. But does this work in practice? E.g. if a data object has 3 entries in relations.collections, how will the consumer know which one they want to follow?
  • What about pre-processed data that belongs to a collection? The fmu.case.uuid does not exist when these data are made, hence relations.collections must probably be remade when dealing with preprocessed data.

@perolavsvendsen linked an issue on Nov 18, 2023 that may be closed by this pull request
@jcrivenaes
Collaborator

> What about pre-processed data that belongs to a collection? The fmu.case.uuid does not exist when these data are made, hence relations.collections must probably be remade when dealing with preprocessed data.

I think that is solvable

@daniel-sol
Contributor

I have problems understanding how we are actually going to solve the multicollection part. In theory I think the idea is good, but in practice I cannot really see how this will be done. Imagine that you have a specific grid property. This can be part of several collections:

  • As part of the export of a 3D grid and related properties
  • As part of properties used as input to inplace calculations
  • As part of input to seismic forward modelling

Without thinking this through carefully, one would set these exports up as three different scripts using ExportData. Should we then allow for checking whether the object has already been exported and, if so, just write to the connected metadata file? Or do we imagine people thinking this through upfront and then adding all x collection names in one script? I foresee that this will not scale all that well...

The alternative is that one exports this grid property three times, each with a different collection name, or something similar.
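
To make the first option concrete (detecting that an object is already exported and only updating its metadata), here is a minimal in-memory sketch. The function `add_collection` and the collection IDs are hypothetical, and nothing here reflects how ExportData actually reads or writes metadata files.

```python
from typing import Any


def add_collection(metadata: dict[str, Any], collection_id: str) -> dict[str, Any]:
    """Append a collection ID to relations.collections unless already present."""
    relations = metadata.setdefault("relations", {})
    collections = relations.setdefault("collections", [])
    if collection_id not in collections:
        collections.append(collection_id)
    return metadata


# Metadata for one grid property that was exported once, without collections.
grid_property: dict[str, Any] = {"name": "porosity"}

# Three separate export scripts each register their own collection.
add_collection(grid_property, "grid-export-id")
add_collection(grid_property, "inplace-input-id")
add_collection(grid_property, "seismic-forward-id")
add_collection(grid_property, "grid-export-id")  # re-running a script is a no-op

collections = grid_property["relations"]["collections"]
```

This avoids exporting the same property three times, but it assumes each script can locate and safely rewrite the existing metadata, which is exactly the scaling concern raised above.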

Development

Successfully merging this pull request may close these issues.

Support data object is member of a collection
3 participants