Dealing with identical instances when joining the knowledge graph of multiple experiments #58

Open
joergfunger opened this issue Sep 29, 2022 · 1 comment
Labels
knowledge graph question Further information is requested

Comments

@joergfunger
Member

joergfunger commented Sep 29, 2022

In today's meeting of the knowledge graph group, we discussed how to deal with multiple instances that are supposed to be identical but in reality are not. In the example, a compression test and a Young's modulus test are performed for the same mixture. In the current approach, we first generate an instance of the mixture based on its ID, create an instance of the compression test, and then add both to the (empty) graph. Now we want to add a subsequent experiment (the Young's modulus test). In theory, we could create another instance of the mixture (with the same ID) and an instance of the Young's modulus test, and then join the two graphs (or upload the new triples into the triple store). If all information is really identical (with the same ID), the instances are simply replaced. However, if there are even slight differences, e.g. because the raw Excel sheet behind the second mixture instance was modified (say, the name of an additive was slightly changed), we might end up with an inconsistent dataset (e.g. in a list of ingredients, the additive would then appear twice).

How do we handle that situation? Do we explicitly treat the mixture as an existing object and double-check against a list of metadata that the data in the metadata.yaml and in the graph (retrieved with a query) are identical, returning an error otherwise? Or do we add an instance of the second mix and verify that the number of triples hasn't changed (essentially the same as before, but without a manual check)?

It would be interesting to discuss this not only in terms of the technical implementation, but also in terms of how we are going to use it. Is a mix (design) a constant (a single instance), or is it different for each test that is performed (a separate instance per test)? The same question applies to many similar entities: a mix contains a list of ingredients such as specific aggregates, additives, and water, and I guess we do not have to specify the density of water each time we add another mix. That's why I would like to involve both @firmao and @PoNeYvIf from a technical point of view, but a user's perspective from @eriktamsen, @JulietteWinkler and @StephanPirskawetz would also be very valuable.
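
To make the two options concrete, here is a minimal sketch that models a graph as a plain Python set of (subject, predicate, object) tuples instead of a real triple store. The prefixes (`mix:`, `exp:`, `ex:`), the predicates, and all function names are hypothetical and only serve to illustrate the problem; they are not taken from the project's ontology or code.

```python
from typing import Set, Tuple

# Assumption: a graph is modelled as a set of triples, which mimics the set
# semantics of an RDF graph without depending on a triple-store library.
Triple = Tuple[str, str, str]
Graph = Set[Triple]


def mixture_triples(mix_id: str, additive: str) -> Graph:
    """Hypothetical mapping of one mixture metadata file to triples."""
    mix = f"mix:{mix_id}"
    return {
        (mix, "rdf:type", "ex:Mixture"),
        (mix, "ex:hasIngredient", "ex:Water"),
        (mix, "ex:hasAdditive", additive),
    }


def experiment_triples(exp_id: str, exp_type: str, mix_id: str) -> Graph:
    """Hypothetical triples for one experiment that uses the mixture."""
    exp = f"exp:{exp_id}"
    return {(exp, "rdf:type", exp_type), (exp, "ex:usesMixture", f"mix:{mix_id}")}


def describing(graph: Graph, subject: str) -> Graph:
    """All triples that have the given subject."""
    return {t for t in graph if t[0] == subject}


def add_experiment(graph: Graph, mix_id: str, new_mix: Graph, new_exp: Graph) -> Graph:
    """Option 1: compare the new mixture description with the stored one
    and return an error if they differ."""
    mix = f"mix:{mix_id}"
    existing = describing(graph, mix)
    if existing and existing != new_mix:
        diff = sorted(existing ^ new_mix)
        raise ValueError(f"{mix} differs from the stored description: {diff}")
    return graph | new_mix | new_exp


def merge_changes_mixture(graph: Graph, mix_id: str, new_mix: Graph) -> bool:
    """Option 2: merge and check whether the number of triples about the
    mixture changed. Note that this misses a new description that is a
    strict subset of the stored one."""
    mix = f"mix:{mix_id}"
    before = len(describing(graph, mix))
    after = len(describing(graph | new_mix, mix))
    return before > 0 and after != before


# First experiment: mixture M1 plus a compression test.
graph = add_experiment(
    set(),
    "M1",
    mixture_triples("M1", "Superplasticizer A"),
    experiment_triples("C1", "ex:CompressionTest", "M1"),
)

# Second experiment: same mixture ID, but the additive name was slightly changed.
modified = mixture_triples("M1", "Superplasticiser A")
naive = graph | modified | experiment_triples("E1", "ex:YoungsModulusTest", "M1")
additives = sorted(o for s, p, o in naive if p == "ex:hasAdditive")
print(additives)  # the naive join lists the additive under both spellings
```

In this sketch the naive join silently produces a mixture with two additives, whereas `add_experiment` refuses the modified description and `merge_changes_mixture` flags it after the fact. Which behaviour is preferable depends on the usage question raised above.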

@eriktamsen
Member

I am wondering whether there is value in defining specific concrete recipes (mix designs) as a kind of unique mix template, and then creating separate instances for each actual mix.
I think this depends on whether there is a limited number of recipes that are used a lot, or whether each new mix changes the recipe slightly. @JulietteWinkler, @StephanPirskawetz, which would you say is more practical?
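
A rough sketch of the template idea, assuming a recipe is identified by a design ID and each actual mix refers to it: the class and attribute names below (`MixDesign`, `MixBatch`, `Registry`) are hypothetical and not part of the existing code base.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class MixDesign:
    """Recipe (template): defined once and shared by all actual mixes."""
    design_id: str
    ingredients: Tuple[str, ...]


@dataclass(frozen=True)
class MixBatch:
    """One actual mix produced for a test; refers to its recipe by ID."""
    batch_id: str
    design_id: str


class Registry:
    """Hypothetical in-memory store keeping recipes unique."""

    def __init__(self) -> None:
        self.designs: Dict[str, MixDesign] = {}
        self.batches: Dict[str, MixBatch] = {}

    def register_design(self, design: MixDesign) -> MixDesign:
        # Re-registering an identical recipe is a no-op; a differing recipe
        # under the same ID is rejected instead of being merged.
        known = self.designs.setdefault(design.design_id, design)
        if known != design:
            raise ValueError(f"conflicting definition for design {design.design_id}")
        return known

    def add_batch(self, batch: MixBatch) -> None:
        if batch.design_id not in self.designs:
            raise KeyError(f"unknown design {batch.design_id}")
        self.batches[batch.batch_id] = batch


registry = Registry()
recipe = MixDesign("D1", ("aggregate 0/8", "water", "Superplasticizer A"))
registry.register_design(recipe)
registry.add_batch(MixBatch("B1", "D1"))  # mix used for the compression test
registry.add_batch(MixBatch("B2", "D1"))  # mix used for the Young's modulus test
```

With this split, shared properties live on the recipe only, while anything that can vary between mixes would be attached to the individual batch. If almost every mix changes the recipe slightly, the template layer adds little and a separate instance per mix may be simpler.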

@eriktamsen eriktamsen added question Further information is requested knowledge graph labels Apr 14, 2023