Description
Assume we requested an original model and both derived models for validation and typesystem via bulk API (#173, #25).
We want the nodes from all three models, and we need to relate the original model's nodes to the derived ones. How do we do this?
NOTE: "base node" is the node in the original model that we want the derivations for.
Option A: Request derived nodes per base node
A special API receives one node id, and all requested derived languages.
It returns all derived nodes for the one input node.
Pro:
- Very simple
Con:
- Horribly inefficient
  - Could batch several ids in one request -> more efficient
- Does not play well with the delta API (Repo API: Change-based event notifications #28)
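To make the efficiency argument concrete, here is a minimal sketch that compares one request per base node with the batched variant. `FakeRepository`, `derived_for`, and the `DERIVED` table are hypothetical stand-ins for illustration; they are not part of any existing API.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for repository contents:
# (base node id, derived language) -> ids of derived nodes.
DERIVED: Dict[Tuple[str, str], List[str]] = {
    ("base-node-id", "constraintsLang"): ["constraints-info-id"],
    ("base-node-id", "typeLang"): ["calculated-type-id"],
}


class FakeRepository:
    """Counts round trips so both request styles can be compared."""

    def __init__(self) -> None:
        self.round_trips = 0

    def derived_for(self, node_ids: List[str], languages: List[str]) -> Dict[str, List[str]]:
        # One call models one request/response round trip.
        self.round_trips += 1
        return {
            node_id: [d for lang in languages for d in DERIVED.get((node_id, lang), [])]
            for node_id in node_ids
        }


repo = FakeRepository()
languages = ["constraintsLang", "typeLang"]
base_ids = ["base-node-id", "other-node-id"]

# Option A as described: one request per base node.
per_node: Dict[str, List[str]] = {}
for node_id in base_ids:
    per_node.update(repo.derived_for([node_id], languages))
single_trips = repo.round_trips

# Batched variant: all base node ids in a single request.
repo.round_trips = 0
batched = repo.derived_for(base_ids, languages)
batched_trips = repo.round_trips
```

Both styles return the same mapping; only the number of round trips differs, growing with the number of base nodes in the unbatched case.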
Option B: Derived nodes must implement interface with back-reference
Each derived node must implement `IDerived`. `IDerived.original` refers to the corresponding original node.
Pro:
- No special structures needed in M3 or serialization
Con:
- Wrong direction: We want to find the derived node from the original, not vice versa.
Option C: Separate list in serialization
We add a field `derived` to the node serialization format (#37).
Each node then looks like this:
{
"id": "base-node-id",
"concept": {
"language": "myLanguage",
"version": "2",
"key": "myConceptId"
},
"properties": {},
"children": {},
"references": {},
"annotations": [],
"derived": [
"constraints-info-id",
"calculated-type-id"
],
"parent": null
}
The `derived` array contains the ids of all derived nodes based on this node.
`derived` and `annotations` are not interchangeable: `annotations` are part of the original model, i.e. they are the same for every request. The contents of `derived` depend on the requested derived languages, so they might differ between requests.
Pro:
- Efficient
Con:
- Needs serialization format change
- All derivations need to be known to repository
- All derivations need to be available when the original model is requested. If not:
  - We might wait until all processors are done calculating the derivations
  - The second request for the same node might contain more derivations than the first request (e.g. because the sophisticated typesystem has by now finished calculating the inferred type)
Option D: Separate field in serialization
We add a field `derivedFor` to the node serialization format.
Each node then looks like this:
{
"id": "constraints-info-id",
"concept": {
"language": "constraintLanguage",
"version": "2",
"key": "can-be-child"
},
"properties": {},
"children": {},
"references": {},
"annotations": [],
"derivedFor": "id-of-base-node",
"parent": null
}
Pro:
- Derived languages don't need special interface
Con:
- Wrong direction: We want to find the derived node from the original, not vice versa.
- Needs serialization format change
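The "wrong direction" con can be worked around on the client by inverting the back-references once after loading. The following is a minimal sketch under the assumption that derived nodes carry a `derivedFor` field as shown above; `index_derivations` is a hypothetical helper name, and the node dictionaries are trimmed to the fields the lookup needs.

```python
from collections import defaultdict
from typing import Dict, List

# Trimmed nodes: only the fields relevant to the lookup are kept.
nodes = [
    {"id": "id-of-base-node", "parent": None},
    {"id": "constraints-info-id", "derivedFor": "id-of-base-node", "parent": None},
    {"id": "calculated-type-id", "derivedFor": "id-of-base-node", "parent": None},
]


def index_derivations(nodes: List[dict]) -> Dict[str, List[str]]:
    """Invert derivedFor back-references into base id -> derived ids."""
    index: Dict[str, List[str]] = defaultdict(list)
    for node in nodes:
        base_id = node.get("derivedFor")
        if base_id is not None:  # original nodes have no derivedFor field
            index[base_id].append(node["id"])
    return dict(index)


derivations = index_derivations(nodes)
```

The inversion is a single pass over the received nodes, but it only works if all derived nodes of interest are part of the same response.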
Option E: Additional mapping in serialization
We add a mapping of derived nodes to the serialization format, but outside the actual nodes.
The details of what to map, and in which direction, might change.
{
"serializationFormat": "2024.1",
"languages": [...],
"nodes":
[
{
"id": "base-node-id",
"classifier": {
"language": "myLanguage",
"version": "2",
"key": "myConceptId"
},
"properties": [],
"children": [],
"references": [],
"annotations": [],
"parent": null
},
{
"id": "constraints-info-id",
"classifier": {
"language": "constraintsLang",
"version": "2",
"key": "myConstraint"
},
"properties": [],
"children": [],
"references": [],
"annotations": [],
"parent": "some-id"
},
{
"id": "calculated-type-id",
"classifier": {
"language": "typeLang",
"version": "1",
"key": "StringType"
},
"properties": [],
"children": [],
"references": [],
"annotations": [],
"parent": "type-partition-id"
}
],
"derivations":
{
"base-node-id":
[
"constraints-info-id",
"calculated-type-id"
]
}
}
Pro:
- No change to core serialization format, only addition
- Correct direction (base node -> derivations)
Con:
- Cannot identify derived elements on M2 level
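A client could resolve the derivations of a base node with a plain lookup. This is a minimal sketch, assuming the `derivations` mapping is a JSON object keyed by base node id (the exact shape is left open above); `derived_nodes` is a hypothetical helper and the nodes are trimmed to their ids.

```python
import json
from typing import List

# Trimmed chunk: nodes reduced to their ids, "languages" omitted.
chunk = json.loads("""
{
  "serializationFormat": "2024.1",
  "nodes": [
    {"id": "base-node-id"},
    {"id": "constraints-info-id"},
    {"id": "calculated-type-id"}
  ],
  "derivations": {
    "base-node-id": ["constraints-info-id", "calculated-type-id"]
  }
}
""")


def derived_nodes(chunk: dict, base_id: str) -> List[dict]:
    """Return the derived nodes of base_id, or an empty list if there are none."""
    nodes_by_id = {node["id"]: node for node in chunk["nodes"]}
    derived_ids = chunk.get("derivations", {}).get(base_id, [])
    return [nodes_by_id[derived_id] for derived_id in derived_ids]


found = [node["id"] for node in derived_nodes(chunk, "base-node-id")]
```

Because the mapping sits outside the nodes, a chunk without a `derivations` entry is read exactly as before.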
Option F: Separate M3 concept
In M3, add a new specialization of `Classifier` named `DerivedConcept` (next to `Concept`, `Annotation`, `Interface`).
Pro:
- Very clear distinction
Con:
- Doesn't immediately solve the M1 problem (how to relate a derived node to its base)
Option G: Annotation on derived nodes
Have an annotation on each derived node that refers to the base node.
{
"serializationFormat": "2024.1",
"languages": [...],
"nodes":
[
{
"id": "base-node-id",
"classifier": {
"language": "myLanguage",
"version": "2",
"key": "myConceptId"
},
"properties": [],
"children": [],
"references": [],
"annotations": [],
"parent": null
},
{
"id": "constraints-info-id",
"classifier": {
"language": "constraintsLang",
"version": "2",
"key": "myConstraint"
},
"properties": [],
"children": [],
"references": [],
"annotations":
[
"derived-annotation"
],
"parent": "some-id"
},
{
"id": "derived-annotation",
"classifier": {
"language": "builtins",
"version": "1",
"key": "DerivedFrom"
},
"properties": [],
"children": [],
"references": [
{
"reference": {
"language": "builtins",
"version": "1",
"key": "DerivedFrom-base"
},
"targets":
[
{
"resolveInfo": null,
"target": "base-node-id"
}
]
}
],
"annotations": [],
"parent": "constraints-info-id"
}
]
}
Pro:
- Derived models can use the same concepts as original models (example: Derived `JavaClass` from `Entity`)
Con:
- Two nodes for each derivation
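Finding the base of a derived node then takes two hops: from the derived node to its annotation, and from the annotation's reference to the target. The sketch below assumes the `DerivedFrom` annotation and `DerivedFrom-base` reference keys from the example above; `base_of` and `derivations_by_base` are hypothetical helpers, and classifiers are matched by key only for brevity.

```python
from typing import Dict, List, Optional

# Trimmed version of the example above: only the fields the lookup needs.
nodes: List[dict] = [
    {"id": "base-node-id", "classifier": {"key": "myConceptId"},
     "references": [], "annotations": []},
    {"id": "constraints-info-id", "classifier": {"key": "myConstraint"},
     "references": [], "annotations": ["derived-annotation"]},
    {"id": "derived-annotation", "classifier": {"key": "DerivedFrom"},
     "references": [{
         "reference": {"key": "DerivedFrom-base"},
         "targets": [{"resolveInfo": None, "target": "base-node-id"}],
     }],
     "annotations": []},
]

nodes_by_id: Dict[str, dict] = {node["id"]: node for node in nodes}


def base_of(node: dict) -> Optional[str]:
    """Return the base node id of a derived node, or None for any other node."""
    for annotation_id in node["annotations"]:
        annotation = nodes_by_id[annotation_id]
        # Simplification: a full check would also compare language and version.
        if annotation["classifier"]["key"] != "DerivedFrom":
            continue
        for reference in annotation["references"]:
            if reference["reference"]["key"] == "DerivedFrom-base" and reference["targets"]:
                return reference["targets"][0]["target"]
    return None


def derivations_by_base() -> Dict[str, List[str]]:
    """Invert the annotations into base id -> derived node ids."""
    index: Dict[str, List[str]] = {}
    for node in nodes:
        base_id = base_of(node)
        if base_id is not None:
            index.setdefault(base_id, []).append(node["id"])
    return index
```

The extra hop through the annotation node is the runtime counterpart of the "two nodes for each derivation" con.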
Option H: Annotation on base node
Have an annotation on the base node that refers to its derivations.
{
"serializationFormat": "2024.1",
"languages": [...],
"nodes":
[
{
"id": "base-node-id",
"classifier": {
"language": "myLanguage",
"version": "2",
"key": "myConceptId"
},
"properties": [],
"children": [],
"references": [],
"annotations":
[
"derived-annotation"
],
"parent": null
},
{
"id": "constraints-info-id",
"classifier": {
"language": "constraintsLang",
"version": "2",
"key": "myConstraint"
},
"properties": [],
"children": [],
"references": [],
"annotations": [],
"parent": "some-id"
},
{
"id": "derived-annotation",
"classifier": {
"language": "builtins",
"version": "1",
"key": "Derived"
},
"properties": [],
"children": [],
"references": [
{
"reference": {
"language": "builtins",
"version": "1",
"key": "Derived-derivation"
},
"targets":
[
{
"resolveInfo": null,
"target": "constraints-info-id"
}
]
}
],
"annotations": [],
"parent": "base-node-id"
}
]
}
Pro:
Con:
- Two nodes for each derivation
- Need to know derivations when serializing base model
Option I: Node nature
Introduce a new field `nature` in the serialization for each node.
Possible values:
- "original"
- { "derivationKey": "validation", "base": "baseNodeId" }
{
"serializationFormat": "2024.1",
"languages": [...],
"nodes":
[
{
"id": "base-node-id",
"classifier": {
"language": "myLanguage",
"version": "2",
"key": "myConceptId"
},
"nature": "original",
"properties": [],
"children": [],
"references": [],
"annotations": [],
"parent": null
},
{
"id": "constraints-info-id",
"classifier": {
"language": "constraintsLang",
"version": "2",
"key": "myConstraint"
},
"nature": {
"derivationKey": "constraints",
"base": "base-node-id"
},
"properties": [],
"children": [],
"references": [],
"annotations": [],
"parent": "some-id"
}
]
}
Pro:
- Clear distinction of original and derived nodes
- Can use the same classifiers for original and derived nodes
Con:
- New kind of node
- Change in serialization format
- Field sometimes has string value, other times object value
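The last con shows up directly in client code: a reader has to branch on the JSON type of `nature`. Here is a minimal sketch of such a reader, assuming exactly the two shapes listed above; the `Nature` class and `parse_nature` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Nature:
    """Normalized view of the nature field; both attributes are None for original nodes."""
    derivation_key: Optional[str] = None
    base: Optional[str] = None

    @property
    def is_original(self) -> bool:
        return self.derivation_key is None


def parse_nature(raw: Union[str, dict]) -> Nature:
    """Accept either the string "original" or a derivationKey/base object."""
    if raw == "original":
        return Nature()
    if isinstance(raw, dict) and "derivationKey" in raw and "base" in raw:
        return Nature(derivation_key=raw["derivationKey"], base=raw["base"])
    raise ValueError(f"unsupported nature value: {raw!r}")


original = parse_nature("original")
derived = parse_nature({"derivationKey": "constraints", "base": "base-node-id"})
```

Normalizing both shapes into one type right after parsing keeps the string-or-object distinction out of the rest of the client.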