Repo API: Node IDs #31

enikao · 2022-10-28T12:34:24Z

Each LIonWeb node has an internal node id. The node id represents the identity of that node. This means:

For the whole existence of the node, its id remains unchanged.
Two nodes with the same id are considered identical.
If a node were to change its id, we would consider the before state and the after state to be two different nodes (even if the node is the identical object in terms of memory location or similar implementation-language terms).

Valid characters

ids can only contain these symbols:

lowercase latin characters: a..z
uppercase latin characters: A..Z
arabic numerals: 0..9
underscore: _
hyphen: -

This is the same character set as the Base64url variant.

Representation

ids are represented by a string, containing only valid characters (as defined above).
An id string is NOT padded, not even with whitespace.
An id string does NOT contain any terminating symbols (unlike some Base64 variants); this does not affect the internal representation in a specific implementation language, e.g. C-style \0-terminated strings.
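
The character and representation rules above can be checked with a single regular expression. The following is a minimal sketch, not part of the LIonWeb specification; the function name `is_valid_id` is hypothetical, and rejecting the empty string is an assumption of this sketch rather than something the rules above state.

```python
import re

# Allowed symbols: a..z, A..Z, 0..9, underscore, hyphen (the Base64url alphabet).
# The "+" quantifier also rejects the empty string; that is an assumption of
# this sketch, not a rule stated in the description above.
_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_id(candidate: str) -> bool:
    """Return True if candidate consists only of allowed id characters."""
    # fullmatch requires the whole string to match, so padding, whitespace,
    # and terminating symbols such as "=" are all rejected.
    return _ID_PATTERN.fullmatch(candidate) is not None
```

With this check, `is_valid_id("abc-DEF_123")` holds, while ids with whitespace, a dot, or a trailing `=` are rejected.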

Scope

Node ids MUST be unique within their id-space.

Id-space

An id-space is a realm that guarantees the uniqueness of all ids within.
Typically, this means one node repository instance.

An id-space has an id as defined above.
Uniqueness of id-space ids is out of scope of the LIonWeb specification.

In LIonWeb (the protocol), id-spaces are NOT hierarchical.
An implementation might choose to use hierarchical id-spaces internally.

Identification

A node can be identified relative to its id-space by the node's id.
To globally identify a node, we use the combination of the id-space id and the node id.
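
As an illustration of the combination, a global identifier could be modeled as a plain pair. This is only a sketch; `GlobalNodeId` and the example ids are hypothetical and not defined by LIonWeb.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GlobalNodeId:
    """Hypothetical pair of id-space id and node id; not a LIonWeb type."""

    id_space: str
    node_id: str


# The same node id in two different id-spaces denotes two different nodes.
a = GlobalNodeId("repoA", "node-1")
b = GlobalNodeId("repoB", "node-1")
```

Because the dataclass is frozen, instances compare by value and can be used as dictionary keys, so `a != b` even though both share the node id `node-1`.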

For now, we don't consider the global case (see #25).
Thus, we use only the node id in the LIonWeb protocol.

(This issue description has been updated to reflect the consolidated decision on node ids.)


enikao · 2022-12-02T09:15:55Z

I think they should be strings, because they ought to be serializable without fuss. Anything more complicated, and chances are that other concerns (such as namespace identification and such) are leaking into their purpose.

Originally posted by @dslmeinte in #46 (comment)

I strongly agree ids should be strings, it's more about the limitations we set on them.
Some possible limitations are laid out in this issue, @dslmeinte mentioned some others:

Identifiers should be unique (within the namespace)
An identifier should be a non-empty, non-whitespace-only string

enikao · 2022-12-09T11:35:55Z

Assumptions

id can be represented as one string
ids must be unique only per namespace

Allowed characters

Arguments for limiting to small set (e.g. ASCII):

Safe
Complex names can be carried in name field
As soon as ids are too readable, people will try to parse them

Arguments for big set (e.g. UTF-8):

Direct representation of FQN from e.g. programming languages as stable IDs
- Especially for representing references by their FQN

Arguments for excluded characters:

If an id is unique only per namespace, we need to concatenate namespace and id to form a globally unique id

Proposed allowed characters

a-z
A-Z
0-9
_ (underscore)
- (hyphen)

Compatible with Base64url variant

Namespace separator character

`.` (dot):

Pro:

Used in lots of programming languages

Con:

Hard to see visually

`/` (slash)

Pro:

Used in e.g. file systems, XPath

Con:

Might clash with URLs

enikao · 2022-12-09T12:56:23Z

How to use fully qualified names as IDs

A fully qualified name (FQN) is often used in programming languages to uniquely identify an element, e.g. C# class System.String or Java method java.lang.String.toString().
As the programming language guarantees an FQN's uniqueness, FQNs are well suited as ids.
However, they contain invalid characters (e.g. `.` or `()`).
There are at least three obvious ways to deal with this issue:

Base64url encoding

Base64 is a mechanism to encode arbitrary data in 64 characters that can be safely transmitted by its carrier (e.g. traditional e-mail). In the url variant, the 64 characters used are exactly the allowed characters for IDs in LIonWeb. Thus, we can encode and decode anything (including FQNs) to/from an id without loss.
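
A minimal sketch of this round trip, using Python's standard `base64` module. The function names `fqn_to_id` and `id_to_fqn` are hypothetical. Note that the standard Base64url encoder appends `=` padding, which is not a valid id character, so the sketch strips it when encoding and restores it when decoding.

```python
import base64


def fqn_to_id(fqn: str) -> str:
    """Encode an FQN as a Base64url string without '=' padding."""
    raw = base64.urlsafe_b64encode(fqn.encode("utf-8"))
    return raw.rstrip(b"=").decode("ascii")


def id_to_fqn(node_id: str) -> str:
    """Decode an id produced by fqn_to_id back into the original FQN."""
    # Restore the padding that was stripped during encoding.
    padding = "=" * (-len(node_id) % 4)
    return base64.urlsafe_b64decode(node_id + padding).decode("utf-8")
```

The resulting id is roughly a third longer than the UTF-8 encoded FQN, which is the price of lossless decoding without any lookup table.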

Mapping table

Keep a map (aka dictionary) between fqns and randomly created ids.
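
A mapping table could look like the following sketch. The class `IdMappingTable` and its methods are hypothetical; random ids come from `secrets.token_urlsafe`, which returns unpadded Base64url text and therefore only uses valid id characters. Persisting the table is left out.

```python
import secrets
from typing import Optional


class IdMappingTable:
    """Hypothetical in-memory, bi-directional map between FQNs and random ids."""

    def __init__(self) -> None:
        self._id_by_fqn = {}
        self._fqn_by_id = {}

    def id_for(self, fqn: str) -> str:
        """Return the id for fqn, creating a random one on first use."""
        existing = self._id_by_fqn.get(fqn)
        if existing is not None:
            return existing
        new_id = secrets.token_urlsafe(16)
        # Extremely unlikely, but retry if the random id is already taken.
        while new_id in self._fqn_by_id:
            new_id = secrets.token_urlsafe(16)
        self._id_by_fqn[fqn] = new_id
        self._fqn_by_id[new_id] = fqn
        return new_id

    def fqn_for(self, node_id: str) -> Optional[str]:
        """Return the FQN registered for node_id, or None if unknown."""
        return self._fqn_by_id.get(node_id)
```

The ids stay stable only as long as the table itself is stored, which is the main cost of this approach.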

Hash function

Feed the fqn to a hash function, and use the output as id.
Cryptographic hash functions make collisions so unlikely that the result can be treated as unique in practice.
Additionally, the ids typically are shorter than the fqn.
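
As a sketch of the hash approach: SHA-256 and the unpadded Base64url rendering of the digest are choices made for this example, not something this issue prescribes, and the function name `hashed_id` is hypothetical.

```python
import base64
import hashlib


def hashed_id(fqn: str) -> str:
    """Derive a stable id from an FQN via SHA-256 (an assumed choice of hash)."""
    digest = hashlib.sha256(fqn.encode("utf-8")).digest()
    # Render the 32-byte digest in the Base64url alphabet and drop the
    # '=' padding, which is not a valid id character.
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
```

The same FQN always yields the same 43-character id, so no storage is needed to keep ids stable. Going back from id to FQN is not possible without a lookup table.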

We can combine a mapping table with hashed fqns to achieve stable ids (without additional storage) and bi-directional lookup.

enikao · 2022-12-23T13:55:50Z

Namespace vs. id-space

We prefer the term id-space. People might be less tempted to use names as ids if we avoid the term "name" altogether in this context.

Also, namespaces are very often hierarchical, whereas our id-space is not.

enikao · 2022-12-23T14:00:16Z

Do we need to define separator char?

No. If required, each application can use its own representation.

Examples:

Use a data structure like a list of ids.
Use a character outside the valid character range for ids, as fitting for the application (e.g. slash (/) for REST services, dot (.) for programming languages).
Use a character inside the valid character range and introduce escaping (e.g. dash (-), and escape any dash that's part of the id with double-dash)
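
The second option can be sketched as follows, here with a slash as the application-chosen separator. `SEPARATOR`, `join_ids`, and `split_ids` are hypothetical names. Because the slash is outside the valid id character set, splitting is unambiguous and no escaping is needed.

```python
SEPARATOR = "/"  # not a valid id character, so it can never occur inside an id


def join_ids(ids: list) -> str:
    """Combine several ids (e.g. id-space id and node id) into one string."""
    return SEPARATOR.join(ids)


def split_ids(path: str) -> list:
    """Split a string produced by join_ids back into its ids."""
    return path.split(SEPARATOR)
```

For example, `join_ids(["repoA", "node-1"])` gives `repoA/node-1`, and `split_ids` recovers the two ids from it.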

enikao · 2022-12-23T14:31:00Z

Once we look into versioning/branching (#26), we need to amend this decision w.r.t. ids of nodes across branches.

joswarmer · 2023-01-16T13:41:07Z

If this is ready for closing, can we make the choices we are making explicit here?

enikao · 2023-01-16T13:44:09Z

> If this is ready for closing, can we make the choices we are making explicit here?

I updated the description of this issue, it should reflect the choices.

joswarmer · 2023-01-16T14:12:08Z

Ok, tnx, I was looking at the last comment to find the final choices.

enikao · 2023-01-20T08:52:10Z

Closing as accepted, because there's no objection.

ftomassetti · 2023-04-07T08:04:31Z

I would add that the ID must not be empty

enikao added the repo label Oct 28, 2022

enikao mentioned this issue Oct 28, 2022

Repo API: Bulk read/write #25

Closed

enikao mentioned this issue Nov 11, 2022

Repo API: Node serialization #37

Closed

enikao mentioned this issue Nov 28, 2022

Add id field to MetamodelElement and Metamodel #46

Closed

enikao mentioned this issue Jan 13, 2023

Ids for M3 Elements #53

Closed

enikao added the ready for closing label Jan 13, 2023

enikao closed this as completed Jan 20, 2023

enikao mentioned this issue Feb 2, 2023

We don't care about serialization verbosity #73

Closed

enikao mentioned this issue Feb 10, 2023

Node update: do we allow concept change? #69

Closed

enikao added the serialization label Feb 21, 2023

This was referenced Mar 6, 2023

Rename M3 property id -> key #90

Closed

Requirements on metamodel keys #91

Closed

Can Repositories have stricter requirements on node IDs than LIonWeb (e.g. only longs)? #70

Closed

enikao mentioned this issue Jul 1, 2023

adjusted M3 docs #140

Merged

enikao mentioned this issue Oct 27, 2023

Provide id mapping API #94

Open

enikao mentioned this issue Mar 8, 2024

A client must identify to repository with unique id #241

Closed

enikao mentioned this issue Aug 9, 2024

Each command has a unique id #305

Open
