
Data model

der edited this page Sep 24, 2015 · 1 revision

Intermediate data model

The API has an intermediate data model, which is then mapped to different source formats and different serializations.

The intermediate data model is a list of trees.

For an item endpoint there is a single entry in the list; for a list endpoint there can be arbitrarily many entries. Lists may be ordered by some sort operation.

Each tree is essentially JSON using JSON-LD conventions to map to RDF:

  • a URI resource has an @id field whose value is the URI
  • the JSON field name is a short name representing an RDF property; the short name is defined in a mapping file but defaults to the local name
  • string, boolean and number values are mapped to the corresponding JSON types
  • a lang-tagged string value may be mapped to a plain string or to an object {@value:v, @language:l}
  • multiple values are mapped to JSON arrays; a property can be marked as multi-valued to force use of an array even if there is only one value present
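The conventions above can be sketched in a few lines of Python. This is only an illustration, not the API's actual code: the function names, the `(text, lang)` tuple used for lang-tagged strings, and the example URIs are all assumptions made for the sketch.

```python
def local_name(uri):
    """Default short name: the part of the URI after the last '#' or '/'."""
    return uri.rstrip("/#").replace("#", "/").rsplit("/", 1)[-1]


def encode_value(value):
    """Map one value to JSON. A (text, lang) tuple is this sketch's
    stand-in for a lang-tagged string; everything else passes through."""
    if isinstance(value, tuple):
        text, lang = value
        return {"@value": text, "@language": lang}
    return value


def to_tree_node(uri, properties, short_names=None, multi_valued=frozenset()):
    """Build one tree node from {property URI: [values]}.
    Pass uri=None for a bNode, which then has no @id field."""
    short_names = short_names or {}
    node = {}
    if uri is not None:
        node["@id"] = uri
    for prop, values in properties.items():
        key = short_names.get(prop, local_name(prop))
        encoded = [encode_value(v) for v in values]
        # A single value stays a scalar unless the field is marked multi-valued.
        if len(encoded) == 1 and key not in multi_valued:
            node[key] = encoded[0]
        else:
            node[key] = encoded
    return node


# Hypothetical example data.
node = to_tree_node(
    "http://example.org/id/station/1",
    {
        "http://www.w3.org/2000/01/rdf-schema#label": [("Station one", "en")],
        "http://example.org/def/depth": [2.5],
        "http://example.org/def/measure": ["level"],
    },
    multi_valued={"measure"},
)
```

Here `label` becomes a `{@value, @language}` object, `depth` stays a plain number, and `measure` is an array despite holding one value because it is marked multi-valued.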

It is up to the API configuration to define (via the query templates) which properties and linked resources are included in the tree.

Note that this model does not support serialization of graphs of bNodes. The root or linked resources may be bNodes (in which case there is simply no @id field), so trees of bNodes can be present but any loops cannot be represented.

The short field names used in the tree form a set of keys that are used:

  • as the column names when serializing to CSV
  • as parameter names in the query for filtering or sorting in the API

Typically we try to give unique field names across the tree so there is no ambiguity of column names when the tree is flattened to CSV. This means that the same RDF property might have different short names in different contexts.
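To illustrate why unique field names matter, here is a minimal sketch of flattening trees into CSV rows. The flattening rules are assumptions made for the example, not documented behaviour: a nested resource's `@id` goes in the column named after its field, multiple values are joined with `|`, and a lang-tagged value is reduced to its text.

```python
import csv
import io


def cell(value):
    """Render one tree value as a CSV cell (rules assumed for this sketch)."""
    if isinstance(value, list):
        return "|".join(str(cell(v)) for v in value)
    if isinstance(value, dict):
        return value.get("@value", value.get("@id", ""))
    return value


def flatten(tree, row=None, id_column="@id"):
    """Collapse a nested tree into one flat row keyed by short field name."""
    row = {} if row is None else row
    for key, value in tree.items():
        if key == "@id":
            row[id_column] = value
        elif isinstance(value, dict) and "@value" not in value:
            # Nested resource: its fields become columns of the same row,
            # which is only unambiguous if names are unique across the tree.
            flatten(value, row, id_column=key)
        else:
            row[key] = cell(value)
    return row


def to_csv(trees):
    rows = [flatten(t) for t in trees]
    columns = list(dict.fromkeys(k for r in rows for k in r))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Hypothetical example tree.
tree = {
    "@id": "http://example.org/id/station/1",
    "label": "Station one",
    "river": {"@id": "http://example.org/id/river/7", "riverName": "Exe"},
    "measure": ["level", "flow"],
}
flat = flatten(tree)
```

If the nested river used `label` rather than `riverName`, its value would overwrite the station's `label` in the flat row, which is the ambiguity that distinct short names avoid.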

The mapping from the RDF model to the tree model is specified in JSON/YAML as a ViewMap. The ViewMap gives short names, the tree structure and annotations for type and presentation information where needed.

Processing pipeline

An item endpoint over RDF data is implemented as a DESCRIBE query. The resulting RDF graph is unwrapped to a tree in the intermediate model using either an explicit ViewMap or a default recursive unwrapping (which breaks cycles).
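A recursive unwrapping that breaks cycles might look like the sketch below. How the real implementation breaks cycles is not stated here; this version assumes that a resource already being expanded higher up the current path is emitted as a bare `@id` reference. The `Ref` marker class and the dictionary representation of the graph are inventions for the example, and property keys are assumed to be short names already.

```python
class Ref(str):
    """Marks a value as a reference to another resource, not a literal."""


def unwrap(graph, root, path=()):
    """Unwrap {subject: {field: [values]}} into a tree rooted at `root`."""
    node = {"@id": str(root)}
    if root in path:
        # Already being expanded further up: stop here to break the cycle.
        return node
    for field, values in graph.get(root, {}).items():
        items = [
            unwrap(graph, v, path + (root,)) if isinstance(v, Ref) else v
            for v in values
        ]
        node[field] = items[0] if len(items) == 1 else items
    return node


# Hypothetical graph with a loop: a knows b, b knows a.
graph = {
    "ex:a": {"name": ["A"], "knows": [Ref("ex:b")]},
    "ex:b": {"name": ["B"], "knows": [Ref("ex:a")]},
}
tree = unwrap(graph, "ex:a")
```

Because the check is made against the current path rather than a global visited set, a resource reachable by two separate branches is still expanded in both; only true loops are cut.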

A list endpoint over RDF is implemented as a SELECT query (which can be arbitrarily complex inside). Each field short name corresponds to a variable name returned from the SELECT query. Neighbouring rows of result bindings are coalesced to merge multiple values; this relies on the rows for one item being adjacent, either through explicit sorting or through an assumed ordering from the backend.
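The coalescing step can be sketched as follows. The result rows are modelled as plain dictionaries from variable name to value, and the `id` variable that identifies the item is an assumption of this sketch, not something the API prescribes.

```python
from itertools import groupby


def coalesce(rows, key="id"):
    """Merge adjacent result rows that share the same item identifier."""
    merged = []
    # groupby only groups neighbouring rows, so the input must already be
    # ordered so that all rows for one item sit together.
    for item_id, group in groupby(rows, key=lambda r: r[key]):
        values = {}
        for row in group:
            for field, value in row.items():
                if field == key or value is None:
                    continue
                seen = values.setdefault(field, [])
                if value not in seen:
                    seen.append(value)
        entry = {"@id": item_id}
        for field, seen in values.items():
            entry[field] = seen[0] if len(seen) == 1 else seen
        merged.append(entry)
    return merged


# Hypothetical bindings: the first station has two measures, hence two rows.
rows = [
    {"id": "ex:s1", "label": "Station one", "measure": "level"},
    {"id": "ex:s1", "label": "Station one", "measure": "flow"},
    {"id": "ex:s2", "label": "Station two", "measure": "level"},
]
entries = coalesce(rows)
```

If rows for the same item were not adjacent, this approach would emit that item more than once, which is why the ordering of the result set matters.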

A list endpoint over a document store assumes the documents are stored in a format that can be mapped to the intermediate representation.
