Table instance aliases and sharing in sourcedefs #846

Open
karlcz opened this issue Jun 18, 2020 · 2 comments
Labels
annotation Anything related the annotations (adding new one or changing existing one)

Comments

@karlcz
Contributor

karlcz commented Jun 18, 2020

It would be useful for the DBA to be able to define a set of sourcedefs with shared/common parts to their source path. When building an ermrest query, you would instantiate the common parts only once and reuse the same table instance aliases to construct the remaining (distinct) parts of each source that is being used in a recordset query.

I think there are two reasonable ways to do this:

  1. Add DBA-controlled instance aliases to the inbound/outbound clauses (following syntax in ermrest acl bindings)
  2. Add a new {"sourcekey": "key"} element that can be used as the first element in a source path

In the first approach, each sourcedef would have its own copy of the shared structure. In the second approach, the DBA would have to factor out the shared prefix part as another named sourcedef, then reference it from each of the sourcedefs that should overlap.

Either way, you have to implement some kind of idempotence to allow a named table instance or shared sourcedef to be added to a query at-most-once and then splice the remaining non-shared parts into the query using context-reset and/or explicit alias-qualifiers in the constructed ermrest query path.
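A minimal sketch of the at-most-once idea, under stated assumptions: each hop is a hypothetical `{"hop": ..., "alias": ...}` dict rather than a real sourcedef node, the main table instance is assumed to be called `M`, and the emitted `alias:=hop` and `$alias` segments only loosely imitate ermrest's alias binding and context reset. `combine_sources` is an illustrative name, not an existing function.

```python
def combine_sources(sources, root_alias="M"):
    """Merge several source paths, instantiating each aliased hop only once.

    Hypothetical input: each source is a list of {"hop": str, "alias": str?}.
    A hop whose alias was already emitted is treated as shared and skipped;
    the remaining hops are spliced in after a context reset to that alias.
    """
    segments = []            # emitted path segments, in order
    defined = {root_alias}   # aliases that already exist in the path
    cursor = root_alias      # table instance the emitted path currently ends at

    for source in sources:
        wanted = root_alias  # instance the next hop of this source starts from
        for node in source:
            alias = node.get("alias")
            if alias is not None and alias in defined:
                # Shared part: already in the query, just continue from it.
                wanted = alias
                continue
            if cursor != wanted:
                segments.append(f"${wanted}")  # context reset
            if alias is None:
                segments.append(node["hop"])
                # Anonymous instance: use a unique token so nothing matches it.
                cursor = wanted = ("anonymous", len(segments))
            else:
                segments.append(f"{alias}:={node['hop']}")
                defined.add(alias)
                cursor = wanted = alias
    return "/".join(segments)


shared = {"hop": "fk1", "alias": "A"}
path = combine_sources([
    [shared, {"hop": "fk2"}],
    [shared, {"hop": "fk3"}],
])
print(path)
```

The sketch does not check that two hops using the same alias actually describe the same join; a real implementation would presumably have to treat a mismatch as an error.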

@RFSH
Member

RFSH commented Nov 18, 2021

With the recent changes, we added support for the path prefix (the second method mentioned above). So in this comment I'm going to focus mainly on the addition of DBA-controlled aliases.

The following are the two new properties that we're going to have to support when defining a path in a source definition:

  • alias (right alias): is used to name a table instance. It can be added to the outbound or inbound nodes in the source definition.
  • context (left alias): is used to refer to any of the defined aliases (whether defined by the user or by ERMrestJS). This can be used in outbound or inbound nodes, as well as filter nodes (the syntax for filter nodes is a bit different).

In the case of filter nodes, we're going to have to allow the [ <leftalias>, <column> ] format, which signals that we want a column from the table instance named by <leftalias> (the context).
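To make the two accepted shapes of a filter column concrete, here is a small sketch that normalizes either a plain column name or a `[ <leftalias>, <column> ]` pair. The function name `parse_filter_column` and the `known_aliases` parameter are assumptions made for illustration; nothing here is taken from the ERMrestJS code.

```python
def parse_filter_column(value, known_aliases):
    """Return (leftalias, column) for a filter node's "filter" value.

    A plain string means "column of the current context" and yields
    (None, column). A two-element list means "column of the table instance
    named leftalias". Anything else is rejected.
    """
    if isinstance(value, str):
        return (None, value)
    if (
        isinstance(value, list)
        and len(value) == 2
        and all(isinstance(part, str) for part in value)
    ):
        leftalias, column = value
        if leftalias not in known_aliases:
            raise ValueError(f"unknown alias: {leftalias!r}")
        return (leftalias, column)
    raise ValueError(f"unsupported filter column: {value!r}")


aliases = {"o1"}
print(parse_filter_column("o2_c1", aliases))
print(parse_filter_column(["o1", "o1_c1"], aliases))
```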

The following is not finalized; it is more of a note that I wanted to capture. The content might not be complete or easily readable, and the purpose is just to have it in the issue so we can discuss it. I will keep updating its content.

Concerns/Things to discuss:

  • The only use case that comes to my mind is combining this feature with filter:

    • Example 1: Writing disjunctive filters based on columns of different table instances
      {
        "source": [
          {"outbound": ["schema", "const1"], "alias": "o1"},
          {"inbound": ["schema", "const2"]},
          {
            "or": [
              {"filter": ["o1", "o1_c1"], "operand": "v1"},
              {"filter": "o2_c1", "operand": "v2"}
            ]
          }
        ]
      }
    • Example 2: Filtering based on a column in another table and then resetting the path
      {
        "source": [
          {"inbound": ["schema", "const2"]},
          {"filter": "o2_c1", "operand": "v2"},
          {"context": "M", "outbound": ["schema", "const2"]},
          "RID"
        ]
      }
      • We might also want to allow the last element in the source definition to have context.
  • I'm not entirely sure what the other use cases for reusing table instances are. The advantage of sharing a path prefix was mainly when multiple sources with the same prefix are used in the same context (while parsing facets or generating the main request containing all the outbound paths). In that syntax, the two places that use the same prefix would just refer to the same sourcekey. But with the alias/context syntax, the definition must come before the usage, and given that data modelers don't have any control over the order in which these sources are used in the URL, I'm not sure how this would play out when we're combining multiple paths in one request.

  • Should ERMrestJS try to validate these paths? The current code goes over the syntax to make sure it defines a valid path, and also categorizes the paths.

  • When a source is defined, depending on its category, ERMrestJS might use that source in different ways and make some modifications:

    • For related entities, the source is reversed to produce the proper "Explore" link.
    • For a null facet, the source is reversed and used in a right outer join path.
    • For aggregates, the source is used to determine the path from main table to aggregate column and extra aliases are added to get the response.
    • For all outbound paths, an alias is added to the last foreign key in the path to be used in the projection list.
    • For pure and binary association, an alias is added to the association table to be able to properly get trs information for unlink.
    • For path prefix, the prefix is replaced either with the whole definition or just a table instance that is already defined.
    • For all requests, the "M" alias is used for the projected table. Therefore, in the request to get the facet values, where we create a URL like s:main_table/<other-facets>/<path-from-main-to-this-facet>, we use alias "T" for the <other-facets> part. I don't think we want to "leak" these implementation details and we should just allow them to refer to the existing "M"/"T" and then in the code we should just handle this (let them use something like "alias": "base" to just hide this "M" vs. "T").
    • What about the conflicts with the existing aliases?
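To make the validation and conflict questions above more concrete, here is a rough sketch of a checker that walks one source definition and reports aliases that are used before they are defined or that collide with an existing alias. It assumes, purely for illustration, that "M" and "T" are reserved and may be referenced directly (whether they should be exposed is one of the open questions), and that filters nest under "and"/"or" keys as in Example 1. `validate_source` is a hypothetical name, not an ERMrestJS function.

```python
def _filter_aliases(node):
    """Yield left aliases referenced by a filter node, including nested and/or."""
    for key in ("and", "or"):
        for child in node.get(key, []):
            yield from _filter_aliases(child)
    column = node.get("filter")
    if isinstance(column, list) and len(column) == 2:
        yield column[0]


def validate_source(source, reserved=frozenset({"M", "T"})):
    """Return a list of human-readable problems found in one source path."""
    problems = []
    defined = set()
    for index, node in enumerate(source):
        if not isinstance(node, dict):
            continue  # plain column name such as "RID"
        used = list(_filter_aliases(node))
        if "context" in node:
            used.append(node["context"])
        for name in used:
            if name not in defined and name not in reserved:
                problems.append(
                    f"node {index}: alias {name!r} used before definition"
                )
        alias = node.get("alias")
        if alias is not None:
            if alias in reserved or alias in defined:
                problems.append(
                    f"node {index}: alias {alias!r} conflicts with an existing alias"
                )
            else:
                defined.add(alias)
    return problems


example_2 = [
    {"inbound": ["schema", "const2"]},
    {"filter": "o2_c1", "operand": "v2"},
    {"context": "M", "outbound": ["schema", "const2"]},
    "RID",
]
print(validate_source(example_2))
```

This only checks a single source in isolation; it says nothing about the harder case raised above, where several sources are combined into one request in an order the data modeler does not control.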

@karlcz
Contributor Author

karlcz commented Nov 18, 2021 via email

@RFSH RFSH added annotation Anything related the annotations (adding new one or changing existing one) and removed enhancement labels Apr 5, 2022