
Policies and orchestration (of decentralized networks and inboxes) #11

Dexagod opened this issue Feb 9, 2021 · 1 comment


Dexagod commented Feb 9, 2021

Orchestration and policies

The goal of using decentralized orchestrators and policies is to support orchestration of the functions of scholarly communication in decentralized networks, while putting both institutions and researchers in control of the information flow.
The orchestrator component itself is envisioned as a "dumb component".
It has the responsibility of retrieving and processing a set of policies, and executing the relevant policies for received input.
(It can also be made responsible for smaller tasks, such as logging events, depending on the network requirements.)

In this use case, it is very similar to an inbox processing agent.
That agent is also responsible for executing a set of policies for received inputs, which in this case are inbox filters.
These filters are very similar to the policies we currently envision for the decentralized network (both in the Mellon project and in the Netwerk Digitaal Erfgoed project).

At its core, a policy defines a set of actions that must be executed if a given condition is met.

  • In the case of scholarly artifacts, a policy can define that, for a newly created artifact of a specific type, the registration service must be called first, after which the artifact can be sent to the awareness and archiving services.
  • In the case of an inbox agent, a policy can define that, for all notifications from a specific sender, a given function should be executed.

In both cases, a policy defines a set of actions to be executed if a specific condition is met for the input.
A policy consists of two parts:

  • The condition - the requirement an input has to fulfill for the policy actions to be executed.
  • The actions - a set of one or more actions to be executed for the input.
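
A minimal sketch of this two-part structure could look as follows (the type and function names are illustrative assumptions, not an agreed vocabulary):

```typescript
// Illustrative sketch only: names and shapes are assumptions, not an agreed vocabulary.

// The condition an input has to fulfill before the actions fire. In a Linked Data
// setting this would typically be a shape (ShEx/SHACL) match rather than a predicate.
type PolicyCondition<I> = (input: I) => boolean | Promise<boolean>;

// A single action to run against the input (e.g. "call the registration service").
type PolicyAction<I> = (input: I) => Promise<void>;

// A policy couples one condition to one or more actions.
interface Policy<I> {
  condition: PolicyCondition<I>;
  actions: PolicyAction<I>[];
}

// The "dumb" orchestrator: run every policy whose condition matches the received input.
async function executePolicies<I>(policies: Policy<I>[], input: I): Promise<void> {
  for (const policy of policies) {
    if (await policy.condition(input)) {
      for (const action of policy.actions) {
        await action(input);
      }
    }
  }
}
```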

Condition

The policy condition must be flexible enough that many different use cases and kinds of input can trigger the policy actions.
An obvious choice for this in a Linked Data context is data shapes.
Shape languages such as ShEx or SHACL can be used to match specific inputs, and to trigger an action when the input matches the policy shape.
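
As a rough sketch of how a SHACL shape could serve as a policy condition, the snippet below uses the rdf-validate-shacl library; the choice of library and the exact setup are assumptions and may need adjusting:

```typescript
// Assumption: the shapes graph and the input are available as RDF/JS datasets,
// and rdf-validate-shacl is used for validation (API details may differ per version).
import rdf from 'rdf-ext';
import SHACLValidator from 'rdf-validate-shacl';
import type { DatasetCore } from '@rdfjs/types';

// Build a policy condition that matches an input dataset against a SHACL shapes graph.
function shaclCondition(shapes: DatasetCore) {
  const validator = new SHACLValidator(shapes, { factory: rdf });
  return async (input: DatasetCore): Promise<boolean> => {
    const report = await validator.validate(input);
    // The policy actions fire only when the input conforms to the shape.
    return report.conforms;
  };
}
```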

Actions

The policy actions are a very broad concept.
E.g. in the case of Gmail, policies (filters) can be set with regexes on the header, subject, sender, ... fields as the condition.
The actions are then chosen from a predefined set of rules (mark as read, flag as X, move to folder).

The actions to be executed will differ greatly for different use cases:

  • In the case of the notification agent, this problem was mitigated by returning an iterator of all notifications that match the given condition (which can be thought of as a policy condition), leaving the developer responsible for creating the actions for that condition.
  • In the case of the Mellon decentralized network, the current actions mainly require calling the different actors in the network for a given input (e.g. call the registration service for an artifact; call the researcher pod / orchestrator and the awareness service if a peer review is submitted on a registered artifact, ...)

Either these actions should be "hardcoded" according to certain agreements in the network, or an abstraction layer could be added.
A possible abstraction is that all actions in the network are provided by a call to a service (e.g. registration of an artifact is handled by calling the registration service with the artifact as input; filtering emails into a new folder can be handled by a mock "email moving" service provided on localhost).
This approach makes it possible to dynamically add functionality to the network and to incorporate that functionality in policies.
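
As a sketch of this service-call abstraction (the endpoint URLs below are invented, and the plain HTTP POST is only an assumption about what such a network agreement could look like), an action could simply wrap a configured service endpoint:

```typescript
// Sketch only: URLs are hypothetical and the POST-the-input convention is an assumption.

// A service-backed action: performing the action means POSTing the input to the service.
function serviceAction(endpoint: string) {
  return async (input: unknown): Promise<void> => {
    await fetch(endpoint, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(input),
    });
  };
}

// New functionality is added to the network by exposing a new service endpoint
// and referencing it from a policy, without changing the orchestrator itself.
const registerArtifact = serviceAction('https://registration.example/inbox'); // hypothetical URL
const moveEmail = serviceAction('http://localhost:3000/move-email');          // mock local service
```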

Further research on the topic

Further research on this topic mainly points toward even more decentralized orchestration, using service invocation triggers, where the triggers can be stored directly with the relevant services.

These approaches stray further from the core architecture that is currently envisioned, and further decentralizing the orchestration makes it harder to retrieve the necessary policies from both the relevant researchers and institutions.
Additionally, for scholarly communication the steps of the scholarly functions are not very time critical.


phochste commented Mar 25, 2021

What are the cases for making the Orchestration (O) component "dumb" or "smart"? In the issues here there seems to be a suggestion that there are two types of components:

  1. The Orchestrator component that maintains a list of policies that a Scholarly Dashboard (D) should execute (contact the Registration Hub, contact the Awareness hub etc ...)
  2. The Inbox component that filters the inboxes of the POD and automates some actions.

In the Mellon Proposal document, the Orchestrator component seems to be an active/smart actor that combines a bit of both functions, dumb orchestration and smart inbox execution (see the sketch after this list):

  • It reads the Inboxes of the researcher pods
  • It sends out notifications to Registration, Awareness, etc. services
  • It retrieves at a later date responses from these services and forwards them to the researcher pod (or not)
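
To make the contrast concrete, such a "smart" orchestrator could run a loop along these lines (an entirely hypothetical sketch; the notification shape, the hub interface and the polling model are assumptions, not something specified in the proposal):

```typescript
// Hypothetical sketch of a "smart" orchestrator pass; all names are invented.
interface Notification { id: string; sender: string; body: unknown; }
interface ServiceHub { name: string; send(n: Notification): Promise<void>; }

interface OrchestratorDeps {
  readInbox(inboxUrl: string): Promise<Notification[]>; // read the researcher pod's inbox
  selectHubs(n: Notification): ServiceHub[];            // pick Registration/Awareness/... hubs
}

// One pass: read the inbox and decide, on behalf of the researcher,
// which service hubs to contact for each notification.
async function smartOrchestratorTick(inboxUrl: string, deps: OrchestratorDeps): Promise<void> {
  const notifications = await deps.readInbox(inboxUrl);
  for (const n of notifications) {
    for (const hub of deps.selectHubs(n)) {
      await hub.send(n);
    }
  }
  // Responses from the hubs would arrive later as new notifications, to be
  // filtered and forwarded to the researcher pod (or not) in a later pass.
}
```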

There are also cases in the Mellon Proposal document where it is not clear if the Orchestrator acts on behalf of the researcher or not:

  • In the use case on page 20 of the Mellon proposal, the Orchestrator is the component that acts as the gateway to the researcher pod and probably requires read/write access, not only to update the inbox but also to add (technical, provenance) metadata to the artifacts in the researcher pod.
  • In the use case on page 22, it seems that the Awareness service has direct access to Alice's research pod, and something needs to be done there to process the data in her inbox (and update the research pod metadata).

Cases can be made for creating a "dumb" or a "smart" orchestration component. I don't find any reference to why it has to be a "dumb" component.

This also needs to be seen in light of the fact that there can be very many service hubs to talk to. E.g. an Awareness service can be a search engine of the university or the faculty, a subject database, or a Twitter channel. One doesn't want all of these services to ping back to Alice's pod about the fact that Bob created a comment on her publication.
