Determining query containment for the registered queries to improve the scalability of the solid stream aggregator.

## Pitch

This challenge extends challenge [84](https://github.com/SolidLabResearch/Challenges/issues/84) and is part of scenario [16](https://github.com/SolidLabResearch/Challenges/issues/16). The [solid stream aggregator](https://github.com/argahsuknesib/solid-stream-aggregator) enables a query agent to maintain a continuous view of the stream stored in the solid pod by registering a query. In the scenario, multiple query clients can request a continuous view of the stream. A naive approach would be to execute every registered query independently. However, this approach does not scale. Because the queries processed by the aggregator are similar, though not identical, queries over common data, it is vital to find the overlap between them and execute only the unique queries to improve the scalability of the aggregator. We will use the [DAHCC dataset](https://dahcc.idlab.ugent.be/dataset.html) and the solid stream aggregator to test the query containment algorithm.

## Desired solution

The desired solution is a query containment algorithm that determines which queries are contained in other, already registered queries. The algorithm should be able to determine containment for queries registered in the [RSP-QL](https://www.igi-global.com/article/rsp-ql-semantics/129761) syntax. RSP-QL syntax can be simplified to SPARQL syntax by removing the constructs required for stream-based queries, such as window, step, and range. Therefore, the query containment algorithm should also work with SPARQL queries. The developed algorithm should assist in managing multiple views in the solid project.
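The simplification from RSP-QL to SPARQL described above can be sketched as follows. This is a minimal, regex-based illustration and not the aggregator's implementation: the function name `rspql_to_sparql`, the example query, and the concrete clause layout (`REGISTER ... AS`, `FROM NAMED WINDOW ... [RANGE ... STEP ...]`, `WINDOW ... { ... }`) are assumptions, and window blocks with nested braces are not handled. A real implementation would use a proper parser.

```python
import re


def rspql_to_sparql(query: str) -> str:
    """Drop stream-specific clauses from an RSP-QL query (hypothetical helper)."""
    # Remove the leading "REGISTER <operator> <name> AS" clause.
    q = re.sub(r"REGISTER\s+\w+\s+<[^>]*>\s+AS\s+", "", query, flags=re.IGNORECASE)
    # Remove window declarations together with their [RANGE ... STEP ...] parameters.
    q = re.sub(
        r"FROM\s+NAMED\s+WINDOW\s+\S+\s+ON\s+\S+\s*\[[^\]]*\]\s*",
        "",
        q,
        flags=re.IGNORECASE,
    )
    # Unwrap "WINDOW <name> { ... }" blocks, keeping the patterns inside them.
    q = re.sub(r"WINDOW\s+\S+\s*\{([^{}]*)\}", r"\1", q, flags=re.IGNORECASE)
    # Collapse the whitespace left behind by the removed clauses.
    return " ".join(q.split())


# Example query; the prefix, stream IRI and window name are made up.
rspql = """
REGISTER RStream <output> AS
PREFIX saref: <https://saref.etsi.org/core/>
SELECT ?v
FROM NAMED WINDOW :w1 ON <http://example.org/stream> [RANGE 10 STEP 2]
WHERE { WINDOW :w1 { ?s saref:hasValue ?v . } }
"""

sparql = rspql_to_sparql(rspql)
print(sparql)
```

Two RSP-QL queries whose simplified SPARQL forms are in a containment relation may still differ in their window parameters, so the window, range, and step values would have to be compared separately.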

## Acceptance criteria

The developed query containment algorithm is employed in the query registry of the solid stream aggregator to determine whether a query newly registered by a query agent is contained in a query that is already registered or executed in the query registry.
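As a rough illustration of this criterion, a registry could consult a containment check before scheduling a new query. The class `QueryRegistry`, its `register` method, and the placeholder check `same_modulo_whitespace` below are hypothetical names for this sketch and are not taken from the solid stream aggregator. The placeholder only recognises queries that are identical up to whitespace; an actual containment algorithm would be passed in its place.

```python
from typing import Callable, Dict, Tuple


class QueryRegistry:
    """Hypothetical registry that reuses a registered query when a new one is contained in it."""

    def __init__(self, is_contained: Callable[[str, str], bool]) -> None:
        # is_contained(new, existing) should return True when `new` is contained in `existing`.
        self._is_contained = is_contained
        self._queries: Dict[int, str] = {}

    def register(self, query: str) -> Tuple[int, bool]:
        """Return (query id, newly_scheduled)."""
        for query_id, existing in self._queries.items():
            if self._is_contained(query, existing):
                # The existing query already covers the new one, so nothing new is executed.
                return query_id, False
        query_id = len(self._queries)
        self._queries[query_id] = query
        return query_id, True


def same_modulo_whitespace(new_query: str, existing_query: str) -> bool:
    """Placeholder check: a query is trivially contained in itself."""
    return " ".join(new_query.split()) == " ".join(existing_query.split())


registry = QueryRegistry(same_modulo_whitespace)
first = registry.register("SELECT ?v WHERE { ?s ?p ?v }")
second = registry.register("SELECT ?v  WHERE {  ?s ?p ?v  }")
third = registry.register("SELECT ?s WHERE { ?s ?p ?v }")
print(first, second, third)
```

When a query is contained in, but not equivalent to, a registered query, its answers are only a subset of the registered query's answers, so the aggregator would presumably still need to filter the shared results for that client.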

## Pointers

As aggregation is still a novel research topic, a number of assumptions were made:

- [Long term server-side authenticated sessions](https://github.com/SolidLabResearch/Challenges/issues/13) have been solved and therefore the authentication part of this challenge is not taken into account.
- The containment problem is undecidable over the full SPARQL syntax. Therefore, only a part of the SPARQL syntax is considered.
- The registered queries are either in RSP-QL syntax or are SPARQL SELECT queries.
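One fragment for which containment is decidable is SELECT queries whose WHERE clause is a single basic graph pattern (a conjunction of triple patterns). Under set semantics, a query Q1 is contained in Q2 exactly when there is a homomorphism from Q2's patterns into Q1's patterns that keeps the projected variables fixed. The sketch below illustrates that check; it is not the algorithm chosen for this challenge. The representation of a query as a list of projected variables plus a list of string triples, the function names, and the `saref:`/`rdf:` terms in the example are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Triple = Tuple[str, str, str]


def is_var(term: str) -> bool:
    return term.startswith("?")


def _unify(pattern: Triple, target: Triple, mapping: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Try to map `pattern` onto `target`, extending `mapping`; None on conflict."""
    extended = dict(mapping)
    for p, t in zip(pattern, target):
        if is_var(p):
            if extended.setdefault(p, t) != t:
                return None  # variable already mapped to a different term
        elif p != t:
            return None  # constants must match exactly
    return extended


def _extend(remaining: List[Triple], targets: Sequence[Triple], mapping: Dict[str, str]) -> bool:
    """Backtracking search for a homomorphism covering all remaining patterns."""
    if not remaining:
        return True
    first, rest = remaining[0], remaining[1:]
    for target in targets:
        extended = _unify(first, target, mapping)
        if extended is not None and _extend(rest, targets, extended):
            return True
    return False


def is_contained(sub_select: Sequence[str], sub_bgp: Sequence[Triple],
                 sup_select: Sequence[str], sup_bgp: Sequence[Triple]) -> bool:
    """True if the `sub` query is contained in the `sup` query (set semantics assumed)."""
    if set(sub_select) != set(sup_select):
        return False
    # Projected variables must map to themselves so that the answers line up.
    start = {v: v for v in sup_select}
    return _extend(list(sup_bgp), sub_bgp, start)


# Example: the specific query asks only for sensor values, the general one for any value.
specific = (["?v"], [("?s", "rdf:type", "saref:Sensor"), ("?s", "saref:hasValue", "?v")])
general = (["?v"], [("?x", "saref:hasValue", "?v")])

print(is_contained(*specific, *general))  # specific contained in general
print(is_contained(*general, *specific))  # general contained in specific
```

The backtracking search is exponential in the worst case, which matches the NP-completeness of conjunctive query containment, but registered queries are typically small. Constructs such as OPTIONAL, FILTER, and aggregation fall outside this sketch.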

## Scenarios

The challenge is part of a larger scenario on an aggregated view on sensitive personal health data streams. The scenario is described in [issue 16](https://github.com/SolidLabResearch/Challenges/issues/16).

