Intermediate result sharing among data streams for aggregation #104

Open
argahsuknesib opened this issue Apr 20, 2023 · 1 comment
Labels
challenge technical problem applied to a use case proposal: pending ❓

Comments


argahsuknesib commented Apr 20, 2023

Pitch

This challenge is an extension of the challenge on query containment. Once the query containment challenge is completed, we will have an algorithm to determine whether a newly registered query is contained in an already registered query. To improve the scalability of the solid stream aggregator, it is crucial to share resources between the streams. The queries registered over the data are similar but not identical.

Desired solution

The desired solution is an algorithm or approach that exploits the similarities between queries over multiple streams. In streaming scenarios, the data stream is chopped up into windows for processing. Therefore, the data that two queries have in common depends on the window sizes of the two queries. The approach should be able to share resources in the following scenarios:

| Window | Queries |
| --- | --- |
| Same | Different |
| Different | Same |
| Different | Different |
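
To make the "different window, same query" scenario concrete, here is a minimal sketch of one possible sharing strategy: pane-based partial aggregation. This is an assumption for illustration, not the approach chosen for the solid stream aggregator. The function names `pane_partials` and `window_average`, the `(timestamp, value)` event shape, and the use of an average as the shared aggregate are all hypothetical. The idea is to cut the stream into panes whose width divides every window size, aggregate each pane once, and let every query combine the panes that fall inside its own window.

```python
from collections import defaultdict
from math import gcd


def pane_partials(events, pane_width):
    """Group (timestamp, value) events into fixed-width panes.

    Returns a dict mapping pane index -> [sum, count]. These partial
    aggregates are computed once and shared by every query.
    """
    panes = defaultdict(lambda: [0.0, 0])
    for ts, value in events:
        pane = ts // pane_width
        panes[pane][0] += value
        panes[pane][1] += 1
    return panes


def window_average(panes, pane_width, start, width):
    """Average over the window [start, start + width), built from shared panes.

    Assumes start and width are multiples of pane_width.
    Returns None when the window contains no events.
    """
    first = start // pane_width
    last = (start + width) // pane_width
    total = sum(panes[p][0] for p in range(first, last) if p in panes)
    count = sum(panes[p][1] for p in range(first, last) if p in panes)
    return total / count if count else None


# Hypothetical sensor readings as (timestamp in seconds, value).
events = [(0, 1.0), (5, 3.0), (12, 5.0), (18, 7.0), (25, 9.0)]

# Two queries with the same aggregate but window sizes of 10 s and 30 s.
pane_width = gcd(10, 30)
panes = pane_partials(events, pane_width)

short_window = window_average(panes, pane_width, 0, 10)  # uses pane 0 only
long_window = window_average(panes, pane_width, 0, 30)   # reuses panes 0, 1, 2
```

The "same window, different queries" scenario is simpler under the same assumption: the window content (or its panes) is materialised once and each query applies its own aggregate to it. Whether panes are a good fit depends on the aggregate; sums, counts and averages combine cleanly, while aggregates such as medians do not.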

Acceptance criteria

The acceptance criterion for this challenge is to implement resource sharing between the streams in the solid stream aggregator and to show the improvement in query execution time by comparing the execution time of the queries with and without resource sharing. The data set used for the evaluation is the DAHCC dataset.
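
The comparison described above could be set up along the following lines. This is only a sketch under stated assumptions: it uses synthetic values rather than the DAHCC dataset, the helpers `time_call`, `without_sharing` and `with_sharing` are hypothetical, and the shared structure is a simple prefix sum standing in for whatever sharing mechanism is eventually implemented. It checks that both variants return the same answers before any timings are compared.

```python
import time


def time_call(fn, *args, repeats=5):
    """Run fn(*args) several times; return (best wall-clock seconds, result)."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = fn(*args)
        best = min(best, time.perf_counter() - start)
    return best, result


def without_sharing(values, widths):
    """Each query re-scans its own window from scratch."""
    return {w: sum(values[:w]) / w for w in widths}


def with_sharing(values, widths):
    """One prefix sum is built once and shared by every window width."""
    prefix = [0.0]
    for v in values:
        prefix.append(prefix[-1] + v)
    return {w: prefix[w] / w for w in widths}


# Synthetic stand-in data; every width must not exceed len(values).
values = [float(i) for i in range(1000)]
widths = [10, 100, 1000]

baseline_seconds, baseline = time_call(without_sharing, values, widths)
shared_seconds, shared = time_call(with_sharing, values, widths)

# Sharing must not change the answers; only then are timings comparable.
same_results = baseline == shared
```

Taking the best of several runs reduces noise from the scheduler, but on data this small the timings say little; a meaningful evaluation would need realistic stream volumes and many concurrent queries.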

Pointers

Because aggregation is still a novel research topic, a number of assumptions were made:

Scenarios

The challenge is part of a larger scenario on an aggregated view on sensitive personal health data streams. The scenario is described in issue 16.

@argahsuknesib argahsuknesib added challenge technical problem applied to a use case proposal: pending ❓ labels Apr 20, 2023
@pheyvaer pheyvaer assigned pbonte and pheyvaer and unassigned RubenVerborgh, pbonte and pheyvaer Apr 20, 2023
@pheyvaer
Contributor

@pbonte Once you are done with the review of the challenge, can you assign it to me? Thanks!
