Question on storage for processing #51

Closed
greenTara opened this issue Feb 18, 2016 · 2 comments

@greenTara
Collaborator

Currently the document says "RSPs should process streams of data actively and in-stream, without the need of storing them". However, there are some reasonable operations that would require partial storage, just not storage of the entire past history. For example, to determine if two streams are isomorphic, some storage would be required due to the possibility of different sequential ordering in the serialization.
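To illustrate the kind of partial storage meant here, the following is a minimal sketch, not taken from the RSP documents. It assumes each stream element is a hypothetical `(timestamp, triple)` pair, that timestamps arrive in non-decreasing order, and that the triples contain no blank nodes, so comparing two graphs reduces to set equality. The function names are made up for the example. Only the triples of the timestamp currently being read are buffered, never the whole history.

```python
from itertools import zip_longest


def graphs_by_timestamp(stream):
    """Yield (timestamp, frozenset_of_triples), buffering one timestamp at a time.

    Assumption: timestamps arrive in non-decreasing order.
    """
    current_ts = None
    buffer = set()
    for ts, triple in stream:
        if current_ts is not None and ts != current_ts:
            # The previous timestamp is complete: emit it and drop its buffer.
            yield current_ts, frozenset(buffer)
            buffer = set()
        current_ts = ts
        buffer.add(triple)
    if current_ts is not None:
        yield current_ts, frozenset(buffer)


def streams_match(stream_a, stream_b):
    """True if both streams carry the same triples per timestamp, in any order."""
    pairs = zip_longest(graphs_by_timestamp(stream_a), graphs_by_timestamp(stream_b))
    # zip_longest pads the shorter stream with None, which never equals a graph.
    return all(a == b for a, b in pairs)


a = [(1, ("s", "p", "o1")), (1, ("s", "p", "o2")), (2, ("s", "p", "o3"))]
b = [(1, ("s", "p", "o2")), (1, ("s", "p", "o1")), (2, ("s", "p", "o3"))]
same = streams_match(a, b)
```

With blank nodes, the per-timestamp comparison would need a real graph isomorphism check instead of set equality, but the buffering requirement stays the same.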

@jpcik jpcik self-assigned this Feb 19, 2016
@kiat
Collaborator

kiat commented Feb 19, 2016

I think this should be removed from the documents.

The Syntax and Semantics documents should only describe what an RSP is, not how a processing system has to process it. If someone wants to process it using a DB approach, she can do so, just as she can use a stream processing system based on finite state machines or any kind of rule engine. In general, we do not need to specify how it should be processed. I think this sentence is in the document for historical reasons.

In general, data stream processing systems separate a small fraction (a bounded data set) from the unbounded stream and store it in main memory for processing, because the semantics of the query specify the dataset that pattern matching should be applied to. We need this kind of storage anyhow to be able to do windowing, but this does not mean storing the whole stream, for example in a database.

But if someone has a high-performance main-memory database and can store and index data in real time, our specification of RDF streams should not have anything against it.
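As a rough illustration of the bounded storage that windowing needs, here is a minimal sketch of a time-based sliding window. It is only one possible approach and is not prescribed by any RSP document. The class name `SlidingWindow`, the `width` parameter, and the `(timestamp, triple)` element shape are assumptions made for the example.

```python
from collections import deque


class SlidingWindow:
    """Keep only the stream elements from the last `width` time units (hypothetical helper)."""

    def __init__(self, width):
        self.width = width
        self.items = deque()  # (timestamp, triple) pairs, oldest first

    def push(self, timestamp, triple):
        """Add one element, evict expired ones, and return the current window content."""
        self.items.append((timestamp, triple))
        # Evict everything that has fallen out of the window.
        while self.items and self.items[0][0] <= timestamp - self.width:
            self.items.popleft()
        return [t for _, t in self.items]


window = SlidingWindow(width=10)
first = window.push(0, "a")
second = window.push(5, "b")
third = window.push(12, "c")  # "a" (timestamp 0) is older than 12 - 10 and is evicted
```

Memory use is bounded by the number of elements that fall inside one window, regardless of how long the stream runs.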

@beta2k
Contributor

beta2k commented Feb 22, 2016

+1 for not arguing how to process streams in our documents. We could give "recommendations" or "best practices", but this should also not be our focus.

jpcik added a commit that referenced this issue Apr 1, 2016
jpcik added a commit that referenced this issue May 6, 2016
@jpcik jpcik closed this as completed May 6, 2016

4 participants