Example_of_RSP-QL_query

This query continuously look for bars where people are falling in love like Paolo and Francesca in Dante's Divine Comedy because of a book by Gallehault.

The query checks :

    over the default graph containing the points of interest (POIs) of http://somesocialnetwork.org/ that the POI is a bar.
    over the entire stream from http://someinvasivesensornetwork.org, that pairs of people entered in the poi in different moments within 4 hours.
    over the same stream, with a long lasting time window of 1 hour, that pairs of those people have been staying close by for at least 30 minutes. Note: that this may require some resoning being the property isCloseTo symmetric.
    over the same stream but with a short time window of 10 minutes, that the same pairs exit together.

As output, for each bar, it streams out an RDF graph with the list of pairs and the total number of pairs that felt in love.

Note that this example query covers features of C-SPARQL, CQELS, SPARQL-Stream, EP-SPARQL as well as new features missing in all RSP languages:

    From C-SPARQL it takes the REGISTER clause, the FROM STREAM clause as dataset clause, the AT clause to access the timestamp (in C-SPARQL, AT is implemented with the timestamp() function) and the aggregates (which are computed in parallel without shrinking the result set, but extending it).
    From CQELS it takes the idea of the STREAM keyword in the WHERE clause.
    From SPARQL-Stream it takes the ISTREAM clasue that ask the RSP engine to use the R2S operator.

    From EP-SPARQL, it takes the SEQ and the WITH DURATION clauses (in EP-SPARQL, WITHIN DURATION is implemented with the getDuration() function).

-->if we introduces new keyword SEQ for expressing order constraint, why don't we introduce keywords for 13 Allen's interval operator? for instance, BEFORE/MEETS/OVERLAPS/STARTS/DURING/FINISHES/EQUAL + inverse version of them (practically, it's the same effort for parsing)
# PW: good point. is there anything which prevents us from introducing these keywords? However, for me it is still unclear on how these interval (!) operators are applied in our language. For this we would need events, and I'm not sure on how we deal with this. As I see in our google doc, we have already put some thought on timestamps and/or intervals, so I am sorry, if my questions are already solved.

The new features are:

    the usage of an IRI to identify the query (and its stream of results)
    the optional UNDER ENTAILMENT REGIME clause # PW: what is the main motivation for making the entailment regime explicit? basically, i like it, but am not sure, if it would/could lead to conflicts.
    the FROM NAMED STREAM <<stream iri>> <<window>> AS << window name>> clause in the dataset declaration
    the WINDOW keyword in the WHERE clause
    --> is there any shorter way to express this? for instance, just STREAM keyword inside WHERE?
	# PW: if we would do it like this, i.e., refer to windows via the STREAM keyword inside WHERE, this would imply that the output of a WINDOW is again a STREAM, which I am not sure about. I think introducing the WINDOW keyword is good for readibility and prevents confusion. If it's syntactically really necessary should be discussed.

PREFIX e: <http://somevocabulary.org/> 
PREFIX s: <http://someinvasivesensornetwork.org/streams#>
PREFIX g: <http://somesocialnetwork.org/graphs#>
PREFIX : <http://acrasycompany.org/rsp#>
REGISTER STREAM :GallehaultWasTheBar 
UNDER ENTAILMENT REGIME <http://www.w3.org/ns/entailment/RIF>
AS
FROM NAMED STREAM s:1
FROM NAMED STREAM s:1 [RANGE PT1H STEP PT5M] AS :longWindow # PW: sorry, if it's obvious, but what does PT mean?
FROM NAMED STREAM s:1 [RANGE PT10M STEP PT5M] AS :shortWindow
FROM NAMED GRAPH g:SocialGraph
FROM GRAPH g:POIs
CONSTRUCT ISTREAM { 
 ?poi rdf:type :Gallehault ; 
      :count ?howmanycouples ;
      :for (?somebody ?someoneelse)   		# I cannot understand this one clearly				 
} 
WHERE {
 ?poi rdf:type e:bar . # PW: this matches only triples in the default graph, i.e. g:POIs, right?
 STREAM s:1 { # PW: as this runs over the entire stream, I ask myself: Will this scale for huge streams? What is the definition of "entire stream"? Or should we not care about this issue, since we "just" define the language?
     { ?somebody e:enters ?poi } 
     SEQ 
     { ?someoneelse e:enters ?poi } WITHIN PT4H
 }
 WINDOW :longWindow {
     { ?somebody e:isCloseTo ?someoneelse # should isCloseTo be a triple with one timestamp, or is it an event with start-time and end-time? I ask, because I was wondering how the duration (PT30M) can be calculated? Then again, this might be an issue for someone implementing an engine and is of no concern when defining the language. 
       MINUS { ?somebody e:isCloseTo ?yetanotherone . FILTER (?yetanotherone != ?someoneelse) }
     } WITH DURATION ?longtime 
     FILTER (?longtime>"PT30M"^^xsd:duration)
 }
 WINDOW :shortWindow {
     { ?somebody e:exits ?bar} AT ?t1
     { ?someoneelse e:exits ?bar } AT ?t2 
     FILTER (abs(?t2-?t1)<"PT1M"^^xsd:duration )
 }
 GRAPH g:1 { # PW: g:1 is not defined anywhere, right? I think this should be g:SocialGraph
     FILTER NOT EXIST { ?somebody e:knows ?someoneelse }
 }
 FILTER (?somebody != ?someoneelse)
}
AGGREGATE {
 GROUP BY ?bar 
 COUNT(?somebody) AS ?howmanycouples 
}