Simplified protocol for using SPARQL as patch language #104

Open
kjetilk opened this issue Jan 14, 2020 · 11 comments

@kjetilk

kjetilk commented Jan 14, 2020

Why?

To edit RDF sources, it would be nice to have a simple protocol to modify a resource without setting up a full SPARQL Endpoint.

Previous work

Solid uses SPARQL (Update) as a language for patches: it takes a SPARQL Update request as the body of a PATCH HTTP request and runs it against the resource identified by the request-URI.
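
As a rough illustration of the request shape described above, here is a minimal Python sketch that builds (but does not send) such a PATCH. The resource URL and the update body are made up for the example, and the use of the `application/sparql-update` media type is an assumption based on the SPARQL 1.1 Update media type, not something stated in this thread.

```python
import urllib.request

# Hypothetical resource URL and patch body, for illustration only.
RESOURCE = "http://a.example/profile"
UPDATE = (
    "INSERT DATA { <http://a.example/profile#me> "
    '<http://xmlns.com/foaf/0.1/name> "Alice" . }'
)


def build_patch_request(resource_url: str, sparql_update: str) -> urllib.request.Request:
    """Build (but do not send) an HTTP PATCH carrying a SPARQL Update body."""
    return urllib.request.Request(
        resource_url,
        data=sparql_update.encode("utf-8"),
        method="PATCH",
        # Assumed media type: the one registered for SPARQL 1.1 Update.
        headers={"Content-Type": "application/sparql-update"},
    )


request = build_patch_request(RESOURCE, UPDATE)
print(request.get_method(), request.full_url)
```

The point is that the request-URI names the resource to be patched, so no separate endpoint URL or graph name appears anywhere in the request.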

Proposed solution

We are currently specifying the mechanism, which has been part of Solid for a while. Examples are in there, and the current specification effort is unlikely to depart significantly from it.

Considerations for backward compatibility

None that I can think of; it is just an alternative protocol.

@namedgraph

namedgraph commented Jan 15, 2020

Graph Store Protocol specifies PATCH using SPARQL Update, though it's "informative".

@ericprud
Member

I wrote up a SPARQL PATCH proposal for the LWP WG.
+ proper sub-language of SPARQL
- has no provision for eliminating chunks of the graph orphaned by a DELETE.

@afs
Collaborator

afs commented Jan 18, 2020

This is how I got to RDF Patch, which is much less than a full SPARQL Update endpoint and is designed so that it can be generated rather than hand-written. That is important for automatically propagating changes between copies.

It is "N-triples diff" and easy to generate when there is a client side copy of the data (mechanically log changes made to the graph), or when focusing on triples in isolation. Without blank nodes, a patch maps to INSERT DATA/DELETE DATA.
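
To make the INSERT DATA/DELETE DATA mapping concrete, here is a small Python sketch. It assumes patch rows of the form `A s p o .` and `D s p o .` with N-Triples terms, skips every other row (such as transaction markers), and does not handle quads. The function name and sample data are hypothetical.

```python
def patch_to_sparql(patch_text: str) -> str:
    """Translate blank-node-free 'A'/'D' rows into SPARQL Update operations."""
    operations = []
    for line in patch_text.splitlines():
        line = line.strip()
        if not line.endswith(" ."):
            continue  # not a row in the assumed "CODE terms ." shape
        code, _, rest = line.partition(" ")
        triple = rest[:-2].strip()  # drop the trailing " ."
        if code not in ("A", "D") or not triple:
            continue  # skip transaction markers and other rows
        if "_:" in triple:
            # Crude check (it would also trip on a literal containing "_:").
            raise ValueError("blank nodes do not map to INSERT DATA/DELETE DATA")
        keyword = "INSERT DATA" if code == "A" else "DELETE DATA"
        operations.append(f"{keyword} {{ {triple} . }}")
    return " ;\n".join(operations)


patch = """TX .
D <http://a.example/s> <http://a.example/p> "old" .
A <http://a.example/s> <http://a.example/p> "new" .
TC ."""

update = patch_to_sparql(patch)
print(update)
```

Each row becomes its own operation so that the order of adds and deletes in the patch is preserved in the update request.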

The protocol is HTTP PATCH with "Content-type: application/rdf-patch", if sent to the dataset or a named service related to the dataset.

If the app has context for a change, LD Patch works - it is more like an update language (you have to write the path somehow - by graph context tracking/remembering how the app got to the update point, or human-written).

@ericprud
Member

The LD Patch SOTD lists some alternatives as well. I think LD Patch is the only one with cut:

The Cut operation is used to remove one or more triples connected to a specific blank node b. More precisely, it removes all the outgoing arcs for b from the target graph, and does the same recursively for all objects of those triples being blank nodes. Finally, it removes all incoming arcs of b.

This is probably a powerful weapon when aimed at one's foot. RDF Patch makes such deletions explicit. Thoughts on who cut helps vs. who it endangers? (We can add cut to any of the proposals if we want, though the utility is reduced if we add it to e.g. SPARQL PATCH, which is intended to be processable by any SPARQL engine.)
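
For readers who want to see the recursion spelled out, here is a minimal Python sketch of the Cut description quoted above. It is not taken from any implementation: triples are modelled as plain string tuples, blank nodes are assumed to be the terms starting with `_:`, and the function names and sample data are hypothetical.

```python
def is_bnode(term: str) -> bool:
    """Assumption for this sketch: blank nodes are written as '_:label'."""
    return term.startswith("_:")


def cut(graph: set, node: str) -> set:
    """Return a copy of `graph` with `node` cut out, per the quoted description."""
    remaining = set(graph)
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guards against blank-node cycles
        seen.add(current)
        # Remove all outgoing arcs of the current node...
        outgoing = {t for t in remaining if t[0] == current}
        remaining -= outgoing
        # ...and recurse into every object that is a blank node.
        stack.extend(obj for (_, _, obj) in outgoing if is_bnode(obj))
    # Finally, remove all incoming arcs of the starting node.
    return {t for t in remaining if t[2] != node}


graph = {
    ("_:b", ":received", "_:r"),
    ("_:r", ":to", "<http://a.example/bob>"),
    ("_:r", ":date", '"2000"'),
    ("<http://a.example/inbox>", ":contains", "_:b"),
    ("<http://a.example/bob>", ":name", '"Bob"'),
}

result = cut(graph, "_:b")
print(sorted(result))
```

Only the triple about `<http://a.example/bob>` survives here: the recursion stops at IRIs and literals, but it follows every blank-node object, however far the chain goes.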

@afs
Collaborator

afs commented Jan 22, 2020

Wildcard delete D <http://host/subject> * * . was in some earlier work but isn't currently in the format.

The main reason is that an RDF Patch describes the changes made, not the operations performed. A patch as a record of what changed is standalone. D <> * * . breaks that - you need data in the start-state to know what changed.

Indeed, if the patch entries are actual changes (check to see if an add or delete really did make a change) then patches are reversible. Run the patches backwards and the previous version is restored.
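
A minimal Python sketch of that reversibility property, under the assumption that a graph is a set of triples and a patch is a list of ("A" or "D", triple) rows. The helper names are hypothetical. The key step is that only operations that really changed the graph are recorded.

```python
def record_and_apply(graph: set, requested: list) -> list:
    """Apply operations to `graph` in place; return only the rows that changed it."""
    patch = []
    for op, triple in requested:
        if op == "A" and triple not in graph:
            graph.add(triple)
            patch.append(("A", triple))
        elif op == "D" and triple in graph:
            graph.remove(triple)
            patch.append(("D", triple))
        # No-op adds and deletes are deliberately not recorded.
    return patch


def reverse(patch: list) -> list:
    """Invert a patch: swap adds and deletes, and replay in reverse order."""
    flip = {"A": "D", "D": "A"}
    return [(flip[op], triple) for op, triple in reversed(patch)]


graph = {("s", "p", "1")}
original = set(graph)

requested = [
    ("A", ("s", "p", "1")),  # no-op: the triple is already present
    ("D", ("s", "p", "1")),
    ("A", ("s", "p", "2")),
]
patch = record_and_apply(graph, requested)

# Undo by applying the reversed patch.
record_and_apply(graph, reverse(patch))
print(graph == original)
```

If the no-op add in the first row had been recorded as well, replaying the reversed rows would end by deleting a triple that was in the start state, so the original graph would not be restored.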

@ericprud
Member

Run the patches backwards and the previous version is restored.

I think that's a very cool feature, but it does have the cost that the client has to calculate the diff, and that it requires some form of skolemization. The CUT alternative I was discussing works without that and allows the client to get by with simpler requests, e.g.

CUT <http://a.example/Msg1>

which removes all the bnodes in:

<http://a.example/Msg1> :recieved [
    :to <...> ;
    :date "2000" ;
    ...
  ] ;
  :sent [...] .

I'm not saying this is necessary; I just want to make sure people understand the feature.

@afs
Collaborator

afs commented Jan 24, 2020

CUT assumes a particular data shape.

Maybe the app wants to remove <http://a.example/Msg1> :recieved [] and replace it with <http://a.example/Msg1> :received [] leaving the original triples.

To use our old friend, FOAF, when people used bnodes, you get graphs of bnodes connected by foaf:knows. Recursing can have a big effect.

Now, we can say "don't do that" but when wanting a general purpose mechanism (RDF Patch came out of wanting live dataset replication for HA), these usages need to be coped with.

@kjetilk
Author

kjetilk commented Jan 27, 2020

Yes, I am aware there are alternatives, but since this is SPARQL, it seems like this is the place to be constrained to discuss SPARQL :-)

Anyway, @namedgraph's comment was well received; I had forgotten it was in there. Perhaps we can seek more experience with it over in the Solid project to see if its status can reasonably be elevated to something normative later.

@pchampin

Chiming in (a little late, sorry about that) with my LD Patch editor hat on...

[RDF Patch] can be generated, not needing to be hand-written

Indeed, that's an important difference; LD Patch was designed to capture an "intention" more than the precise effect on the target graph. Hence the distinction between Add and AddNew, for example.

CUT assumes a particular data shape

Well, it will "work" regardless of the shape, but granted, it is more suited for tree-like structures. Using it when there are bnode cycles could lead to unexpected results...

Actually, I have a vague memory of discussing an alternative semantics for CUT, with Alexandre and Andrei, where it would only recurse on bnodes with a single incoming arc. But if we did discuss that (I'm really not sure), I'm guessing we ruled it out for being too complex. Retrospectively, maybe that would not have been such a bad idea.
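
One possible reading of that alternative semantics, as a Python sketch: recurse into an object blank node only if it has exactly one incoming arc in the original graph. This is an interpretation for illustration, not something specified in LD Patch. Terms are plain strings, blank nodes are assumed to start with `_:`, and the function name and FOAF-style sample data are hypothetical.

```python
def cut_single_parent(graph: set, node: str) -> set:
    """Cut `node`, recursing only into blank nodes with a single incoming arc."""

    def incoming_count(term: str) -> int:
        # Counted against the original graph, before anything is removed.
        return sum(1 for t in graph if t[2] == term)

    remaining = set(graph)
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        outgoing = {t for t in remaining if t[0] == current}
        remaining -= outgoing
        for (_, _, obj) in outgoing:
            # Shared blank nodes (more than one incoming arc) are left alone.
            if obj.startswith("_:") and incoming_count(obj) == 1:
                stack.append(obj)
    # Finally, remove all incoming arcs of the starting node.
    return {t for t in remaining if t[2] != node}


graph = {
    ("_:alice", "foaf:name", '"Alice"'),
    ("_:alice", "ex:address", "_:addr"),
    ("_:addr", "ex:city", '"Oslo"'),
    ("_:alice", "foaf:knows", "_:bob"),
    ("_:carol", "foaf:knows", "_:bob"),
    ("_:bob", "foaf:name", '"Bob"'),
}

result = cut_single_parent(graph, "_:alice")
print(sorted(result))
```

In this sample the address node is removed along with `_:alice`, while `_:bob` keeps his name because `_:carol` also points at him. A fully recursive cut would have deleted Bob's name as well.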

@lisp
Contributor

lisp commented Aug 26, 2020

why not use ones which already exist, rather than introduce new encodings?
as @namedgraph noted, gsp patch is available.
this could be combined with a multipart media type to compose the changes successively as POST and DELETE parts.

@lisp
Contributor

lisp commented Oct 9, 2020

Run the patches backwards and the previous version is restored.

I think that's a very cool feature, but it does have the cost that the client has to calculate the diff, and that it requires some form of skolemization. The CUT alternative I was discussing works without that and allows the client to get by with simpler requests, e.g.

there are approaches according to which it is a matter of more appropriate notions of scope instead of rewriting.
