Slash semantics and conflicting requests on resource creation #301
Comments
We're working on converting tests for the new harness, and without going into too much detail on that work-in-progress, we have noted that there is a lot of divergence in how the implementations deal with the tension between slash semantics and LDP Interaction Models. I think this means that the heuristics from #128 are not sufficiently clear. My main fear is that since there are quite a lot of assumptions connected to containers and resources in the access control, lack of clarity in this area could result in opening attack vectors that we can't imagine right now. So, the intention isn't to change the good path (i.e. these are errors an end-user should never see), but to define error behaviour in the other cases, e.g. when something is declared a Non-RDF resource but doesn't have a In #40 (comment) I tried to introduce the two concepts of consistency and exactness. I still think they are useful and that we need them. By consistency, I mean that different parts of the request carry the same semantics. If something has a As previously mentioned, by exactness I mean to constrain the freedom the server has to adapt the request. Exactness is a criterion that applies differently to
So, it is not appropriate to change a request URI to fit a different interaction model when It is different with HTTP also uses the term "consistent":
In this case, I think the suggested transformation does not imply that it is legitimate to create a different type of resource (e.g. change the interaction model), the transformation applies when you can merely change some aspects of the representation. For these interaction-model-changing situations, I think this makes it very clear that an error is appropriate. Again, I think it is better to have an error early than learn of a mistake later. Thus, my opinion is that we should specify consistency and exactness requirements, and that we should specify that errors should be thrown when these requirements aren't met. I'll try to tabulate various combinations of situations that could cause consistency and/or exactness failures when creating resources (note that the use of
I believe Finally, we need to decide what a failure should result in. Earlier, I advocated a simple There are considerations around this in several older issues that I think were closed a bit prematurely. #121 deals with conflicting interaction models. There's also some discussion in #105 and others. |
There are a few tests for this amongst the converted tests, and I have reviewed the results of these tests for NSS and CSS. Generally, CSS seems to agree with these tests, but uses CSS fails in a few corner cases that are certainly arguable, notably it accepts the case where:
Content-Type: text/turtle; charset=UTF-8
Link: <http://www.w3.org/ns/ldp#NonRDFSource>; rel="type"
but it seems pretty meaningless to accept Turtle as a NonRDFSource. NSS fails a lot more tests; in general it seems to default to disregarding the LDP Interaction Models. It will for example allow the creation of a non-container resource even if the Even though LDP isn't a central part of Solid, there is some use of such data and such inconsistency can create significant confusion in user expectation that can lead to attacks. |
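The header combination quoted above can be detected from message headers alone. What follows is only a minimal sketch of such a check, not what NSS, CSS or the test harness actually do: the function names, the set of RDF media types and the deliberately naive `Link` parsing are all assumptions made for illustration. Whether this combination should be rejected at all is itself disputed later in the thread.

```python
import re

LDP = "http://www.w3.org/ns/ldp#"

# Assumption: which media types count as "RDF" is a policy choice, not taken from the thread.
RDF_MEDIA_TYPES = {"text/turtle", "application/ld+json"}


def media_type(content_type: str) -> str:
    """Return the bare media type, dropping parameters such as charset."""
    return content_type.split(";", 1)[0].strip().lower()


def declared_types(link_header: str) -> set:
    """Collect link targets whose rel includes "type".

    Naive on purpose: it expects rel to be the first parameter after the
    target and does not implement the full Link header grammar.
    """
    pairs = re.findall(r'<([^>]*)>\s*;\s*rel\s*=\s*"?([^",;]*)"?', link_header)
    return {target for target, rel in pairs if "type" in rel.lower().split()}


def rdf_declared_as_non_rdf(content_type: str, link_header: str) -> bool:
    """True when an RDF media type is sent together with ldp:NonRDFSource."""
    return (
        media_type(content_type) in RDF_MEDIA_TYPES
        and LDP + "NonRDFSource" in declared_types(link_header)
    )
```

Called with the two header values quoted above, `rdf_declared_as_non_rdf` returns `True`; the same `Link` header sent with `image/png` returns `False`. The sketch only reports the mismatch and leaves the choice of response to the caller.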
In the LDP sense, any Turtle document that includes comments or human-important whitespace is a |
OK, that's an interesting rabbit hole, because there are hardly any RDF serializations where you can distinguish between different interaction models without doing an examination of the representation. Perhaps the NSS behavior of ignoring them entirely is actually quite appropriate? |
I think two more things should be considered:
Here I've listed what I believe to be combinations of parameters that result in failed requests. Empty cells denote parameters irrelevant for said case.
|
There is certainly some tension here, because it could be argued that "any representation goes", that is, you could have an image with nodes and arrows and say that it represents the same data as some RDF. With such an interpretation, interaction models wouldn't even make sense. And yet, there are cases where such metadata is useful. Let me reiterate that my agenda here is to ensure that users aren't confused by apps that are making assumptions based on a too liberal interpretation of metadata like |
@langsamu thanks a lot! I agree with all of these, possibly save for one. A few of them are in my tests too. I purposely left the body out, because that opens another can of worms, since the data may also be a factor there as I argued in solid/data-interoperability-panel#154 . I feel that in the first iteration, it makes sense to only make restrictions based on message headers and request URI, but I think we should return to validating the body with some urgency. The one case where we may differ is row 11. My main missing feature in Solid is |
Yeah, from that definition, I agree it seems like stretching it a bit. However, my instinct is to always look for more specific codes than the x00-codes, as that may come in useful for the client in resolving the problem. But I agree it is pointless to stretch the definition of an error code if it does not actually help with that. Now, I'll cite the things from RFC7231 that I think are most relevant to this. It is an excerpt from the first comment, so you can see if I take it too much out of context:
I'll argue that this is (mostly) the case. Such a constraint is that a container has a URI that ends with a I would also argue that there is a current state of the target resource, even if it doesn't yet exist, for example the state of the resource whose URI ends with A |
Any interaction that may result in loss of data which the humans (who really are the focus of all interactions, even when they aren't involved except at the fringes) might consider important, should be avoided. RDF graph equality is NOT enough with data files which include data which is not RDF encoded, including syntactically irrelevant comments and whitespace, such as Turtle or RDFa may contain. RDF graph equality when comparing Turtle to RDFa to N-triples to N-quads does not preserve all data therein, and this should not be acceptable. I am rather weary of explaining this over and over, and hearing back, "we don't care about your idea of what data is." That is a path certain to kill Solid, quite possibly quite messily, though it may be a slow death. The LDP WG made specific decisions, and wrote them up in a way that we thought at the time would be clear to readers, especially implementers but also including users. IF Solid is to continue to claim LDP compliance at all, and UNLESS Solid writes up and warns of the significant data-losing departures from that compliance, with those warnings being presented to the user at every point they arise, then non-RDF contents of Turtle and RDFa files MUST be preserved and these files MUST be treated as |
Currently, Solid does not claim LDP compliance in this regard, it only references LDP normatively for Basic Containers and |
Great stuff! I think that would be great to fast track for 1.0.
Question is whether
Right. The problem is that HTTP isn't entirely consistent at this point... I don't have particularly strong feelings here. In fact, I think I would have preferred a 3xx error saying "this is not the resource you're looking for, the server's guess is that you're looking for an entirely different resource over there" :-) |
LDP describes server behaviour when a client includes the Currently, the SP only describes server behaviour when a client includes The I suggest that SP's approach for consistency should not be motivated through the lens of LDP. Consistency in SP should rather reflect the normative requirements, e.g., if an abstract semantic type is of interest to consider as one dimension of the request semantics, then we first need to establish the required It depends on the role of Consistency between I suggest that we either put more emphasis on requiring specific The tables in this issue are interesting from the point of implementing a server that tries to conform to the SP and LDP. However, that's advisory, not a requirement. With that understanding, I'm in favour of factoring it under a non-normative "Relation to LDP" section in SP (or elsewhere). See #224 . (Along with #194 (comment) which describes compatibility between LDP clients and Solid servers, and LDP servers and Solid clients.) Currently, the SP is not requiring RDF validation. As servers MUST conform to HTTP/1.1 (RFC 7230, RFC 7231), they can return 4xx as usual when they process the request's representation data. When processed, the IANA media-type used in representation metadata is integral for validation, as opposed to the abstract semantic type. |
They may be motivated through the lens of LDP, but I don't think that's where the disagreement is; it is rather a symptom of us viewing this differently. I feel you're making this too much of a theoretical exercise, @csarven . It shouldn't be. I am trying to prevent a likely and relatively easily foreseeable class of non-technical security problems. That's the point, not LDP. Also, importantly, there is an open world, so people may introduce new types of resources, possibly not telling us. Therefore, I think it is very important that the protocol contains generic language that legitimizes tests that fail when there is an understood inconsistency, as well as prompting people to think about it. So, exhaustively finding and documenting inconsistencies cannot happen, thus it should not depend on #191 , rather a part of solving #191 should be to document new possible conflicts. Generic language still needs to be in for the open world, and for reducing the risk of vulnerabilities. so, jumping to
It would be nice to have it in a "Relation to LDP" section in addition, but that won't allow testing it rigorously, and so, does not address the actual problem, prevention of a class of possible attacks. There needs to be normative general language in the spec that encompasses the possible inconsistencies so that tests can check that implementations aren't doing anything dangerous. Then, we can have non-normative language that provides guidance, but that's of less value than tests anyway, so we don't need that.
I interpret this in two ways, one is a desire to close the world. I can see why, but I don't think it is necessary or the right thing to do. The other is the old debate on how lenient you should be in error handling. That's not an easy debate. To me, if something is just clearly an inconsequential mistake, then yes, I can easily advocate for an ignore approach. If something is clearly an attempted attack, then, it really has to be an error. The line between the two is a bit blurry, but I think in this case, it is clear enough: If something sets a header despite the very clear semantics we have around containers and non-containers, then it is most likely an attempt at doing something nasty. It should be an error. This is unlikely to happen as a simple mistake, this is not constraining the good path in any way, it should not ever be visible when running legitimate software. This is a line of defense against attacks exploiting semantic inconsistencies. |
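To make the container versus non-container argument above concrete, here is a rough sketch of a headers-and-URI-only consistency check for a request that creates a resource at a given target URI. It is an illustration under stated assumptions, not the behaviour of any Solid server: the function name, the choice of which LDP types count as conflicting, and the example URIs are hypothetical. It also leaves the response status open, since that is still being debated in this thread.

```python
from typing import Iterable, Optional
from urllib.parse import urlsplit

LDP = "http://www.w3.org/ns/ldp#"

# Assumption: these two types are treated as "the client asked for a container".
CONTAINER_TYPES = {LDP + "Container", LDP + "BasicContainer"}


def slash_conflict(target_uri: str, declared_types: Iterable[str]) -> Optional[str]:
    """Return a reason string if the declared types contradict slash semantics.

    Returns None when no conflict is detected. ldp:Resource and ldp:RDFSource
    are not flagged for either URI shape, because containers are RDF sources too.
    """
    types = set(declared_types)
    is_container_uri = urlsplit(target_uri).path.endswith("/")

    if is_container_uri and LDP + "NonRDFSource" in types:
        return "URI ends with a slash but the request declares ldp:NonRDFSource"
    if not is_container_uri and types & CONTAINER_TYPES:
        return "request declares a container type but the URI does not end with a slash"
    return None
```

For example, `slash_conflict("https://pod.example/photos", {LDP + "BasicContainer"})` returns a reason string, while the same type on `https://pod.example/photos/` returns `None`. A server or test harness could map any non-`None` result to whichever client error the specification eventually settles on.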
The response payload can provide the reason for the client error. Issue: #28 using RDF messages. URI Slash Semantics (SP). When the request's URI Persistence (SP). When the request's When the request's When the encapsulating container of the HTTP message is invalid, the server MUST respond with 400 (RFC 7231). The consistency between request target, representation metadata and data is categorically different than "bad" requests. |
I suggested that for consistency to be useful in the SP, it needs to work with normative requirements. If that's not prescribed in the SP, the existing RFCs already cover the general requirements in addition to good practices. |
OK, let's bump it off the milestone. This problem arose as extensive testing shows that the implementations did not agree in the slightest on what the actual behavior should be. If it is then adequately covered by the RFCs, I think we should clarify it, but we can also just go with it and hope no-one finds a way to exploit it in the wild. |
Can you refer to the tests corresponding to the requirements in the SP? |