Federation authenticated media endpoints have invalid multipart response #3414
Comments
Seems like this is actually a bug in Go's standard library. It used to include a leading CRLF, but that was changed ~13 years ago as a workaround for a bug in a different program: https://codereview.appspot.com/4635063#msg4.
This might be a problem in Ruma: the spec wants RFC 2046 for the boundary. (Yea, the MSC linked to RFC 1341, which may have had different "rules" for boundaries.) The multipart package, which Dendrite uses, implements RFC 2046.
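For anyone who wants to see the writer behaviour for themselves, here is a minimal sketch using Go's `mime/multipart` writer directly. The boundary string, the part's Content-Type, and the `buildBody` helper are made up for illustration; this is not Dendrite's actual code path.

```go
package main

import (
	"bytes"
	"fmt"
	"mime/multipart"
	"net/textproto"
)

// buildBody writes a single-part multipart body with a fixed boundary and
// returns the raw bytes as a string. The boundary and the part contents are
// illustrative placeholders, not what Dendrite sends.
func buildBody(boundary string) (string, error) {
	var buf bytes.Buffer
	w := multipart.NewWriter(&buf)
	if err := w.SetBoundary(boundary); err != nil {
		return "", err
	}

	hdr := textproto.MIMEHeader{}
	hdr.Set("Content-Type", "application/json")
	part, err := w.CreatePart(hdr)
	if err != nil {
		return "", err
	}
	if _, err := part.Write([]byte("{}")); err != nil {
		return "", err
	}

	// Close writes the closing delimiter.
	if err := w.Close(); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	body, err := buildBody("example-boundary")
	if err != nil {
		panic(err)
	}
	// %q makes the CRLFs visible: the output starts with the dash-boundary
	// itself, with no CRLF in front of it.
	fmt.Printf("%q\n", body)
}
```

Only the first boundary is written without a leading CRLF; later boundaries and the closing delimiter are still preceded by one.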
@S7evinK Good catch! RFC 2046 seems to have similar text stating that boundaries must be preceded by a CRLF though. On page 19:
I haven't read the whole thing in detail, so it's definitely possible that I'm missing something. It would be at least somewhat surprising if this were a bug sitting in the Go standard library for 13 years without anybody else noticing it.
Coincidentally I read Go's |
on page 22 of RFC2046 the multipart body is specified as:
Which seems to me like the leading CRLF is only required when a preamble is present, and the preamble is optional.
Oh fun. I agree with your reading that the BNF clearly states the first CRLF is not required unless there is a preamble. That seems to conflict with the earlier text in the RFC to me, but either way the next step is to change ruma to be more permissive.
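To make "more permissive" concrete, here is a rough sketch of a first-boundary search that accepts both readings of the RFC. It is written in Go to match the rest of the thread, and `firstBoundaryIndex` is a hypothetical helper, not ruma's parsing logic. It also ignores transport padding and does not check what follows the boundary, so treat it as an illustration of the idea only.

```go
package main

import (
	"bytes"
	"fmt"
)

// firstBoundaryIndex returns the offset of the "--" that starts the first
// boundary delimiter in body, or -1 if none is found.
//
// Hypothetical helper: it accepts a dash-boundary at offset 0 (the BNF
// reading, no preamble) as well as one preceded by CRLF (the prose reading).
func firstBoundaryIndex(body []byte, boundary string) int {
	dash := []byte("--" + boundary)

	// Case 1: no preamble, so the dash-boundary opens the body directly.
	if bytes.HasPrefix(body, dash) {
		return 0
	}

	// Case 2: a (possibly empty) preamble, then CRLF, then the dash-boundary.
	if i := bytes.Index(body, append([]byte("\r\n"), dash...)); i >= 0 {
		return i + 2 // skip the CRLF that is attached to the boundary
	}
	return -1
}

func main() {
	bare := []byte("--b\r\n\r\nhello\r\n--b--\r\n")
	withCRLF := []byte("\r\n--b\r\n\r\nhello\r\n--b--\r\n")

	fmt.Println(firstBoundaryIndex(bare, "b"))
	fmt.Println(firstBoundaryIndex(withCRLF, "b"))
}
```

A strict parser would only implement the second case, which is why a body that opens directly with the dash-boundary gets rejected.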
RFC 2046[1] is somewhat ambiguous on whether or not it's valid to omit the preceding CRLF for the first boundary. The prose on page 19 suggests that it is not:

> The boundary delimiter MUST occur at the beginning of a line, i.e.,
> following a CRLF, and the initial CRLF is considered to be attached
> to the boundary delimiter line rather than part of the preceding
> part. The boundary may be followed by zero or more characters of
> linear whitespace. It is then terminated by either another CRLF and
> the header fields for the next part, or by two CRLFs, in which case
> there are no header fields for the next part. If no Content-Type
> field is present it is assumed to be "message/rfc822" in a
> "multipart/digest" and "text/plain" otherwise.
>
> NOTE: The CRLF preceding the boundary delimiter line is conceptually
> attached to the boundary so that it is possible to have a part that
> does not end with a CRLF (line break). Body parts that must be
> considered to end with line breaks, therefore, must have two CRLFs
> preceding the boundary delimiter line, the first of which is part of
> the preceding body part, and the second of which is part of the
> encapsulation boundary.

But the BNF on page 22 suggests that it is, as long as there is no preamble:

> dash-boundary := "--" boundary
>                  ; boundary taken from the value of
>                  ; boundary parameter of the
>                  ; Content-Type field.
>
> multipart-body := [preamble CRLF]
>                   dash-boundary transport-padding CRLF
>                   body-part *encapsulation
>                   close-delimiter transport-padding
>                   [CRLF epilogue]

Dendrite currently generates multipart responses without a preceding CRLF for the first boundary[2], which were rejected by the previous ruma parsing logic.

[1]: https://datatracker.ietf.org/doc/html/rfc2046
[2]: matrix-org/dendrite#3414
Background information
0.13.7+7a4ef24
go version: unknown
b4fecbc51719a33d09be1e76d55ae0eec11fb71a (over federation API)

Description
The multipart spec states that every encapsulation boundary in the body should be preceded by a CRLF. Instead, dendrite's response to the federation auth media download endpoint has the `--` of the first boundary starting directly at the beginning of the body, with no CRLF.

Ruma expects a preceding CRLF here when parsing the response. As a result, homeserver implementations using ruma are unable to fetch media from dendrite servers over the authed media endpoints. This affects grapevine, and likely also affects conduwuit although I have not tested it.
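One way to confirm the symptom on a captured response is to pull the boundary out of the Content-Type header and look at the first bytes of the body. The sketch below assumes you already have the header value and the raw body in hand; `describeFirstBoundary` and the sample values in `main` are hypothetical, not taken from a real dendrite response.

```go
package main

import (
	"bytes"
	"fmt"
	"mime"
)

// describeFirstBoundary reports how the first boundary of body is introduced,
// given the response's Content-Type header value. It returns "bare" when the
// body opens directly with the dash-boundary, "crlf" when a CRLF comes first,
// and "other" for anything else (for example a non-empty preamble).
func describeFirstBoundary(contentType string, body []byte) (string, error) {
	_, params, err := mime.ParseMediaType(contentType)
	if err != nil {
		return "", err
	}
	boundary, ok := params["boundary"]
	if !ok {
		return "", fmt.Errorf("no boundary parameter in %q", contentType)
	}

	dash := []byte("--" + boundary)
	switch {
	case bytes.HasPrefix(body, dash):
		return "bare", nil
	case bytes.HasPrefix(body, append([]byte("\r\n"), dash...)):
		return "crlf", nil
	default:
		return "other", nil
	}
}

func main() {
	// Illustrative values only: a made-up boundary and a body shaped like
	// the one described in this issue.
	contentType := "multipart/mixed; boundary=example-boundary"
	body := []byte("--example-boundary\r\nContent-Type: application/json\r\n\r\n{}\r\n--example-boundary--\r\n")

	kind, err := describeFirstBoundary(contentType, body)
	if err != nil {
		panic(err)
	}
	fmt.Println(kind)
}
```

A "bare" result corresponds to the behaviour reported here; a parser that insists on the page 19 prose would only accept "crlf".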
relevant text from RFC 1341
Steps to reproduce
Make a `/_matrix/federation/v1/media/download/:mediaid` request to a dendrite server.