-
-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Some aiohttp web server redirects fail #5319
Comments
The solution is: aiohttp should not requote HTTP redirect URL. Other quoting logic should be untouched. The issue is easy and wais for a champion. |
I submitted a PR that works in my program but ... raised interesting failures in the CI pipeline E.g. exotic test cases where the redirection URL has linefeeds characters ("\r\n"), my patch transmits these characters unchanged. The current code (before this PR) silently fixes these URLs with yarl by encoding these as any character. For my patch to be valid, we should enrich it to check that the URL follows the HTTP standard. This becomes harder as we need a standard-compliant checker, and it may potentially break redirects for existing aiohttp users who were relying heavily on yarl transparent url corrections and didn't realize their redirection URLs were not standard-compliant... In such case it may be simpler to just fix yarl (I am changing my mind on this as I realize we are pulling other potentially larger issues by trying to workaround yarl). |
@asvetlov I have a concrete and trivial case of this In [6]: import aiohttp
...: import asyncio
...:
...: async def main():
...:
...: async with aiohttp.ClientSession() as session:
...: async with session.get('https://releases.jfrog.io/artifactory/artifactory-pro-rpms/repodata/8c87521e43dbed223c90e23922c0c4bbe7159112-primary.xml.gz') as response:
...:
...: print("Status:", response.status)
...: print("Content-type:", response.headers['content-type'])
...:
...: html = await response.text()
...: print("Body:", html)
...:
...: loop = asyncio.get_event_loop()
...: loop.run_until_complete(main())
Status: 403
Content-type: text/xml
Body: <?xml version="1.0" encoding="UTF-8"?><Error><Code>AccessDenied</Code><Message>Access denied</Message></Error> aiohttp processes the redirect location as the following - note the "X-Artifactory-artifactPath" parameter
Whereas wget, httpie and others process the redirect as
and it works fine |
Perhaps |
Not exotic really but request smuggling attack prevention. |
Per Andrew's comment, maybe aiohttp could make use of the |
@dralley Have you tried setting |
@webknjaz This works, thank you. It does beg the question though - why does aiohttp default to rewriting the redirect URLs it receives, rather than defaulting to not doing so? All things considered it does still seem like surprising behavior. |
I think it has to do with the fact that I did some Git archeology and found that this issue was first reported in #1474 against aiohttp v1.1.4. Then, with @fafhrd91's patch ed9bf2b this setting appeared in v2.1.0, followed by a later change by @asvetlov — #3436 — that deprecated the corresponding attribute mutability. If anybody's able to recall the original motivation, it'd probably be @fafhrd91. |
🐞 Describe the bug
Aiohttp server redirects ( triggered through "raise HTTPFound(...)") fail sometimes, depending on the final webserver targeted by the redirect, when the URL includes a question mark "?" in a query string argument (e.g., when this query argument is itself a URL), even when this question mark is URL-encoded.
💡 To Reproduce
On an aiohttp server send back a redirect to the client with "raise HTTPFound(...)", to a URL (on another final web server) that includes a query string that includes another URL with an encoded question mark. Some final web servers will reject that URL due to the second question mark, and the network trace will show that the initial aiohttp webserver had transformed the redirection URL, decoding the question mark in the query string.
Further analysis revealed this bug is due to 2 things:
(1) The aiowebserver decodes and reencodes the URL through the yarl library
(2) The yarl library has a bug with some query string characters that it does not escape (which is compliant with the web standards but not compliant with many webservers expectations):
aio-libs/yarl#245
Although the yarl maintainers may fix (2), this bug has been opened for 2 years and, more importantly, the usefulness of (1) is questioned: why not letting the user be responsible for the URL it submits? Reencoding it exposes him to expectation mismatchs (I never expected the web server to silently transforms my URL) and such potential bugs.
💡 Expected behavior
The redirect should work flawlessly.
📋 Logs/tracebacks
📋 Your version of the Python
3.7.8
📋 Your version of the aiohttp/yarl/multidict distributions
📋 Additional context
The text was updated successfully, but these errors were encountered: