Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pre encoded path/params/fragments should be kept #2976

Closed
wants to merge 1 commit into from

Conversation

elupus
Copy link

@elupus elupus commented Dec 4, 2023

Summary

URL with pre-encoded segments would end up double encoded when parsed by the URL parser. We must consider the % as a safe value, to avoid encoding it here.

Related to comments in #2883
Related #2723
Duplicate #2929

Checklist

  • I understand that this PR may be closed in case there was no previous discussion. (This doesn't apply to typos!)
  • I've added a test for each change that was introduced, and I tried as much as possible to make a single atomic change.
  • I've updated the documentation accordingly.

Copy link
Contributor

@zanieb zanieb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense to me. I stumbled upon this behavior in the / encoding implementation and was confused.

@T-256
Copy link
Contributor

T-256 commented Dec 4, 2023

I don't think this PR fixes #2883. That was about encodes forward slash.
Does this comment need to be updated accordingly?
Is this behavior change?

@elupus
Copy link
Author

elupus commented Dec 4, 2023

I don't think this PR fixes #2883. That was about encodes forward slash. Does this comment need to be updated accordingly? Is this behavior change?

Hmm. Well not the initial report there indeed. But the the receiver of that url should url decode. Not doing that would be a bug in its parser.

It might only fix the followup comments with similar issues.

@@ -150,6 +150,26 @@ def test_param_with_existing_escape_requires_encoding():
assert str(url) == "http://webservice?u=http%3A%2F%2Fexample.com%3Fq%3Dfoo%252Fa"


def test_param_with_existing_escape():
url = httpx.URL("https://webservice/?u=/%3D%26&v=1%202")
assert str(url) == "https://webservice/?u=%2F%3D%26&v=1%202"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this will work, it's still corrupting the parameter by encoding / to %2F which is something not all servers can handle.

@elupus
Copy link
Author

elupus commented Dec 5, 2023

Moving to draft n favor off #2980

@elupus elupus marked this pull request as draft December 5, 2023 05:54
@zanieb
Copy link
Contributor

zanieb commented Dec 7, 2023

While I think we may want to revert the behavior that encodes /, I think this change doesn't need to be conflated with that. We should not double-encode already encoded characters — that's a clear bug from my perspective. I'd rather review the changes in isolation so we can have a discussion about one thing at a time. Perhaps the approach taken in #2980 is better but I'm not sure you should include the / encode revert there.

@tomchristie
Copy link
Member

tomchristie commented Dec 8, 2023

Thanks for taking a look at this @elupus, I've been planning to address this for a while and have taken a fresh pass on it at #2990. Reviews appreciated.

(Also, I like your approach artist fka @madkinsz - yep separating out the different aspects is almost always the tack I prefer.)

@tomchristie
Copy link
Member

Superseeded by #2990

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants