Make clear that names/values are UTF-8 in x-www-form-urlencoded #235

Merged
Sakurann merged 2 commits into main on Sep 12, 2024

Conversation

jogu
Collaborator

@jogu jogu commented Aug 25, 2024

Member

@c2bo c2bo left a comment

Wouldn't it make sense to introduce the general encoding scheme somewhere in terminology with most of the original text you linked and then reference it?

MUST be encoded using the UTF-8 character encoding scheme [RFC3629] first; the resulting octet sequence then needs to be further encoded using the escaping rules defined in [W3C.REC-html401-19991224].

@jogu
Collaborator Author

jogu commented Aug 26, 2024

Wouldn't it make sense to introduce the general encoding scheme somewhere in terminology with most of the original text you linked and then reference it?

MUST be encoded using the UTF-8 character encoding scheme [RFC3629] first; the resulting octet sequence then needs to be further encoded using the escaping rules defined in [W3C.REC-html401-19991224].

Thanks Christian. I'm not sure I want to start picking at some of that, as things start to unravel and get more complicated. The RFC 6749 definition isn't actually the canonical one anymore: the definition of application/x-www-form-urlencoded as per the official media type registration ( https://www.iana.org/assignments/media-types/application/x-www-form-urlencoded ) is actually found in https://url.spec.whatwg.org/#application/x-www-form-urlencoded but that's a pretty hard definition to read. The important thing is basically "you can just use what the URL/HTTP library in your favourite language already does, but do make sure to use UTF-8, as that's the only part that's really not defined elsewhere".
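
For readers following along, here is a minimal sketch of that point in Python, using only the standard library's `urllib.parse`. The helper name `encode_form` and the parameter name `claim_name` are hypothetical and chosen for illustration; they do not come from the specification or from this PR. The sketch shows that the library already handles the escaping, and that the character encoding is the one choice that changes the bytes on the wire.

```python
from urllib.parse import parse_qs, quote_plus, urlencode


def encode_form(params: dict[str, str], charset: str = "utf-8") -> str:
    """Serialize names/values as application/x-www-form-urlencoded.

    Hypothetical helper: it simply delegates to the standard library and
    makes the character encoding explicit.
    """
    return urlencode(params, encoding=charset)


# Hypothetical parameter name and value containing non-ASCII characters.
params = {"claim_name": "Zürich café"}

utf8_body = encode_form(params)
latin1_body = encode_form(params, charset="latin-1")

# The same two steps done by hand: encode to UTF-8 octets, then escape them.
manual_body = "&".join(
    f"{quote_plus(name.encode('utf-8'))}={quote_plus(value.encode('utf-8'))}"
    for name, value in params.items()
)

print(utf8_body)    # claim_name=Z%C3%BCrich+caf%C3%A9
print(latin1_body)  # claim_name=Z%FCrich+caf%E9

# Decoding with the same charset recovers the original value.
decoded = parse_qs(utf8_body, encoding="utf-8", errors="strict")
```

The two printed bodies differ only in the escaped octets, which is why a receiver that assumes a different charset than the sender would misread non-ASCII names or values even though both sides use a conforming form encoder.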

@nemqe
Contributor

nemqe commented Aug 27, 2024

In general I would just remove the word "first" from the sentences, because in the referenced section there is a continuation (first encode it using UTF-8, then escape it) and here we are missing that, so it reads a bit funny (at least to me, but I am not a native speaker).

@Sakurann
Collaborator

Sakurann commented Sep 5, 2024

I tend to agree with @nemqe's suggestion to remove "first", and with the direction of clarifying things as they are in this PR rather than introducing a new section.

As per feedback on PR.

Co-authored-by: Kristina <[email protected]>
@jogu
Collaborator Author

jogu commented Sep 10, 2024

Thanks for the feedback, and to Kristina for adding the suggestions - I've committed them, please review/approve!

@Sakurann Sakurann requested a review from c2bo September 11, 2024 13:34
@Sakurann Sakurann merged commit e29d8d7 into main Sep 12, 2024
2 checks passed
Successfully merging this pull request may close these issues.

Charset of application/x-www-form-urlencoded requests/responses
4 participants