Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed safe uri alphabet (+$ -> _.) for compressToEncodedURIComponent/decompressToEncodedURIComponent #127

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

r0b-
Copy link

@r0b- r0b- commented Jan 29, 2019

I don't know why '+' and '$' were chosen when this feature was implemented, but both characters need to be encoded for valid query strings. (encodeURIComponent('+$') => '%2B%24')
So I changed them to '_' and '.' ... and also removed the +/whitespace replacement as it's not needed anymore.

@JobLeonard
Copy link
Collaborator

Thanks, this is a known issue but we cannot merge this unfortunately :(

The problem is that we cannot change this without breaking backwards compatibility.

(also, definitely something to remember if we ever go through a "spec" change - see #122)

@pieroxy
Copy link
Owner

pieroxy commented Mar 1, 2023

While I agree we should do this, this is a breaking change and will break havoc all over the place.

So we either reject it or we create another method...

@wkrick
Copy link

wkrick commented May 27, 2023

@r0b- @pieroxy

I propose two new methods...

  • compressToURISafe() or compressToURISafeEncoding()
  • decompressFromURISafe() or decompressFromURISafeEncoding()

With "URI Safe" being the "Unreserved" characters defined in RFC 3986 as discussed here...

https://joshbuchea.com/uri-safe-characters/

...but excluding the tilde character '~' for historical reasons due to RFC 1738 as discussed here...

https://jkorpela.fi/tilde.html

I think that naming the methods the way I suggest above makes it clearer what the method is actually doing. As "URI Safe" can be thought of as a character encoding like "UTF-16" so the new method naming should parallel the existing naming (i.e. compressToUTF16 / decompressFromUTF16). The current method names (which allude to encodeURIComponent and decodeURIComponent ) make it sound like the result will have URL-encoded values using percents or something, which isn't really what it does as the character set used by the method (ignoring the current bug) should produce a string that doesn't actually need to be URI-encoded (i.e. using percents) at all. I totally understand why the original name was chosen, but I think it could be clearer for the user.

Possible alternate (but longer) method names if the ones suggested above aren't acceptable:

  • compressToRFC3986URISafe() decompressFromRFC3986URISafe()

  • compressToRFC3986Unreserved() decompressFromRFC3986Unreserved()

  • compressToRFC3986UnreservedURICharacters() decompressFromRFC3986UnreservedURICharacters()

etc...

@wkrick wkrick mentioned this pull request Jun 26, 2023
7 tasks
@wkrick
Copy link

wkrick commented Dec 20, 2023

@r0b- @Rycochet

I just came across something called Base64URL which is described in Section 5 of RFC 4648 back in 2006...

https://www.rfc-editor.org/rfc/rfc4648#section-10

...and explained in more detail on this website...

https://base64.guru/standards/base64url

Base64URL is addressing the same problem with regards to Base64 encoded strings containing characters that aren't allowed in URLs.

In short, it's the same algorithm as Base64 except for these 4 changes:

  • Replaces “+” by “-” (minus)
  • Replaces “/” by “_” (underline)
  • Does not require a padding character
  • Forbids line separators

I feel that the new compressToURISafe() method (or whatever it ends up being called) should use the same implementation as described in RFC 4648.

EDIT: I found another reference point in the Java 8 Base64 API...
https://docs.oracle.com/javase/8/docs/api/java/util/Base64.html

They also have a Base64 MIME encoding method which we may also want to implement.

karnthis added a commit to karnthis/lz-string that referenced this pull request Jan 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants