Accented characters being rejected in new versions #107
Comments
The standard allows UTF-8 characters.
@kritik Yes, I agree with you. On the other hand, if users copy and paste URLs opened in their browser, those URLs may contain accented characters. Also, if you use client-side validation in forms (e.g. a URL field), most browsers allow accented characters (I don't know why).
Can you give an example of a wrong URL?
@kritik e.g.
Oh, then it should be fixed.
@kritik Based on the answers on SO, only ASCII characters should be accepted. However, that is not how browsers commonly behave: they currently also accept other characters (e.g. accented characters). Accepting non-ASCII characters is probably not compliant with the standard, but it would be more user-friendly and would match browser behavior. In any case, if we accept non-ASCII characters we should make sure not to create security issues (for example when the link is included in a Rails
Let me put it this way: non-ASCII characters are accepted in URLs. If your network doesn't support them, possibly in the US, then the browser can encode them to ASCII. If the domain name has non-ASCII characters, a special prefix is added (I don't remember which one). The other parts of the URL are encoded by the URL-encoding logic.
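To illustrate the encoding step described above, here is a minimal Ruby sketch that percent-encodes non-ASCII characters as their UTF-8 bytes. The helper name `percent_encode_non_ascii` and the example URL are made up for illustration; this is not code from the gem. It also does not handle the domain part: internationalized domain names are converted with Punycode (the `xn--` prefix) rather than percent-encoding, which this sketch leaves out.

```ruby
require "uri"

# Hypothetical helper (not part of this library): replace every non-ASCII
# character with the percent-encoded form of its UTF-8 bytes.
# ASCII characters are left untouched.
def percent_encode_non_ascii(url)
  url.gsub(/[^[:ascii:]]/) do |char|
    char.bytes.map { |byte| format("%%%02X", byte) }.join
  end
end

raw = "http://example.com/café" # made-up URL for illustration
encoded = percent_encode_non_ascii(raw)

puts encoded                  # the ASCII-only form of the URL
puts URI.parse(encoded).path  # the stdlib parser accepts the encoded form
```

Only the path is non-ASCII in this example; a URL with a non-ASCII host would need the Punycode conversion instead.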
That is widespread, but not standard. Here's more information.
Maybe that website is old? When we at Perfectline did the update for the Estonian domain system upgrade, we had to support both ASCII and non-ASCII characters. It works the same way as for Russian or Chinese domains. For more info, see https://github.com/internetee/registry
I am unable to validate a URL with Chinese characters:
@kritik Any update on this issue? I'm getting a validation error for this URL
My two cents is that it should be an option, maybe something like
I also have a similar problem with a LinkedIn profile whose URL contains the word
I have noticed that URLs containing non-ASCII characters, such as accented characters, were accepted in the past. However, this library now rejects all of them.
This change is probably related to one of these recent commits:
1945ae4
3dde863
Was this change made on purpose?
If it was, we should probably add a test to make that explicit.
I have read that the standard requires only ASCII characters, so this is probably correct.
On the other hand, I have noticed that many users of my website were affected and started getting validation errors when they tried to post external URLs on our website. So, if this is not a security issue (I don't know), maybe we can consider accepting them as we did in the past?
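For reference, the ASCII-only behavior can be checked against Ruby's standard `URI` parser with a small sketch like the one below. The helper `parseable?` and the example URLs are hypothetical, and whether this library's validator relies on the same parser internally is an assumption, not something confirmed in this thread.

```ruby
require "uri"

# Hypothetical helper: true when Ruby's stdlib URI parser accepts the string,
# false when it raises URI::InvalidURIError.
def parseable?(url)
  URI.parse(url)
  true
rescue URI::InvalidURIError
  false
end

puts parseable?("http://example.com/cafe") # ASCII-only URL
puts parseable?("http://example.com/café") # same URL with an accented character
```

With current Ruby versions the stdlib parser should accept the first URL and reject the second, which matches the rejections reported here.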