Validate language tags, throwing errors if they aren't valid #24

Open
dlongley opened this issue Oct 11, 2019 · 6 comments

@dlongley
Member

No description provided.

@dlongley
Member Author

Use the language tag regex from RDF (not the stricter one from BCP47).

@dlongley
Member Author

Turtle uses this grammar production for language tags:

[144s] LANGTAG                        ::= "@" [a-zA-Z]+ ( "-" [a-zA-Z0-9]+ )*

@davidlehn
Member

davidlehn commented Oct 11, 2019

Note a PR exists for this on the jsonld.js side:
digitalbazaar/jsonld.js#284

It uses this:

const LANG_RE = /^[a-zA-Z]+(-[a-zA-Z0-9]+)*$/;

I'm still concerned about merging that until we have benchmarks of the performance hit, probably a flag to disable all that checking (when you know you don't need it), and maybe an API for handling failures instead of just dropping invalid data silently. The same issues apply if the check is in this lib.
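To make the flag and failure-handling concerns concrete, here is a minimal sketch of what such a check could look like. The helper `checkLanguage` and the option names `validate` and `onInvalid` are hypothetical, made up for illustration; they are not part of jsonld.js or this lib.

```javascript
// Regex from the jsonld.js PR mentioned above (Turtle LANGTAG without the "@").
const LANG_RE = /^[a-zA-Z]+(-[a-zA-Z0-9]+)*$/;

// Hypothetical helper: `validate` and `onInvalid` are illustrative option
// names, not options of any existing library.
function checkLanguage(language, {validate = true, onInvalid = 'throw'} = {}) {
  // Skip the check entirely when the caller knows the input is clean.
  if(!validate) {
    return language;
  }
  if(typeof language === 'string' && LANG_RE.test(language)) {
    return language;
  }
  if(onInvalid === 'drop') {
    // The caller explicitly opted into dropping invalid tags.
    return null;
  }
  throw new Error(`Invalid language tag: ${JSON.stringify(language)}`);
}
```

With the defaults, `checkLanguage('en_US')` throws, while `checkLanguage('en_US', {onInvalid: 'drop'})` returns `null`, so dropping data is a choice the caller makes rather than something that happens silently.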

@dlongley
Member Author

We should use BCP47 validation instead, not RDF; RDF will be getting an errata/bug report.

@davidlehn
Member

@dlongley
Member Author

Those regexes are for capturing the various components of a language tag; we don't need any of that, we just need to know if it's valid. It should be a much simpler regex.
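As a rough sketch of what a simpler check might look like: BCP47 limits every subtag to at most 8 characters, so one option is to tighten the RDF-style regex with length bounds. This is an assumption about what "much simpler" could mean, not the regex this thread settled on. It only checks the overall shape and does not enforce subtag ordering or registry membership. The names `BCP47_SHAPE_RE` and `isWellFormedLanguageTag` are hypothetical.

```javascript
// Loose BCP47 shape check (an approximation, not the full grammar):
// a 1-8 letter primary subtag followed by 1-8 character alphanumeric subtags.
const BCP47_SHAPE_RE = /^[a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})*$/;

// Returns true only for strings that match the shape above.
function isWellFormedLanguageTag(tag) {
  return typeof tag === 'string' && BCP47_SHAPE_RE.test(tag);
}
```

Unlike the RDF-style regex, this rejects a tag such as `abcdefghi`, whose single subtag is 9 letters long, while still accepting `en`, `zh-Hant-TW`, and `x-private`.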
