Suffixes of V (fifth) or X (tenth) are not parsed correctly #35

Open
a-maas opened this issue Oct 31, 2019 · 1 comment

a-maas commented Oct 31, 2019

If a name ends in the suffix V (fifth), the suffix is parsed as the family name instead:

>> Namae.parse("Adam Burren V")
=> [#<Name family="V" given="Adam Burren">]

This is because the suffix regex is:

/\s*\b(JR|Jr|jr|SR|Sr|sr|[IVX]{2,})(\.|\b)/

The part that accounts for roman numeral suffixes is [IVX]{2,}, which looks for 2 or more characters, while V or X would only be one.
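
To illustrate the point, here is a minimal sketch that applies the quoted pattern on its own with plain Ruby, outside of Namae. The helper name `suffix_in` is made up for this example, and matching the regex in isolation only approximates how the parser's tokenizer would use it.

```ruby
# The suffix pattern quoted above, copied verbatim.
SUFFIX = /\s*\b(JR|Jr|jr|SR|Sr|sr|[IVX]{2,})(\.|\b)/

# Hypothetical helper: return the captured suffix token,
# or nil when the pattern finds none in the name.
def suffix_in(name)
  match = SUFFIX.match(name)
  match && match[1]
end

# Multi-character numerals are captured; a lone V or X is not,
# because [IVX]{2,} requires at least two characters.
%w[II III IV V X].each do |numeral|
  name = "Adam Burren #{numeral}"
  puts "#{name.inspect} => #{suffix_in(name).inspect}"
end
```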

Perhaps this is intentional, since matching a single character could produce a lot of false positives, but I wanted to open an issue and ask.

inukshuk (Member) commented

If I remember correctly, your guess is right: we chose this default pattern to avoid false positives from matching initials. All these patterns are configurable (e.g., via Namae::Parser.instance.options for the singleton instance), so if you expect those kinds of suffixes (or do not expect initials) you can just set a more suitable pattern.
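
As a rough sketch of what a more suitable pattern might look like, the hypothetical regex below keeps the default alternatives and additionally accepts a lone V or X only when nothing but an optional period and whitespace follows it. The names `END_ANCHORED_SUFFIX` and `captured_suffix` are invented for this example, the `:suffix` option key is an assumption that has not been checked against the Namae source, and the checks exercise the regex in isolation rather than the parser itself.

```ruby
# Hypothetical replacement pattern (not taken from Namae): the default
# alternatives, plus a lone V or X that is followed only by an optional
# period and whitespace before the end of the string.
END_ANCHORED_SUFFIX = /\s*\b(JR|Jr|jr|SR|Sr|sr|[IVX]{2,}|[VX](?=\.?\s*\z))(\.|\b)/

# Hypothetical helper: return the suffix token the pattern captures, or nil.
def captured_suffix(pattern, name)
  match = pattern.match(name)
  match && match[1]
end

['Adam Burren V', 'Adam Burren III', 'V. S. Naipaul', 'Malcolm X'].each do |name|
  puts "#{name.inspect} => #{captured_suffix(END_ANCHORED_SUFFIX, name).inspect}"
end

# Assumption: the singleton's options hash stores the pattern under :suffix.
# This line only runs if the namae gem has already been required.
Namae::Parser.instance.options[:suffix] = END_ANCHORED_SUFFIX if defined?(Namae)
```

With this pattern a leading initial such as the V in "V. S. Naipaul" is no longer captured, but a trailing single-letter family name such as the X in "Malcolm X" still is, so the trade-off described above does not go away.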
