Fails to detect name parts if suffix precedes comma #14

Open
scarver2 opened this issue Sep 8, 2015 · 2 comments

Comments

scarver2 commented Sep 8, 2015

These "last_name suffix, first_name" names generate an empty array.
Gump Jr., Bubba
Gump Jr., Bubba B.
Gump II, Bubba
Gump Jr., B.B.

However, when split-reverse-joined, the names parse correctly.
Bubba Gump Jr.
Bubba B. Gump Jr.
Bubba Gump II
B.B. Gump Jr.

This is not foolproof, because some names are formatted "first_name last_name, suffix". For those, my conversion code below would actually make the suffix the first name.

The code I used to correct the above names:

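require 'namae'
require 'active_support/core_ext/string/filters' # String#squish comes from ActiveSupport
name = 'Gump Jr., Bubba B.'
# Move the part after the comma to the front, then collapse extra whitespace.
coerced_name = name.split(',').reverse.join(' ').squish # => "Bubba B. Gump Jr."
namae = Namae.parse(coerced_name)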
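
To illustrate the caveat above, here is a minimal sketch of a guard that leaves "first_name last_name, suffix" strings alone. It is plain Ruby and does not call Namae. The `SUFFIXES` pattern and the `coerce_sort_order` helper are hypothetical names invented for this example, and the suffix list is an assumption, not the set Namae's tokenizer recognizes.

```ruby
# Hypothetical suffix list for illustration only; not Namae's tokenizer.
SUFFIXES = /\A(?:Jr\.?|Sr\.?|II|III|IV)\z/i

# Reorder "last[ suffix], first" into "first last[ suffix]".
# If there is no comma, or the part after the comma is only a suffix,
# the (stripped) input is returned unchanged.
def coerce_sort_order(name)
  head, tail = name.split(',', 2).map(&:strip)
  return name.strip if tail.nil? || tail.empty? || tail.match?(SUFFIXES)
  "#{tail} #{head}"
end

coerce_sort_order('Gump Jr., Bubba B.') # => "Bubba B. Gump Jr."
coerce_sort_order('Gump II, Bubba')     # => "Bubba Gump II"
coerce_sort_order('Bubba Gump, Jr.')    # => "Bubba Gump, Jr."
```

This only works around the problem on the caller's side. It still depends on a hand-maintained suffix list, so it is a stopgap rather than a fix in the parser.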

inukshuk commented Sep 8, 2015

We could probably add those patterns to the display order grammar. Because there is no comma between the last name and the suffix, this will of course only work for suffixes that are recognized by the tokenizer (which in this case I think they would be).

Do you think you can do that?

Just splitting on commas is not a general solution, because we need to support multiple last names, which are quite common in languages like Spanish or Portuguese.
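
As a small illustration of the point about multiple last names, the sketch below applies the same split-reverse-join idea to a two-part family name. It is plain Ruby with no Namae calls; the example name and the variable names are chosen for illustration. Once the parts are joined, the comma that marked where the family name ends is gone, so nothing in the string says whether the middle word is a second given name or part of the family name.

```ruby
name = 'García Márquez, Gabriel'

# Split once on the comma: the family part keeps both last names together.
family, given = name.split(',', 2).map(&:strip)
# family => "García Márquez"
# given  => "Gabriel"

# Joining the parts back into display order drops that boundary.
reordered = [given, family].join(' ')
# reordered => "Gabriel García Márquez"
```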

@rdimartino

Also just noticed this. We saw the failure with Gump Jr., Bubba but also with a nickname, e.g. Gump, Bubba "Shrimp King"
