Namae doesn't parse PHD titles #6

Open
freesteph opened this issue Sep 3, 2014 · 9 comments

Comments

@freesteph

We're using Faker to generate our test names, and I just noticed that Namae cannot parse a PHD title.

Namae.parse("Bernardo Franecki, PhD")

returns an empty array.

After a bit of googling, this looks like a common use case that Namae should handle.

@drwl

drwl commented Sep 4, 2014

I encountered this too. I'm looking at other options, but if nothing proves fruitful I'll try to get around to submitting a pull request.

@freesteph
Author

I've had a look at the code (and the fascinating grammar file) and PhD is included in there, but the normal order, "First Last, PhD", doesn't work. If you switch it around ("PhD, First Last"), it does parse each entity correctly.

Couldn't figure out a proper way to change the parsing order but hopefully it shouldn't be too hard.

@inukshuk
Member

inukshuk commented Sep 9, 2014

I think there are two aspects to this. First, you can configure the lexer pattern for titles at runtime (in case a given spelling of PhD is not detected as a title). Second, we will probably need to add more grammar rules to allow for more variations of where the title can be placed. Will take a look at this now.
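
To illustrate the first point, here is a minimal sketch of widening a title pattern so it accepts more spellings. BASE_TITLE is a stand-in and not Namae's actual default pattern, with_extra_titles is a hypothetical helper, and handing the result to the parser (for example through a :title option) is an assumption that is left as a comment rather than executed.

```ruby
# Stand-in for an existing title pattern; NOT Namae's real default.
BASE_TITLE = /\s*\b(Dr|Prof)(\.|\b)/

# Hypothetical helper: returns a pattern that matches everything the
# base pattern matches, plus the extra spellings given as plain strings.
def with_extra_titles(base, *spellings)
  # Regexp.union escapes the strings, so "Ph.D" only matches a literal dot.
  extra = /\s*\b(#{Regexp.union(spellings).source})(\.|\b)/
  Regexp.union(base, extra)
end

TITLE = with_extra_titles(BASE_TITLE, "PhD", "Ph.D")

# Assumption: the widened pattern would then be passed to the parser as its
# title pattern, e.g. via a :title option. Check the Namae README for the
# exact way to set parser options in your version.
```

This only affects the tokenizer. As noted above, a name written as "First Last, PhD" may still be rejected until the grammar allows a title in that position.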

inukshuk added a commit to inukshuk/namae that referenced this issue Sep 9, 2014
inukshuk added a commit to inukshuk/namae that referenced this issue Sep 14, 2014
@bl00513

bl00513 commented Feb 22, 2019

@inukshuk Is this issue closed? It appears to have been covered in an earlier PR.

@inukshuk
Member

IIRC the parser will still not accept the example above, because it will need additional grammar rules, so let's leave it open while that's still the case.

@drwl

drwl commented Feb 26, 2019

@inukshuk
Member

@drwl No, those specs just test the tokenizer. The tokenizer is not that critical, because you can just update the title patterns at runtime anyway. But some title usage requires new grammar rules. I added some specs here, but they won't pass yet unless I'm forgetting something.

Happy to merge a PR if someone would like to take a stab at this.

@jachwe

jachwe commented Jun 1, 2022

I'd also be very interested in a fix for this.

@jachwe

jachwe commented Jun 7, 2022

As a workaround, if you add PhD to the suffix pattern, e.g. :suffix => /\s*\b(JR|Jr|jr|SR|Sr|sr|[IVX]{2,}|PhD|Phd)(\.|\b)/, then the name is parsed correctly. The only downside is that PhD is recognised as a suffix instead of a title, but at least parsing works.
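
A small pure-Ruby sketch of what the extended suffix pattern from this workaround matches. It does not load Namae. The suffix_in helper is hypothetical and only exercises the regular expression, and the comment about assigning the pattern to the parser's :suffix option describes an assumed usage, not a confirmed API call.

```ruby
# Extended suffix pattern, copied from the workaround above.
SUFFIX = /\s*\b(JR|Jr|jr|SR|Sr|sr|[IVX]{2,}|PhD|Phd)(\.|\b)/

# Hypothetical helper: returns the suffix the pattern finds in a name,
# or nil when there is none. Plain regex matching, no Namae involved.
def suffix_in(name)
  match = SUFFIX.match(name)
  match && match[1]
end

suffix_in("Bernardo Franecki, PhD")   # => "PhD"
suffix_in("Martin Luther King Jr.")   # => "Jr"
suffix_in("Bernardo Franecki")        # => nil

# Assumption: in Namae the pattern would be supplied as the :suffix option
# before calling Namae.parse; how options are set depends on the version.
```

Because the pattern is registered as a suffix, the parsed name would carry "PhD" in its suffix rather than its title, which is the trade-off described above.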
