Namae doesn't parse PHD titles #6

freesteph · 2014-09-03T15:31:09Z

We're using Faker to generate our test names, and I just noticed that Namae cannot parse a PHD title.

Namae.parse("Bernardo Franecki, PhD")

returns an empty array.

After a bit of googling, it looks like this is a real-world use case that Namae should handle.

drwl · 2014-09-04T21:03:26Z

I encountered this too. I'm looking at other options, but if nothing proves fruitful I'll try to get around to submitting a pull request.

freesteph · 2014-09-08T10:57:42Z

I've had a look at the code (and the fascinating grammar file) and PhD is included in there, but the usual order ("First Last, PhD") doesn't work. If you switch it around ("PhD, First Last"), it does parse each entity correctly.

Couldn't figure out a proper way to change the parsing order but hopefully it shouldn't be too hard.

inukshuk · 2014-09-09T12:23:13Z

I think there are two aspects to this. First, you can configure the lexer pattern for titles at runtime (in case a given spelling of PhD is not detected as a title). Second, we will probably need to add more grammar rules to allow for more variations of where the title can be placed. I'll take a look at this now.
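To illustrate the first aspect, here is a minimal sketch in plain Ruby for checking which spellings a candidate title pattern would catch before handing it to the parser. The pattern `CANDIDATE_TITLE` and the helper `title_token?` are hypothetical and are not Namae's built-in defaults; that such a pattern would be assigned to a `:title` option is also an assumption, modelled on the `:suffix` option mentioned later in this thread.

```ruby
# Hypothetical title pattern for illustration only; NOT Namae's default.
# Assumption: a pattern like this would be assigned to the parser's
# :title option at runtime (not shown here, and not verified).
CANDIDATE_TITLE = /\b(dr|prof|ph\.?d)\.?(\s+|$)/i

# Hypothetical helper: true when the whole token matches the pattern.
def title_token?(token, pattern = CANDIDATE_TITLE)
  m = pattern.match(token)
  !m.nil? && m.pre_match.empty? && m.post_match.empty?
end

# Try a few spellings, plus one ordinary given name as a negative case.
%w[PhD Ph.D. PHD phd Philip].each do |token|
  puts format("%-7s %s", token, title_token?(token) ? "title" : "not a title")
end
```

This only exercises the tokenizer side. It says nothing about whether the grammar accepts a title in a given position, which is the second aspect.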

see berkmancenter#6

bl00513 · 2019-02-22T16:28:10Z

@inukshuk Is this issue closed? It appears to have been covered in an earlier PR.

inukshuk · 2019-02-26T15:25:54Z

IIRC the parser will still not accept the example above, because it will need additional grammar rules, so let's leave it open while that's still the case.

drwl · 2019-02-26T15:57:37Z

Is it not resolved by https://github.com/inukshuk/namae/blob/master/spec/namae/parser_spec.rb#L97 ?

inukshuk · 2019-02-26T16:19:16Z

@drwl No, those specs just test the tokenizer. The tokenizer is not that critical, because you can just update the title patterns at runtime anyway. But some title usage requires new grammar rules. I added some specs here, but they won't pass yet unless I'm forgetting something.

Happy to merge a PR if someone would like to take a stab at this.

jachwe · 2022-06-01T21:43:05Z

I'd also be very interested in a fix for this.

jachwe · 2022-06-07T07:06:35Z

As a workaround, you can add PhD to the suffix pattern, e.g.

:suffix => /\s*\b(JR|Jr|jr|SR|Sr|sr|[IVX]{2,}|PhD|Phd)(\.|\b)/

Then the name is parsed correctly. The only downside is that PhD is recognised as a suffix instead of a title, but at least parsing works.
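As a quick sanity check of the workaround pattern, the following sketch runs the regex from the comment above against a few spellings using plain Ruby only. The constant `SUFFIX_WITH_PHD` and the helper `suffix_in` are hypothetical names introduced for illustration. How the pattern is passed to Namae is not shown.

```ruby
# The suffix pattern quoted in the workaround, extended with PhD and Phd.
SUFFIX_WITH_PHD = /\s*\b(JR|Jr|jr|SR|Sr|sr|[IVX]{2,}|PhD|Phd)(\.|\b)/

# Hypothetical helper: returns the captured suffix, or nil when none is found.
def suffix_in(name, pattern = SUFFIX_WITH_PHD)
  m = pattern.match(name)
  m && m[1]
end

[
  "Bernardo Franecki, PhD",
  "Bernardo Franecki, Phd",
  "Bernardo Franecki, PHD"
].each do |name|
  puts "#{name.inspect} => #{suffix_in(name).inspect}"
end
```

The pattern is case-sensitive, so the all-caps spelling "PHD" is not captured unless it is added as a further alternative.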

inukshuk added a commit to inukshuk/namae that referenced this issue Sep 9, 2014

add failing specs for titles

8926885

see berkmancenter#6

inukshuk added a commit to inukshuk/namae that referenced this issue Sep 14, 2014

improve title parsing (wip)

d197d2e

see berkmancenter#6
