How do we determine which name of an actor is the preferred one? #49

Open
yeozy95 opened this issue Oct 13, 2022 · 2 comments

Comments

@yeozy95

yeozy95 commented Oct 13, 2022

DDL has previously just taken the first instance of an actor's name in our database as the "preferred" version. Do we have stricter logic for the OpenClimate Schema?

@evanp
Contributor

evanp commented Oct 28, 2022

It's a matter of choice, and primarily for UI.

I think the preferred name would be the one we show by default for a given language, with the "name" column in Actor as the fallback when there are no matches.
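A minimal sketch of that lookup, assuming a hypothetical `ActorName` record with a `preferred` flag. The field names and the `BO` identifier are illustrative and not taken from the OpenClimate schema:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ActorName:
    """Hypothetical name record; field names are assumptions."""
    actor_id: str
    name: str
    language: str
    preferred: bool = False


def display_name(actor_id: str, default_name: str,
                 names: List[ActorName], language: str) -> str:
    """Return the preferred name for `language`, else the Actor fallback."""
    for n in names:
        if n.actor_id == actor_id and n.language == language and n.preferred:
            return n.name
    # No preferred match for this language: fall back to Actor's "name" column.
    return default_name


names = [
    ActorName("BO", "Plurinational State of Bolivia", "en"),
    ActorName("BO", "Bolivia", "en", preferred=True),
]
print(display_name("BO", "Plurinational State of Bolivia", names, "en"))  # Bolivia
```

A request for a language with no preferred entry (for example `"fr"` with the data above) simply returns whatever was passed as the fallback.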

Preference might include these factors, in no particular order:

  • Readability (Cocos Islands is more readable than Territory of Cocos (Keeling) Islands, The)
  • Briefness (Bolivia is briefer than Plurinational State of Bolivia)
  • Frequency of use
  • Official use (example: Mumbai vs. Bombay, Alphabet Inc. vs. Google, Deere & Company vs. John Deere)
  • Current use
  • Word order
  • Clarity (example: United Kingdom is clearer than UK, North Korea is clearer than DPRK)

I think for 99% of actors, we'll have only one or two names in any particular language. In general, I'd probably defer to Wikipedia or Wikidata for names, since they have a lot of editors and contributors who put in a lot of time discussing the best name to use!

@evanp
Contributor

evanp commented Oct 28, 2022

Maybe we should change this to a Q score ranging from 0 to 1, where 1.0 is the best name to use and 0 is the worst. If an Actor has lots of names, use the one with the best score.
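Roughly, score-based selection could look like the sketch below. The `ScoredName` type, its `q` field, and the score values are all made up for illustration; nothing here reflects an existing column in the schema:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScoredName:
    """Hypothetical name record carrying a quality score in [0.0, 1.0]."""
    name: str
    language: str
    q: float  # 1.0 = best name to use, 0.0 = worst


def best_name(names: List[ScoredName], language: str, fallback: str) -> str:
    """Pick the highest-scoring name in `language`, else return `fallback`."""
    candidates = [n for n in names if n.language == language]
    if not candidates:
        return fallback
    # max() keeps the first candidate on ties, so input order breaks ties.
    return max(candidates, key=lambda n: n.q).name


uk_names = [
    ScoredName("UK", "en", 0.4),              # illustrative score
    ScoredName("United Kingdom", "en", 0.9),  # illustrative score
]
print(best_name(uk_names, "en", "United Kingdom of Great Britain and Northern Ireland"))
```

Compared with a single boolean flag, a score leaves room for several acceptable names per language and makes the choice a plain ordering problem.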
