How to manage multilingual labels #41

Open
nichtich opened this issue Nov 8, 2017 · 2 comments

@nichtich
Contributor

nichtich commented Nov 8, 2017

Is there a way to put labels in multiple languages into one MARC record, e.g. by repeating field 153? If not, should mc2skos provide a method to compare and merge multiple MARC files of the same classification in different languages?

@danmichaelo
Member

Not that I'm aware of. Neither 100 in authority nor 153 in classification is repeatable. There was a discussion paper in 2001 on Multilingual Authority Records recommending separate records for each language. Interestingly, it mentions something called "context markers", which could perhaps also be used to indicate language in a single-record approach, but I'm not sure what happened to that idea. There is mention of a follow-up paper to be prepared "for the midwinter 2002 meeting", but I haven't been able to find it (should have been here I guess).

I've seen model A in use as well. I think GND includes English terms in 4XX fields, but without any language marker, so that's not ideal. We had to prepare a similar file to get our English terms searchable in Primo though. Not sure what the equivalent of 4XX would be in MARC 21 classification.

Merging could be a feature. Not sure if it needs to be part of mc2skos though, or if we can rely on some other RDF tool like riot? If the URIs are based on the classification number or some other common identifier, it should be easy enough to merge the RDF files afterwards, shouldn't it?
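
To illustrate why shared URIs make the merge easy, here is a minimal sketch in Python. It assumes triples are modelled as plain tuples with literals as (text, language) pairs, which is a simplification and not the data model of mc2skos or any RDF library. The URIs and the helper names `merge_graphs` and `labels_by_language` are hypothetical.

```python
# Triples are modelled as plain tuples (subject, predicate, object);
# a literal object is a (text, language) pair. This is an assumption made
# for illustration only; a real workflow would parse the files with an RDF tool.

PREF_LABEL = "skos:prefLabel"


def merge_graphs(*graphs):
    """Merge RDF graphs by taking the set union of their triples."""
    merged = set()
    for graph in graphs:
        merged |= set(graph)
    return merged


def labels_by_language(graph, subject):
    """Collect the preferred labels of one concept, keyed by language tag."""
    return {
        obj[1]: obj[0]
        for subj, pred, obj in graph
        if subj == subject and pred == PREF_LABEL
    }


# Hypothetical concept URI built from a shared classification number.
CONCEPT = "http://example.org/class/001"

german = {
    (CONCEPT, "rdf:type", "skos:Concept"),
    (CONCEPT, PREF_LABEL, ("Wissen", "de")),
}
english = {
    (CONCEPT, "rdf:type", "skos:Concept"),
    (CONCEPT, PREF_LABEL, ("Knowledge", "en")),
}

merged = merge_graphs(german, english)
print(labels_by_language(merged, CONCEPT))
```

Because both files use the same URI for the concept, the duplicate type statement collapses and the two labels end up attached to one resource. If the URIs differed between files, the union would simply contain two unrelated concepts.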

@nichtich
Contributor Author

Thanks for the background and history. So to create a multilingual KOS from MARC, multiple MARC files have to be converted and merged. Merging is easy in RDF, but making sure that all input files align could cause problems. It may be more reliable to have one master file and additional translation files. The latter should only be used for string properties (skos:prefLabel, skos:altLabel, skos:scopeNote, skos:editorialNote, skos:historyNote). My use case is to help get English translations into the RVK classification.

I think a good solution would be an option to only include string properties and a tool/guideline to merge KOS files.

$ mc2skos master.xml master.ttl
$ mc2skos --stringsOnly translation.xml translation.ttl
$ merge master.ttl translation.ttl > multilingual.ttl

Here merge can be replaced by cat for RDF/Turtle ([nd]json needs another mechanism), but some additional checking would be better, to make sure that the translation does not add any concepts not included in the master. In any case, this check would be better placed in another tool, e.g. skosify.
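
A minimal sketch of what such a check could look like, in Python. It assumes triples are represented as plain tuples (subject, predicate, object) rather than parsed from Turtle, and the names `strings_only`, `unknown_subjects` and `merge_translation` are hypothetical. None of this is existing mc2skos or skosify functionality.

```python
# Assumption: a graph is a set of (subject, predicate, object) tuples and
# literals are (text, language) pairs. Real files would first have to be
# parsed with an RDF library.

STRING_PROPERTIES = {
    "skos:prefLabel",
    "skos:altLabel",
    "skos:scopeNote",
    "skos:editorialNote",
    "skos:historyNote",
}


def strings_only(graph):
    """Keep only triples whose predicate is one of the string properties."""
    return {triple for triple in graph if triple[1] in STRING_PROPERTIES}


def unknown_subjects(master, translation):
    """Return subjects used in the translation that the master does not know."""
    known = {subj for subj, _, _ in master}
    return {subj for subj, _, _ in translation if subj not in known}


def merge_translation(master, translation):
    """Add the translation's string properties to the master.

    Raises ValueError if the translation mentions a concept that is
    missing from the master.
    """
    unknown = unknown_subjects(master, translation)
    if unknown:
        raise ValueError(
            f"translation adds concepts not in master: {sorted(unknown)}"
        )
    return set(master) | strings_only(translation)


# Hypothetical example data.
A = "http://example.org/class/001"
B = "http://example.org/class/002"

master = {
    (A, "rdf:type", "skos:Concept"),
    (A, "skos:prefLabel", ("Wissen", "de")),
}
translation = {
    (A, "skos:prefLabel", ("Knowledge", "en")),
    (A, "skos:broader", B),  # structural triple, ignored by the merge
}

multilingual = merge_translation(master, translation)
print(len(multilingual))
```

The structural triple from the translation is dropped, so the hierarchy always comes from the master. A translation that mentions a concept missing from the master raises an error instead of silently extending the classification.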
