Commit

v0.9.0 (#148)
* improve coverage for teiHeader

- add monogr to core module (needed for biblStruct)
- add forename and surname to namesdates for more fine-grained author/persName

* add editorialDecl

Needed for LexSeal-compatible information on encoding

* add email for LexSeal-compatibility in teiHeader

* make availability required in publicationStmt

* allow biblStruct only in sourceDesc

* make publicationStmtPart.agency unbound in publicationStmt

we need this so that we can add both authority and publisher; we agreed that authority should be used for encoding various rights holders

* add @role to <authority>

so that we can specify what kind of authority we're talking about; for instance <authority role="rightsHolder"></authority>

* change requirements around langUsage

- add @role and a closed value list to language (objectLanguage, workingLanguage, sourceLanguage, targetLanguage)
- require language, langUsage and profileDesc

* various improvements

- specify suggested values for authority/@role
- add narrative documentation for TEI Header
- correct various misspellings
- add different css style for <val>
- add Boris as contributor

* temporarily remove VOLP example

to clarify editorial roles and date of publication with Ana
ttasovac authored Sep 26, 2021
1 parent 0eba67d commit fb889b9
Showing 14 changed files with 1,758 additions and 487 deletions.
43 changes: 42 additions & 1 deletion Schemas/TEILex0/TEILex0.odd
@@ -108,9 +108,15 @@
<surname>Hildenbrandt</surname>
</persName>
</author>
<author xml:id="BL" role="contributing">
<persName>
<forename>Boris</forename>
<surname>Lehečka</surname>
</persName>
</author>
</titleStmt>
<editionStmt>
<edition n="0.8.6"/>
<edition n="0.9.0"/>
</editionStmt>
<publicationStmt>
<authority>DARIAH Working Group on Lexical Resources</authority>
@@ -123,6 +129,40 @@
<p>Born digital</p>
</sourceDesc>
</fileDesc>
<revisionDesc>
<listChange>
<listChange n="0.9.0">
<change when="2021-09-27" type="docs">add section on <ref target="#header">TEI
Header</ref></change>
<change type="docs">correction of various misspellings</change>
<change type="spec">add <gi>monogr</gi> (needed for
<gi>biblStruct</gi>)</change>
<change type="spec">add <gi>forename</gi> and <gi>surname</gi> for more
fine-grained bibliographic information</change>
<change type="spec">add <gi>editorialDecl</gi></change>
<change type="spec">add <gi>email</gi> to allow contact information in
the header</change>
<change type="spec">require <gi>availability</gi> in <gi>publicationStmt</gi> to
provide <gi>licence</gi></change>
<change type="spec">make <gi>sourceDesc</gi> optional</change>
<change type="spec">allow only <gi>biblStruct</gi> in
<gi>sourceDesc</gi></change>
<change type="spec">make <ref target="#TEI.model.publicationStmtPart.agency"
>model.publicationStmtPart.agency</ref> unbound to allow both
<gi>publisher</gi> and <gi>authority</gi> in
<gi>publicationStmt</gi></change>
<change type="spec">add <att>role</att> to <gi>authority</gi> with suggested
values: <val>funder</val>, <val>sponsor</val>,
<val>rightsHolder</val></change>
<change type="spec">require <gi>language</gi>, <gi>langUsage</gi> and
<gi>profileDesc</gi></change>
<change type="spec">add <att>role</att> to <gi>language</gi> with a closed list
of values: <val>objectLanguage</val>, <val>workingLanguage</val>,
<val>sourceLanguage</val>, <val>targetLanguage</val></change>

</listChange>
</listChange>
</revisionDesc>
</teiHeader>
<text>
<front>
@@ -131,6 +171,7 @@
<body>
<xi:include href="TEILex0.parts/01__introduction.xml" parse="xml"
xpointer="introduction"/>
<xi:include href="TEILex0.parts/015__header.xml" parse="xml" xpointer="header"/>
<xi:include href="TEILex0.parts/02__entries.xml" parse="xml" xpointer="entries"/>
<xi:include href="TEILex0.parts/03__forms.xml" parse="xml" xpointer="forms"/>
<xi:include href="TEILex0.parts/04__senses.xml" parse="xml" xpointer="senses"/>
225 changes: 225 additions & 0 deletions Schemas/TEILex0/TEILex0.parts/015__header.xml
@@ -0,0 +1,225 @@
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/css" href="../css/tei_oxygen.css"?>
<?xml-stylesheet type="text/css" href="../css/tei.lex0.odd.css"?>
<div xmlns="http://www.tei-c.org/ns/1.0" xmlns:xi="http://www.w3.org/2001/XInclude" xml:id="header">
<head>Header</head>
<div>
<head>General remarks</head>
<p>A lexical resource encoded in TEI Lex-0 must, like any TEI file, start with a
<gi>teiHeader</gi> element. A TEI header contains information about the lexical resource
itself, its source(s), its encoding, and its revisions. Proper, structured metadata of this
kind is equally important for scholars using the resource, for software processing it,
and for cataloguers in libraries and archives.</p>
<p>The TEI header of a lexical resource has five major parts:<list>
<item>a <hi rend="italic">file description</hi>, tagged <gi>fileDesc</gi>, provides a
full bibliographic description of the electronic lexical resource itself as well as
the source(s), analogue or digital, from which it may have been derived. For details,
see section <ref target="#file-description">File Description</ref> below. </item>
<item>an <hi rend="italic">encoding description</hi>, tagged <gi>encodingDesc</gi>,
describes the relationship between the electronic resource and its source(s). It
allows for detailed description of whether (or how) the electronic resource was
produced, transcribed or normalized, how the encoder resolved ambiguities in the
source, what levels of encoding or analysis were applied etc.</item>
<item>a <hi rend="italic">profile description</hi>, tagged <gi>profileDesc</gi>,
contains classificatory and contextual information about the lexical resource
including its object and working languages.</item>
<item>a <hi rend="italic">container for external metadata</hi>, tagged
<gi>xenoData</gi>, contains metadata from non-TEI schemas, for instance Dublin Core,
MARCXML or MODS, if available.</item>
<item>a <hi rend="italic">revision history</hi>, tagged <gi>revisionDesc</gi>, contains
a list of changes made during the development of the lexical resource, both before
and after its official release. </item>
</list></p>
<p>Of these, two elements are required in TEI Lex-0: <gi>fileDesc</gi> and
<gi>profileDesc</gi>. It is highly recommended to include additional information in
<gi>encodingDesc</gi>. It is also good practice to record changes in
<gi>revisionDesc</gi>. </p>
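<p>As a minimal sketch of how these parts fit together, the skeleton below shows a header with
the two required elements and the two recommended ones. All titles, names, dates and the
licence are invented placeholders, and the fragment has not been validated against the TEI
Lex-0 schema.</p>
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<teiHeader>
<!-- required: bibliographic description of the resource -->
<fileDesc>
<titleStmt>
<title type="full">A Sample Dictionary (placeholder title)</title>
</titleStmt>
<publicationStmt>
<publisher>Sample Publisher (placeholder)</publisher>
<availability>
<licence target="https://creativecommons.org/licenses/by/4.0/">CC BY 4.0</licence>
</availability>
</publicationStmt>
<!-- sourceDesc left out here on the assumption that the resource is born digital -->
</fileDesc>
<!-- recommended: how the resource was encoded -->
<encodingDesc>
<editorialDecl>
<p>Placeholder description of the editorial principles.</p>
</editorialDecl>
</encodingDesc>
<!-- required: languages used in the resource -->
<profileDesc>
<langUsage>
<language ident="en" role="objectLanguage">English</language>
</langUsage>
</profileDesc>
<!-- good practice: record of changes -->
<revisionDesc>
<change when="2021-09-27">Placeholder entry for the first release.</change>
</revisionDesc>
</teiHeader>
</egXML>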
</div>
<div xml:id="file-description">
<head>File description</head>
<p>The bibliographic description of a machine-readable lexical resource is essential for
identifying the resource itself, its creators and publishers, and the conditions under
which it is made available to the public. </p>
<p> The elements that make up <gi>fileDesc</gi> are: <specList>
<specDesc key="titleStmt"/>
<specDesc key="editionStmt"/>
<specDesc key="extent"/>
<specDesc key="publicationStmt"/>
<specDesc key="seriesStmt"/>
<specDesc key="sourceDesc"/>
</specList>
</p>
<p><gi>fileDesc</gi> is a mandatory element in plain TEI as well, but in TEI Lex-0 there are
some additional constraints and recommendations related to the content of this element.</p>
<p>
<list>
<item>In <gi>titleStmt</gi>, TEI Lex-0 <hi rend="italic">recommends</hi> the use of
<att>type</att> on <gi>title</gi> (with values either <val>full</val> or
<val>abbr</val>) to record both the full bibliographic title of the lexicographic
resource and the preferred abbreviated title for easy reference, should one exist. </item>
<item>In <gi>titleStmt</gi>, TEI Lex-0 <hi rend="italic">recommends</hi> the use of
<gi>persName</gi> and <gi>orgName</gi> to distinguish between the names of persons
and organizations. This is especially important because, in some cases, the name of an
institution stands for the collective authorship of a work.</item>
<item>When using <gi>persName</gi>, TEI Lex-0 <hi rend="italic">recommends</hi>
further structuring the name with the elements <gi>forename</gi> and
<gi>surname</gi>.</item>
<item>In <gi>publicationStmt</gi>, TEI Lex-0 <hi rend="italic">requires</hi> the use of
<gi>availability</gi> to record the <gi>licence</gi> of the given lexicographic
resource. In other words, a TEI Lex-0 resource <hi rend="italic">must</hi> include explicit
information on the conditions under which the given resource can be used. </item>
<item>In addition to <gi>publisher</gi> and <gi>distributor</gi>, the
<gi>publicationStmt</gi> in TEI Lex-0 <hi rend="italic">may</hi> include
information on any other <gi>authority</gi> responsible for creating or making the
resource available. </item>
<item>If using <gi>authority</gi>, TEI Lex-0 <hi rend="italic">requires</hi> the use of
<att>role</att>; the suggested values are <val>funder</val>, <val>sponsor</val> and
<val>rightsHolder</val>.</item>
<item>In TEI Lex-0, <gi>sourceDesc</gi> is an optional element. Born-digital resources
or those which cannot be properly sourced do not require a <gi>sourceDesc</gi>. </item>
<item>If a resource <hi rend="italic">is</hi> sourced, <gi>sourceDesc</gi> in TEI Lex-0
requires the use of <gi>biblStruct</gi> for structuring bibliographic information
about the source(s). This is a departure from plain TEI, which is more permissive in
this respect. </item>
</list>
</p>
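<p>The <gi>publicationStmt</gi> constraints listed above can be sketched as follows. The
organization names are hypothetical placeholders; only the element names, the
<att>role</att> values and the required <gi>availability</gi> are taken from the list above.
The fragment illustrates how a <gi>publisher</gi> and more than one <gi>authority</gi> may
appear side by side, and has not been validated against the schema.</p>
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<publicationStmt>
<publisher>
<orgName>Example Academy of Sciences</orgName>
</publisher>
<!-- hypothetical funder -->
<authority role="funder">
<orgName>Example Research Council</orgName>
</authority>
<!-- hypothetical rights holder -->
<authority role="rightsHolder">
<orgName>Example Academy of Sciences</orgName>
</authority>
<!-- availability with a licence is required in TEI Lex-0 -->
<availability>
<licence target="https://creativecommons.org/licenses/by/4.0/">Creative Commons
Attribution 4.0 International (CC BY 4.0)</licence>
</availability>
</publicationStmt>
</egXML>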
<!--<list type="examples">
<item>-->
<!-- <egXML xmlns="http://www.tei-c.org/ns/Examples" xml:id="fd-port" source="#VOLP">
<fileDesc>
<titleStmt>
<title type="full">Vocabulário Ortográfico da Língua Portuguesa [em linha]</title>
<title type="abbr">VOLP-1940</title>
<editor>
<persName>
<forename>Ana</forename>
<surname>Salgado</surname>
</persName>
</editor>
</titleStmt>
<publicationStmt>
<publisher>
<orgName>Academia das Ciências de Lisboa</orgName>
</publisher>
<publisher>
<orgName>
<orgName>Imprensa Nacional de Lisboa</orgName>
</orgName>
</publisher>
<authority role="funder">
<orgName>Fundo de Apoio à Comunidade Científica (FACC) — FCT</orgName>
</authority>
<availability>
<licence target="https://creativecommons.org/licenses/by/4.0/">Creative Commons
Attribution 4.0 International (CC BY 4.0)</licence>
</availability>
</publicationStmt>
<sourceDesc>
<biblStruct>
<monogr>
<title>Vocabulário Ortográfico da Língua Portuguesa</title>
<author>
<orgName>Academia das Ciências</orgName>
</author>
<imprint>
<publisher>Imprensa Nacional de Lisboa</publisher>
<date>1940</date>
</imprint>
<extent>1 volume</extent>
<extent>821 pp.</extent>
</monogr>
</biblStruct>
</sourceDesc>
</fileDesc>
</egXML>-->
<!-- </item>
<item>-->
<egXML xmlns="http://www.tei-c.org/ns/Examples" source="#EtymWB-XML">
<fileDesc>
<titleStmt>
<title type="full">Etymologisches Wörterbuch des Deutschen: Die XML-Edition</title>
<title type="abbr">EtymWB-XML</title>
<editor xml:id="WP">
<persName>
<forename>Wolfgang</forename>
<surname>Pfeifer</surname>
</persName>
</editor>
<!-- ... -->
<respStmt xml:id="OT">
<resp>annotated by</resp>
<persName>
<forename>Oxana</forename>
<surname>Tsunykowa</surname>
</persName>
</respStmt>
<respStmt xml:id="YW">
<resp>annotated by</resp>
<persName>
<forename>Yvonne</forename>
<surname>Wirkus</surname>
</persName>
</respStmt>
<respStmt xml:id="LL">
<resp>restructured by</resp>
<persName>
<forename>Lothar</forename>
<surname>Lemnitzer</surname>
</persName>
</respStmt>
<respStmt xml:id="AH">
<resp>restructured by</resp>
<persName>
<forename>Axel</forename>
<surname>Herold</surname>
</persName>
</respStmt>
</titleStmt>
<editionStmt>
<edition>
<title>XML encoded version</title>
<date>2006-11-17</date>
</edition>
</editionStmt>
<publicationStmt>
<publisher>Berlin-Brandenburgische Akademie der Wissenschaften</publisher>
<publisher>Digitales Wörterbuch der deutschen Sprache</publisher>
<authority role="funder">Berlin-Brandenburgische Akademie der Wissenschaften</authority>
<pubPlace>Berlin</pubPlace>
<date>2009</date>
<availability>
<licence>Copyright 2002-2009</licence>
</availability>
</publicationStmt>
<sourceDesc>
<biblStruct>
<monogr>
<author>Wolfgang Pfeifer</author>
<title>Etymologisches Wörterbuch des Deutschen</title>
<edition>2</edition>
<imprint>
<publisher>Akademie Verlag</publisher>
<pubPlace>Berlin</pubPlace>
<date>1993</date>
<note>with additional notes by the author</note>
</imprint>
</monogr>
</biblStruct>
</sourceDesc>
</fileDesc>
</egXML>
<!-- </item>
</list>-->
</div>
<div xml:id="profile-description">
<head>Profile description</head>
<p>In plain TEI, <gi>profileDesc</gi> is an optional element, whereas in TEI Lex-0, it is
required. This is because the nature of lexicographic resources is such that it is essential
to identify and record the language(s) used as part of the resource metadata. </p>
<p>For this reason, <gi>profileDesc</gi> requires <gi>langUsage</gi>, and <gi>langUsage</gi> requires
at least one <gi>language</gi> element.</p>
<p>Regarding the use of the required attribute <att>role</att> and its possible values
(<val>objectLanguage</val>, <val>workingLanguage</val>, <val>sourceLanguage</val> or
<val>targetLanguage</val>), see the specification details for <gi>language</gi>. </p>
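<p>A minimal sketch of a <gi>profileDesc</gi> for a hypothetical bilingual German-English
dictionary is given below. The pairing of <val>sourceLanguage</val> and
<val>targetLanguage</val> is an assumption about how these values are meant to be combined;
the specification for <gi>language</gi> remains the authority on their exact meaning.</p>
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<profileDesc>
<langUsage>
<!-- assumed: language of the headwords -->
<language ident="de" role="sourceLanguage">German</language>
<!-- assumed: language of the translation equivalents -->
<language ident="en" role="targetLanguage">English</language>
</langUsage>
</profileDesc>
</egXML>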
</div>
</div>
5 changes: 3 additions & 2 deletions Schemas/TEILex0/TEILex0.parts/01__introduction.xml
@@ -32,7 +32,7 @@
<div>
<head>DARIAH Working Group</head>
<p>The DARIAH Working Group on Lexical Resources is a self-organized scholarly community
working under the auspicies of the pan-European Digital Research Infrastructure for
working under the auspices of the pan-European Digital Research Infrastructure for
Arts and Humanities (DARIAH-EU). The goals of the WG are: <list>
<item>to explore, assess and recommend standard tools and methods for the
creation, application and dissemination of born-digital and retro-digitized
@@ -141,7 +141,7 @@
the support of the German Ministry of Education and Research (BMBF), CLARIN
and DARIAH-DE. For an overview, check out this <ref
target="https://digilex.hypotheses.org/386">blog post</ref>.</item>
<item><ref target="https://lexmc18.sciencesconf.org">Lexcal Data Masterclass
<item><ref target="https://lexmc18.sciencesconf.org">Lexical Data Masterclass
2018</ref>. Co-organized by DARIAH, the Berlin Brandenburg Academy of
Sciences (BBAW), Inria and the Belgrade Center for Digital Humanities, with
the support of the German Ministry of Education and Research (BMBF), French
@@ -218,6 +218,7 @@
<head>The guidelines</head>
<!-- <p>TODO: How to use these guidelines. </p>-->
<divGen type="how-to-cite"></divGen>
<divGen type="revision-history"></divGen>
</div>

</div>
4 changes: 2 additions & 2 deletions Schemas/TEILex0/TEILex0.parts/02__entries.xml
@@ -62,8 +62,8 @@
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<profileDesc>
<langUsage>
<language ident="mix">Mixtepec Mixtec</language>
<language ident="mix-x-YCNY">Yucanany Mixtec</language>
<language ident="mix" role="objectLanguage">Mixtepec Mixtec</language>
<language ident="mix-x-YCNY" role="objectLanguage">Yucanany Mixtec</language>
</langUsage>
</profileDesc>
</egXML>
2 changes: 1 addition & 1 deletion Schemas/TEILex0/TEILex0.parts/03__forms.xml
@@ -147,7 +147,7 @@
indistinguishable from <gi>entry</gi> itself. Eventually, the new content model of
<gi>entry</gi>, which allows nesting, was adopted by TEI itself (<ref
target="#Tasovac2020">Tasovac 2020</ref>).</p>
<p>TODO: explain different types of mwe's from a dict. model perspective refering to <ref
<p>TODO: explain different types of mwe's from a dict. model perspective referring to <ref
target="#Tasovac2020">Tasovac 2020</ref>)</p>
<div>
<head>Collocations</head>
2 changes: 1 addition & 1 deletion Schemas/TEILex0/TEILex0.parts/05__cross-references.xml
@@ -106,7 +106,7 @@
<head>Related</head>
<p> The default reference to another lexical unit when no more granular information
about the type of relationship is available.</p>
<p>In TEI Lex-0, cross-references are by default enoded as <code>&lt;xr
<p>In TEI Lex-0, cross-references are by default encoded as <code>&lt;xr
type=&quot;related&quot;&gt;&lt;/xr&gt;</code>.</p>
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<xi:include href="../TEILex0.examples/examples.stripped.xml" corresp="../TEILex0.examples/examples.xml" xpointer="borcht"/>