diff --git a/Historical corpora/Historical corpora.html b/Historical corpora/Historical corpora.html
index 91d9b83..1a94344 100644
--- a/Historical corpora/Historical corpora.html
+++ b/Historical corpora/Historical corpora.html
@@ -49,28 +49,26 @@
- The Diorisis Ancient Greek Corpus
+ Greek Medieval Texts
- Size: 10.2 million words
-
- Annotation: PoS-tagged, lemmatised
+ Size: 3.4 million words
- Licence: CC BY 4.0
+ Licence: CC-BY
This corpus consists of 820 texts spanning from the beginnings of the Ancient Greek literary tradition (Homer) to the fifth century AD.
-The texts are sourced from the Perseus Canonical Greek Lit Repository, "The Little Sailing" digital library, and the Bibliotheca Augustana digital library.
-The corpus is available for download from Figshare.
+This corpus contains texts from the 4th to the 16th century.
+The texts belong to the following categories: religious, poetical-literary, political, and historical texts, as well as hymns and epigrams.
+The corpus is available for download from the clarin:el repository.
-For the relevant publication, see Vatri and McGillivray (2018)
+- Greek Medieval Texts + The Diorisis Ancient Greek Corpus
- Size: 3.4 million words
+ Size: 10.2 million words
- Licence: CC-BY
+ Annotation: PoS-tagged, lemmatised
+
+ Licence: CC BY 4.0
This corpus contains texts from the 4th to the 16th century.
-The texts belong to the following categories: religious, poetical-literary, political, and historical texts, as well as hymns and epigrams.
-The corpus is available for download from the clarin:el repository.
+This corpus consists of 820 texts spanning from the beginnings of the Ancient Greek literary tradition (Homer) to the fifth century AD.
+The texts are sourced from the Perseus Canonical Greek Lit Repository, "The Little Sailing" digital library, and the Bibliotheca Augustana digital library.
+The corpus is available for download from Figshare.
- +For the relevant publication, see Vatri and McGillivray (2018).
- The Nottingham Corpus of Early Modern German Midwifery and Women's Medicine (ca. 1500-1700)
+ GerManC. A Historical Corpus of German Newspapers 1650-1800
- Size: 120,000 tokens
+ Size: 700,000 words
- Annotation: TEI Lite markup, no linguistic annotation
+ Annotation: no annotation
Licence: CC-BY-NC-SA 3.0
This corpus contains medical writing from 1500 to 1700.
-The texts are taken primarily from digital facsimile copies available online via the University of Würzburg’s library interface, particularly from the subcategory of pertaining to gynaecology.
+This corpus contains personal letters, sermons and fictional, scholarly (i.e., humanities), scientific and legal texts from 1650 to 1800.
The corpus is available for download from the Oxford Text Archive.
- GerManC. A Historical Corpus of German Newspapers 1650-1800
+ Mannheimer Korpus Historischer Zeitungen und Zeitschriften
- Size: 700,000 words
+ Size: 3532 pages
+
+This corpus contains texts from the 18th and 19th centuries.
+The corpus is available for download directly through the VLO.
+ + ++ Referenzkorpus Mittelhochdeutsch (Middle High German Reference Corpus) +
+
+ Size: 2.5 million tokens
- Annotation: no annotation
+ Annotation: tokenised, PoS-tagged, lemmatised, normalised, morphosyntactic description
- Licence: CC-BY-NC-SA 3.0
+ Licence: CC-BY-SA 4.0
This corpus contains personal letters, sermons and fictional, scholarly (i.e., humanities), scientific and legal texts from 1650 to 1800.
-The corpus is available for download from the Oxford Text Archive.
+This corpus contains texts from 1050 to 1350.
+The corpus is available for download from the Deutsches Text Archiv and through a concordancer.
- +For the relevant publication, see Klein and Dipper (2016).
- Mannheimer Korpus Historischer Zeitungen und Zeitschriften
+ SaCoCo—Saarbrücken Cookbook Corpus
- Size: 3532 pages
+ Size: 436,000 tokens
+
+ Annotation: PoS-tagged using the STTS tagset, lemmatised, normalised
+
+ Licence: CC-BY-NC-SA-3.0
This corpus contains texts from the 18th and 19th centuries.
-The corpus is available for download directly through the VLO.
+This corpus contains historical cookbook recipes from 1569 to 1800, as well as contemporary ones from 2012.
+The corpus is available through the CQPweb concordancer provided by CLARIN-D.
- Referenzkorpus Mittelhochdeutsch (Middle High German Reference Corpus)
+ The Nottingham Corpus of Early Modern German Midwifery and Women's Medicine (ca. 1500-1700)
- Size: 2.5 million tokens
+ Size: 120,000 tokens
- Annotation: tokenised, PoS-tagged, lemmatised, normalised, morphosyntactic description
+ Annotation: TEI Lite markup, no linguistic annotation
- Licence: CC-BY-SA 4.0
+ Licence: CC-BY-NC-SA 3.0
This corpus contains texts from 1050 to 1350.
-The corpus is available for download from the Deutsches Text Archiv and through a concordancer.
+This corpus contains medical writing from 1500 to 1700.
+The texts are taken primarily from digital facsimile copies available online via the University of Würzburg’s library interface, particularly from the subcategory pertaining to gynaecology.
+The corpus is available for download from the Oxford Text Archive.
-For the relevant publication, see Klein and Dipper (2016).
+- SaCoCo—Saarbrücken Cookbook Corpus -
-
- Size: 436,000 tokens
-
- Annotation: PoS-tagged using the STTS tagset, lemmatised, normalised
-
- Licence: CC-BY-NC-SA-3.0
-
This corpus contains historical cookbook recipes from 1569 to 1800, as well as contemporary ones from 2012.
-The corpus is available through the CQPweb concordancer provided by CLARIN-D.
- - -+ CIPM +
+
+ Size: 3.5 million words
+
+ Licence: CC-BY-NC-ND
+
This is a corpus of historical, religious, notarial, and literary texts in prose and verse.
+The corpus is available from PORTULAN.
+ + ++ Portuguese Parish Memories (1758) +
++ Licence: CC BY +
+This is a corpus of historical surveys from the 18th century.
+The corpus is available from PORTULAN.
+ + +