Commit

fixed JSON syntax errors
kreetrapper committed Aug 29, 2024
1 parent 4da4563 commit 8559294
Showing 50 changed files with 62 additions and 62 deletions.
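Syntax errors of the kind fixed in this commit can be caught mechanically before they are committed. The following is a minimal sketch, not part of the repository: the function name `find_invalid_json` is hypothetical, and it assumes the files are UTF-8 encoded and live under a single root directory such as `corpora`.

```python
import json
from pathlib import Path


def find_invalid_json(root):
    """Return (path, message) pairs for every .json file under root that fails to parse.

    Hypothetical helper for illustration; it is not part of the repository.
    """
    failures = []
    for path in sorted(Path(root).rglob("*.json")):
        try:
            # json.loads raises JSONDecodeError on the first syntax error it meets.
            json.loads(path.read_text(encoding="utf-8"))
        except json.JSONDecodeError as err:
            failures.append((path, f"line {err.lineno}, column {err.colno}: {err.msg}"))
    return failures
```

Calling `find_invalid_json("corpora")` from the repository root would list each file that still fails to parse, along with the position of its first error. The parser stops at the first problem in a file, so a file with several errors may need more than one pass.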
2 changes: 1 addition & 1 deletion corpora/academic-corpora/aca-hum.json
@@ -9,7 +9,7 @@
"Annotation": [],
"Infrastructure": "CLARIN",
"Access": {
"Concordancer": "https://spraakbanken.gu.se/korp/?corpus=sweachum"
"Concordancer": "https://spraakbanken.gu.se/korp/?corpus=sweachum",
"Download": "http://hdl.handle.net/10794/49"
},
"Publication":""
2 changes: 1 addition & 1 deletion corpora/academic-corpora/aca-soc.json
@@ -9,7 +9,7 @@
"Annotation": ["sentence segmentation"],
"Infrastructure": "CLARIN",
"Access": {
"Concordancer": "https://spraakbanken.gu.se/korp/?corpus=sweacsam"
"Concordancer": "https://spraakbanken.gu.se/korp/?corpus=sweacsam",
"Download": "http://hdl.handle.net/10794/50"
},
"Publication":""
4 changes: 2 additions & 2 deletions corpora/academic-corpora/jezkor.json
@@ -9,8 +9,8 @@
"Annotation": ["PoS-tagged (UD)", "MSD-tagged (UD & MULTEXT-East)", "lemmatised", "annotated for named entities and author/text metadata"],
"Infrastructure": "CLARIN",
"Access": {
"Concordancer (noSketchEngine)": "https://www.clarin.si/ske/#dashboard?corpname=jezkor"
"Concordancer (KonText)": "https://www.clarin.si/kontext/query?corpname=jezkor"
"Concordancer (noSketchEngine)": "https://www.clarin.si/ske/#dashboard?corpname=jezkor",
"Concordancer (KonText)": "https://www.clarin.si/kontext/query?corpname=jezkor",
"Download": "http://hdl.handle.net/11356/1755"
},
"Publication":""
6 changes: 3 additions & 3 deletions corpora/academic-corpora/open-slo.json
@@ -2,15 +2,15 @@
"Name": "Corpus of scientific texts from the Open Science Slovenia portal OSS 1.0",
"URL": "http://hdl.handle.net/11356/1774",
"Family": "Academic corpora",
"Description": "This corpus contains a large collection of scientific writing in the Slovenian language gathered from the <a href="https://openscience.si">Open Science Slovenia portal</a>. It consists of over 150 thousand monographs, articles, diploma, master's and doctoral theses, advanced textbooks, reviews etc. mostly published between 2000 and 2022 by Slovenian universities, research institutions, etc. Texts are accompanied by metadata, i.e. author, supervisor (for theses), year of publication, publisher (mostly faculties of the various universities), type of publication (according to SICRIS classification), keywords, and CERIF and UDC codes. The texts were obtained directly from PDFs, so it should be noted that they can contain various types of character noise. The texts are linguistically annotated with the <a href=\"https://github.com/clarinsi/classla\">CLASSLA pipeline</a> on the levels lemmatisation, MULTEXT-East Version 6 morphosyntactic descriptions, Universal Dependencies part-of-spech and morphological features, and named entities. The corpus is distributed in CoNLL-U and vertical file formats, one file for each text. The text metadata is given as a TSV file.\nNote that there exist similar, but older and smaller corpora <a href=\"http://hdl.handle.net/11356/1448\">KAS 2.0</a> and <a href=\"http://hdl.handle.net/11356/1244\">KAS 1.0</a>. These contain only theses and only up to 2018, but are cleaner and with more metadata. The repository also archives a number of KAS-derived datasets; pls. search for "KAS" to find them.\nThe corpus is available for download from the CLARIN.SI repository as well as for online browsing through the noSketch Engine and KonText concordancers.",
"Description": "This corpus contains a large collection of scientific writing in the Slovenian language gathered from the <a href=\"https://openscience.si\">Open Science Slovenia portal</a>. It consists of over 150 thousand monographs, articles, diploma, master's and doctoral theses, advanced textbooks, reviews etc. mostly published between 2000 and 2022 by Slovenian universities, research institutions, etc. Texts are accompanied by metadata, i.e. author, supervisor (for theses), year of publication, publisher (mostly faculties of the various universities), type of publication (according to SICRIS classification), keywords, and CERIF and UDC codes. The texts were obtained directly from PDFs, so it should be noted that they can contain various types of character noise. The texts are linguistically annotated with the <a href=\"https://github.com/clarinsi/classla\">CLASSLA pipeline</a> on the levels lemmatisation, MULTEXT-East Version 6 morphosyntactic descriptions, Universal Dependencies part-of-spech and morphological features, and named entities. The corpus is distributed in CoNLL-U and vertical file formats, one file for each text. The text metadata is given as a TSV file.\nNote that there exist similar, but older and smaller corpora <a href=\"http://hdl.handle.net/11356/1448\">KAS 2.0</a> and <a href=\"http://hdl.handle.net/11356/1244\">KAS 1.0</a>. These contain only theses and only up to 2018, but are cleaner and with more metadata. The repository also archives a number of KAS-derived datasets; pls. search for \"KAS\" to find them.\nThe corpus is available for download from the CLARIN.SI repository as well as for online browsing through the noSketch Engine and KonText concordancers.",
"Languages": ["slv"],
"License": "CC BY-SA",
"Size": ["326 million tokens"],
"Annotation": ["PoS-tagged (UD)", "MSD-tagged (UD & MULTEXT-East)", "lemmatised", "annotated for named entities and author/text metadata"],
"Infrastructure": "CLARIN",
"Access": {
"Concordancer (noSketchEngine)": "https://www.clarin.si/ske/#dashboard?corpname=oss10"
"Concordancer (KonText)": "https://www.clarin.si/kontext/query?corpname=oss10"
"Concordancer (noSketchEngine)": "https://www.clarin.si/ske/#dashboard?corpname=oss10",
"Concordancer (KonText)": "https://www.clarin.si/kontext/query?corpname=oss10",
"Download": "http://hdl.handle.net/11356/1774"
},
"Publication":""
2 changes: 1 addition & 1 deletion corpora/academic-corpora/roysoc.json
@@ -9,7 +9,7 @@
"Annotation": ["PoS-tagged", "lemmatised", "normalised", "author and document metadata"],
"Infrastructure": "CLARIN",
"Access": {
"Concordancer": "http://fedora.clarin-d.uni-saarland.de/rsc_v4/access.html#cqpweb"
"Concordancer": "http://fedora.clarin-d.uni-saarland.de/rsc_v4/access.html#cqpweb",
"Download": "http://fedora.clarin-d.uni-saarland.de/rsc_v4/access.html#download"
},
"Publication": "https://www.zotero.org/groups/562080/items/FWYERQ4A"
@@ -3,7 +3,7 @@
"URL": "https://sla.talkbank.org/TBB/dementia",
"Family": "Corpora of Disordered Speech",
"Description": "This is a corpus of multimedia interactions for the study of communication in dementia.\nAccess to the data in DementiaBank is password protected and restricted to members of the DementiaBank consortium group.\nData in TalkBank use a consistent XML-compatible representation called CHAT. All of the data is transcribed in CHAT and CA/CHAT formats.",
"Languages": ["eng", "deu", "cmn", "spa", "nan" (Taiwanese)],
"Languages": ["eng", "deu", "cmn", "spa", "Taiwanese"],
"License": "email request for access",
"Size": [],
"Annotation": ["CHAT and CA/CHAT"],
2 changes: 1 addition & 1 deletion corpora/corpora-of-disordered-speech/polish-cued.json
@@ -9,7 +9,7 @@
"Annotation": ["CHAT format"],
"Infrastructure": "CLARIN",
"Access": {
"Download": ""https://hdl.handle.net/1839/dbcd8568-d17d-4861-94bb-aa553e943399
"Download": "https://hdl.handle.net/1839/dbcd8568-d17d-4861-94bb-aa553e943399"
},
"Publication": ""
}
2 changes: 1 addition & 1 deletion corpora/historical-corpora/anno-cuneiform.json
@@ -8,7 +8,7 @@
"Size": ["1,600,563 tokens"],
"Annotation": ["tokenised", "lemmatised", "PoS-tagged", "semantically annotated"],
"Access": {
"Concordancer": "http://urn.fi/urn:nbn:fi:lb-2019060601"
"Concordancer": "http://urn.fi/urn:nbn:fi:lb-2019060601",
"Download": "http://urn.fi/urn:nbn:fi:lb-2019111602"
},
"Publication": ""
2 changes: 1 addition & 1 deletion corpora/historical-corpora/b4-hist-preach.json
@@ -8,7 +8,7 @@
"Size": ["92,500 tokens"],
"Annotation": ["tokenised", "syntactic and discursive annotation"],
"Access": {
"Concordancer": "http://annis.corpora.uni-hamburg.de:8080/gui/sfb632"
"Concordancer": "http://annis.corpora.uni-hamburg.de:8080/gui/sfb632",
"Download": "http://hdl.handle.net/11022/0000-0000-9B23-A"
},
"Publication": ""
2 changes: 1 addition & 1 deletion corpora/historical-corpora/b4-ludolf.json
@@ -8,7 +8,7 @@
"Size": ["6,690 tokens"],
"Annotation": ["tokenised", "tagged for clause type and grammatical function"],
"Access": {
"Concordancer": "http://annis.corpora.uni-hamburg.de:8080/gui/sfb632"
"Concordancer": "http://annis.corpora.uni-hamburg.de:8080/gui/sfb632",
"Download": "http://hdl.handle.net/11022/0000-0000-9B22-B"
},
"Publication": ""
2 changes: 1 addition & 1 deletion corpora/historical-corpora/b4-tatian.json
@@ -8,7 +8,7 @@
"Size": ["11,300 tokens"],
"Annotation": ["tokenised", "MSD-tagged"],
"Access": {
"Concordancer": "http://annis.corpora.uni-hamburg.de:8080/gui/sfb632"
"Concordancer": "http://annis.corpora.uni-hamburg.de:8080/gui/sfb632",
"Download": "http://hdl.handle.net/11022/0000-0000-9B1E-1"
},
"Publication": ""
2 changes: 1 addition & 1 deletion corpora/historical-corpora/dig-hist-slovene.json
@@ -8,7 +8,7 @@
"Size": ["17.7 million tokens"],
"Annotation": ["tokenised", "lemmatised", "PoS-tagged"],
"Access": {
"Concordancer": "https://www.clarin.si/kontext/first_form?corpname=imp"
"Concordancer": "https://www.clarin.si/kontext/first_form?corpname=imp",
"Download": "http://hdl.handle.net/11356/1031"
},
"Publication": "Erjavec (2015)."
2 changes: 1 addition & 1 deletion corpora/historical-corpora/ecco-tcp.json
@@ -8,7 +8,7 @@
"Size": ["74 million tokens"],
"Annotation": ["no linguistic annotation"],
"Access": {
"Concordancer": "https://quod.lib.umich.edu/e/ecco/"
"Concordancer": "https://quod.lib.umich.edu/e/ecco/",
"Download": "https://textcreationpartnership.org/tcp-texts/ecco-tcp-eighteenth-century-collections-online/"
},
"Publication": ""
2 changes: 1 addition & 1 deletion corpora/historical-corpora/gysseling.json
@@ -8,7 +8,7 @@
"Size": ["1.5 million words"],
"Annotation": ["PoS-tagged", "lemmatised"],
"Access": {
"Concordancer": "https://corpusgysseling.ivdnt.org/corpus-frontend/Gysseling/search/"
"Concordancer": "https://corpusgysseling.ivdnt.org/corpus-frontend/Gysseling/search/",
"Download": "http://hdl.handle.net/10032/tm-a2-j4"
},
"Publication": ""
2 changes: 1 addition & 1 deletion corpora/historical-corpora/helsinki-eng.json
@@ -3,7 +3,7 @@
"URL": "http://hdl.handle.net/20.500.14106/1477",
"Family": "Historical corpora",
"Description": "This corpus contains religious and fictional texts from 730 to 1710.\nSee <a href=\"http://icame.uib.no/hc/#con2\">the project page</a> for a list of all the texts included in the corpus.\nThe corpus is available for download from the Oxford Text Archive.",
"Languages": [English (Old and Middle)],
"Languages": ["English (Old)", "English (Middle)"],
"License": "Oxford Text Archive licence",
"Size": ["240,000 words"],
"Annotation": [],
4 changes: 2 additions & 2 deletions corpora/historical-corpora/hist-am-eng.json
@@ -3,8 +3,8 @@
"URL": "http://urn.fi/urn:nbn:fi:lb-2017061925",
"Family": "Historical corpora",
"Description": "This corpus contains texts from 1810 to 2009.\nEach decade has roughly the same balance of fiction, popular magazine, newspaper, and non-fiction books.\nThe corpus is available through the concordancer Korp.",
"Languages": [English (American)],
"License": "CLARN ACA",
"Languages": ["English (American)"],
"License": "CLARIN ACA",
"Size": ["385 million tokens"],
"Annotation": ["tokenised"],
"Access": {
2 changes: 1 addition & 1 deletion corpora/historical-corpora/late-modern-en-texts.json
@@ -3,7 +3,7 @@
"URL": "http://hdl.handle.net/21.11119/0000-0002-43F3-0",
"Family": "Historical corpora",
"Description": "This corpus contains texts written by British and Irish authors from 1710 to 1920.\nIn terms of genre, the texts correspond to narrative fiction and non-fiction, drama, letters, treatises, and miscellaneous written works.\nThe corpus is available for download from a CLARIN-D repository. ",
"Languages": [English (Late Modern)],
"Languages": ["English (Late Modern)"],
"License": "CC-BY-NC-SA 4.0",
"Size": ["34 million words"],
"Annotation": ["PoS-tagged"],
2 changes: 1 addition & 1 deletion corpora/historical-corpora/latinise.json
@@ -8,7 +8,7 @@
"Size": ["13.3 million tokens"],
"Annotation": ["sentence segmented", "PoS-tagged", "lemmatized"],
"Access": {
"Concordancer": "https://app.sketchengine.eu/#dashboard?corpname=preloaded%2Flatinise_4"
"Concordancer": "https://app.sketchengine.eu/#dashboard?corpname=preloaded%2Flatinise_4",
"Download": "http://hdl.handle.net/11372/LRT-3170"
},
"Publication": "McGillivray and Kilgarriff (2015)"
4 changes: 2 additions & 2 deletions corpora/historical-corpora/menota.json
@@ -3,12 +3,12 @@
"URL": "http://clarino.uib.no/menota/page",
"Family": "Historical corpora",
"Description": "This corpus contains Medieval Nordic texts.\nThe corpus is available for download and through the concordancer Corpuscle.",
"Languages": [Old Norse],
"Languages": ["Old Norse"],
"License": "CC-BY",
"Size": ["1.6 million tokens"],
"Annotation": ["tokenised", "MSD-tagged", "lemmatised"],
"Access": {
"Concordancer": "http://clarino.uib.no/menota/concordance"
"Concordancer": "http://clarino.uib.no/menota/concordance",
"Download": "http://clarino.uib.no/menota/catalogue"
},
"Publication": ""
4 changes: 2 additions & 2 deletions corpora/historical-corpora/old-bailey.json
@@ -3,12 +3,12 @@
"URL": "http://hdl.handle.net/11858/00-246C-0000-0023-8CFB-2",
"Family": "Historical corpora",
"Description": "This corpus contains proceedings of the Old Bailey (i.e., legal documents) from 1674 to 1913.\nThe corpus is available for download from the CLARIN-D repository and through the CQPConcordancer.\nFor the corpus manual, see <a href=\"https://www.clarin.eu/resource-families/historical-corpora#Huber%20et%20al.%202016\">Huber et al. (2016).</a>",
"Languages": [English (Late Modern)],
"Languages": ["English (Late Modern)"],
"License": "CC-BY-NC-SA 4.0",
"Size": ["134 million words"],
"Annotation": ["detailed sociobiographical, pragmatic and textual annotation"],
"Access": {
"Concordancer": "http://corpora.clarin-d.uni-saarland.de/cqpweb"
"Concordancer": "http://corpora.clarin-d.uni-saarland.de/cqpweb",
"Download": "http://fedora.clarin-d.uni-saarland.de/oldbailey/downloads.html"
},
"Publication": ""
2 changes: 1 addition & 1 deletion corpora/historical-corpora/old-hungarian.json
@@ -8,7 +8,7 @@
"Size": ["3 million tokens"],
"Annotation": ["tokenised", "partially normalized", "partially MSD-tagged"],
"Access": {
"Concordancer": "http://oldhungariancorpus.nytud.hu/en-search.html"
"Concordancer": "http://oldhungariancorpus.nytud.hu/en-search.html",
"Download": "http://oldhungariancorpus.nytud.hu/en-codices.html"
},
"Publication": ""
2 changes: 1 addition & 1 deletion corpora/historical-corpora/parsed-hist-pt.json
@@ -8,7 +8,7 @@
"Size": ["3.3 million"],
"Annotation": ["tokenised", "PoS-tagged (2 million)", "treebanked (1.2 million)"],
"Access": {
"Concordancer": "http://www.tycho.iel.unicamp.br/~tycho/corpus/texts/csquery/en/csquery.html"
"Concordancer": "http://www.tycho.iel.unicamp.br/~tycho/corpus/texts/csquery/en/csquery.html",
"Download": "http://www.tycho.iel.unicamp.br/~tycho/corpus/en/index.html"
},
"Publication": ""
4 changes: 2 additions & 2 deletions corpora/historical-corpora/poldilemma.json
@@ -1,7 +1,7 @@
{
"Name": ""PolDiLemma" Middle Polish Diachrone Lemmatised Corpus",
"Name": "\"PolDiLemma\" Middle Polish Diachrone Lemmatised Corpus",
"URL": "http://hdl.handle.net/11858/00-246C-0000-0023-8C44-B",
"Family": "Historical corpora",
"Family": "Historical corpora",
"Description": "This corpus contains political, religious and scientific texts from the 16th to the 18th century.\nThe corpus is available for download from the CLARIN-D repository.",
"Languages": ["ces","lat","deu","pol"],
"License": "CC BY-NC-SA 4.0",
2 changes: 1 addition & 1 deletion corpora/historical-corpora/ref-hist-slovene.json
@@ -8,7 +8,7 @@
"Size": ["300,000 tokens"],
"Annotation": ["manually tokenised", "lemmatised", "PoS-tagged", "modern synonyms for archaic words"],
"Access": {
"Concordancer": "https://www.clarin.si/kontext/first_form?corpname=goo300k"
"Concordancer": "https://www.clarin.si/kontext/first_form?corpname=goo300k",
"Download": "http://hdl.handle.net/11356/1025"
},
"Publication": "Erjavec (2012)."
2 changes: 1 addition & 1 deletion corpora/historical-corpora/ref-mhd.json
@@ -8,7 +8,7 @@
"Size": ["2.5 million tokens"],
"Annotation": ["tokenised", "PoS-tagged", "lemmatised", "normalised", "morphosyntactic description"],
"Access": {
"Concordancer": "http://www.deutschestextarchiv.de/"
"Concordancer": "http://www.deutschestextarchiv.de/",
"Download": "http://deutschestextarchiv.de/rem/"
},
"Publication": "Klein and Dipper (2016)."
2 changes: 1 addition & 1 deletion corpora/historical-corpora/ref-mid-low-de.json
@@ -8,7 +8,7 @@
"Size": ["200,700 tokens"],
"Annotation": ["tokenised", "MSD-tagged"],
"Access": {
"Concordancer": "http://annis.corpora.uni-hamburg.de:8080/gui/#_c=UmVOXzIwMTctMDYtMTU"
"Concordancer": "http://annis.corpora.uni-hamburg.de:8080/gui/#_c=UmVOXzIwMTctMDYtMTU",
"Download": "http://hdl.handle.net/11022/0000-0007-C64C-5"
},
"Publication": "Schröder (2014)."
2 changes: 1 addition & 1 deletion corpora/historical-corpora/roysoc-corp.json
@@ -8,7 +8,7 @@
"Size": ["35 million tokens"],
"Annotation": ["PoS-tagged using PennTreebank tagset", "lemmatised", "normalised"],
"Access": {
"Concordancer": "http://fedora.clarin-d.uni-saarland.de/rsc_v4/access.html#cqpweb"
"Concordancer": "http://fedora.clarin-d.uni-saarland.de/rsc_v4/access.html#cqpweb",
"Download": "http://fedora.clarin-d.uni-saarland.de/rsc_v4/access.html#download"
},
"Publication": ""
4 changes: 2 additions & 2 deletions corpora/historical-corpora/saga.json
@@ -3,12 +3,12 @@
"URL": "https://clarin.is/en/resources/sagacorpus/",
"Family": "Historical corpora",
"Description": "This corpus contains Old Icelandic (Old Norse) Narrative texts from the 13th to the 15th century.\nThe corpus is available for download from CLARIN-IS and for search through the concordancer Korp.",
"Languages": [Icelandic (Old)],
"Languages": ["Icelandic (Old)"],
"License": "CC-BY 4.0",
"Size": ["1.5 million tokens"],
"Annotation": ["tokenised", "PoS-tagged", "lemmatised", "normalized orthography"],
"Access": {
"Concordancer": "https://malheildir.arnastofnun.is/?mode=fornrit#?lang=en&stats_reduce=word&isCaseInsensitive&searchBy=word&cqp=%5B%5D"
"Concordancer": "https://malheildir.arnastofnun.is/?mode=fornrit#?lang=en&stats_reduce=word&isCaseInsensitive&searchBy=word&cqp=%5B%5D",
"Download": "http://www.malfong.is/index.php?dlid=2&lang=en"
},
"Publication": "Rögnvaldsson and Helgadóttir (2011)"
2 changes: 1 addition & 1 deletion corpora/historical-corpora/sprakbanken-hist.json
@@ -3,7 +3,7 @@
"URL": "https://spraakbanken.gu.se/korp/?mode=all_hist#?lang=en&stats_reduce=word&cqp=%5B%5D",
"Family": "Historical corpora",
"Description": "This collection of corpora contains – among others – diachronic legal texts, Bible translations, medieval letters, digitized newspapers from the Swedish National Library and 19th century fiction from the Swedish Literature Bank.\nThe corpora are available through the concordancer Korp.",
"Languages": [FIXME Swedish, German, French and others],
"Languages": ["swe", "deu", "fra", "and others"],
"License": "CC-BY",
"Size": ["1.34 billion tokens"],
"Annotation": ["tokenised", "PoS-tagged", "lemmatised", "syntactically parsed", "word sense (for materials more recent than 1800)"],
4 changes: 2 additions & 2 deletions corpora/historical-corpora/yu1parl.json
@@ -8,8 +8,8 @@
"Size": ["34,542 utterances", "578,958 sentences", "13,271,885 words", "15,403 pages"],
"Annotation": ["tokenised", "MSD-tagged", "lemmatised"],
"Access": {
"Concordancer (noSketch)": "https://www.clarin.si/ske/#dashboard?corpname=yu1parl"
"Concordancer (KonText)": "https://www.clarin.si/kontext/query?corpname=yu1parl"
"Concordancer (noSketch)": "https://www.clarin.si/ske/#dashboard?corpname=yu1parl",
"Concordancer (KonText)": "https://www.clarin.si/kontext/query?corpname=yu1parl",
"Download": "http://hdl.handle.net/11356/1845"
},
"Publication": ""
2 changes: 1 addition & 1 deletion corpora/literary-corpora/anth-me.json
@@ -3,7 +3,7 @@
"URL": "http://hdl.handle.net/20.500.14106/1398",
"Family": "Literary corpora",
"Description": "This corpus contains literary texts from 1100 to 1400.\nThe corpus is available for download from the Oxford Text Archive.",
"Languages": ["enm"), "heb"],
"Languages": ["enm", "heb"],
"License": "Oxford Text Archive Licence",
"Size": ["4,000 words"],
"Annotation": [],
2 changes: 1 addition & 1 deletion corpora/literary-corpora/bonnier-one.json
@@ -8,7 +8,7 @@
"Size": ["6,578,675 tokens", "462,625 sentences"],
"Annotation": ["sentence scrambling"],
"Access": {
"Browse": "https://spraakbanken.gu.se/korp/#corpus=romi"
"Browse": "https://spraakbanken.gu.se/korp/#corpus=romi",
"Download": "http://hdl.handle.net/10794/115"
},
"Publication": ""
2 changes: 1 addition & 1 deletion corpora/literary-corpora/bonnier-two.json
@@ -8,7 +8,7 @@
"Size": ["4,304,271 tokens", "298,361 sentences"],
"Annotation": ["sentence scrambling"],
"Access": {
"Browse": "https://spraakbanken.gu.se/korp/#corpus=romii"
"Browse": "https://spraakbanken.gu.se/korp/#corpus=romii",
"Download": "http://hdl.handle.net/10794/116"
},
"Publication": ""
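Most of the fixes shown in this commit fall into a few recurring patterns: a missing comma between object members, unescaped double quotes inside a string, and array items that were not quoted as strings. The sketch below reproduces each pattern on a simplified fragment and checks that a standard JSON parser rejects the broken form and accepts the repaired one. The fragments are shortened for illustration, the `example.org` URLs are placeholders, and the names `EXAMPLES`, `is_valid_json`, and `check_examples` are hypothetical.

```python
import json

# Each entry pairs a simplified broken fragment with its repaired form.
# The example.org URLs are placeholders, not values from the repository.
EXAMPLES = {
    "missing comma between members": (
        '{"Concordancer": "http://example.org/search"\n "Download": "http://example.org/data"}',
        '{"Concordancer": "http://example.org/search",\n "Download": "http://example.org/data"}',
    ),
    "unescaped quotes inside a string": (
        r'{"Name": ""PolDiLemma" Middle Polish Diachrone Lemmatised Corpus"}',
        r'{"Name": "\"PolDiLemma\" Middle Polish Diachrone Lemmatised Corpus"}',
    ),
    "unquoted array item": (
        '{"Languages": [Old Norse]}',
        '{"Languages": ["Old Norse"]}',
    ),
}


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


def check_examples(examples):
    """Map each label to a (broken_is_valid, fixed_is_valid) pair."""
    return {
        label: (is_valid_json(broken), is_valid_json(fixed))
        for label, (broken, fixed) in examples.items()
    }
```

For every label, `check_examples(EXAMPLES)` should report `(False, True)`: the broken fragment is rejected and the repaired fragment parses.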