Releases: autotyp/autotyp-data
AUTOTYP 1.1.1
AUTOTYP 1.1.0
The main changes in this release are:
- A new CLDF export, kindly contributed by Robert Forkel
- Revamp of the verb synthesis datasets, which have been renamed to MaximallyInflectedVerbSynthesis (see below)
- Enhancements to GrammaticalRelations
Important: this release introduces breaking changes to the Morphology module.
CLDF export
A CLDF export of the AUTOTYP database was one of the most requested features since the new release.
We are very thankful to Robert Forkel for writing the code that generates a CLDF dataset of
AUTOTYP data. Starting from AUTOTYP 1.1.0 there is a CLDF dataset in data. The accompanying
Python scripts used to generate this export can be found in the repository
autotyp/autotyp-cldf-scripts.
Revamp of verb synthesis datasets in the Morphology module
We wanted to clarify that the verb synthesis dataset in the Morphology module describes
maximally inflected verb forms only, rather than all attested verb forms. To make this more
obvious, we have renamed the affected datasets to include the phrase MaximallyInflected and
updated the variable metadata to clarify this point. Thus:
- VerbSynthesis becomes MaximallyInflectedVerbSynthesis
- VerbInflectionAndAgreementCountsByPosition becomes MaximallyInflectedVerbInflectionAndAgreementCountsByPosition
- VerbAgreementAggregatedByMarkerPosition becomes MaximallyInflectedVerbAgreementAggregatedByMarkerPosition
- VerbInflectionCategoriesAggregatedByMarkerPosition becomes MaximallyInflectedVerbInflectionCategoriesAggregatedByMarkerPosition
- etc.
This is a breaking change.
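Scripts written against the 1.0.x dataset names can be bridged with a small lookup table. The following is a minimal sketch, not part of the AUTOTYP distribution: the mapping contains only the renames spelled out above (the notes indicate there are more), and the helper name `current_dataset_name` is hypothetical.

```python
# Old -> new dataset names, limited to the renames listed in these notes.
# The release notes say "etc.", so this table is incomplete by design.
RENAMED_DATASETS = {
    "VerbSynthesis": "MaximallyInflectedVerbSynthesis",
    "VerbInflectionAndAgreementCountsByPosition":
        "MaximallyInflectedVerbInflectionAndAgreementCountsByPosition",
    "VerbAgreementAggregatedByMarkerPosition":
        "MaximallyInflectedVerbAgreementAggregatedByMarkerPosition",
    "VerbInflectionCategoriesAggregatedByMarkerPosition":
        "MaximallyInflectedVerbInflectionCategoriesAggregatedByMarkerPosition",
}


def current_dataset_name(name: str) -> str:
    """Return the 1.1.0 name for a dataset; names not in the table pass through unchanged."""
    return RENAMED_DATASETS.get(name, name)
```

Passing an unlisted name through unchanged keeps the helper safe to apply to every dataset reference in an older script, at the cost of not flagging renames that are missing from the table.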
Enhancements to GrammaticalRelations
Multiple new variables that describe properties of grammatical relations have been added to
the GrammaticalRelations dataset. Please refer to the list of changes below.
autotyp-data 1.0.1
AUTOTYP 1.0.1
This is a bugfix release that focuses on JSON output and improving metadata for variables of type
value list. Notable changes:
- improved JSON output
- improved and corrected the metadata for multiple variables of the type value list
- improved the bibliography data, added Glottolog language and reference IDs (many thanks to
Robert Forkel for doing this work)
- minor data fixes (duplicate entries in datasets Alienability, Gender and NumeralClassifiers)
Many thanks to Robert Forkel for reporting many of these issues and curating the bibliography
files!
autotyp-data 1.0.0
This is a completely new release, radically overhauled from the earlier 0.1.x version, and focuses on usability, documentation, and completeness. New features include:
- Over 260 typological variables that describe 1,319 languages across approximately 260,000 data points or, together with the derived (aggregated) data, over 1,700,000 data points.
- New naming conventions for datasets and variables, focusing on usability and clarity.
- Language name and Glottolog code now accompany every dataset, so each dataset is a self-standing table of a typological variable (but can also be linked to any and all of the others via the internal language ID).
- Published data now includes the raw exported database data as well as derived aggregated tables. All aggregation scripts used to compute derived data are published as well.
- New R and JSON exports for users who prefer those environments.
For a complete list of major new features see:
https://github.com/autotyp/autotyp-data/blob/v1.0.0/CHANGES-1.0.0.md
For general information about the database:
https://github.com/autotyp/autotyp-data/blob/v1.0.0/readme.md
Please cite as
Bickel, Balthasar, Nichols, Johanna, Zakharko, Taras, Witzlack-Makarevich, Alena, Hildebrandt, Kristine, Rießler, Michael, Bierkandt, Lennart, Zúñiga, Fernando & Lowe, John B. 2022. The AUTOTYP database (v1.0.0). https://doi.org/10.5281/zenodo.5931509
autotyp-data 0.1.2
- General:
  - Multiple data points updated
- Register:
  - New languages added: Musqueam, Blagar, Kaitetu, Kol
autotyp-data 0.1.1
- General:
  - Multiple data points updated
- Register:
  - Glottocodes updated to Glottolog 4.2, previous glottocode field renamed to Glottocode.2014
  - New languages added: Anal, Baima, Daai Chin, Hakhun, Khroskyabs, Konyak Naga, Liangmai Naga,
    Lolopo, Maram Naga, Phola, Phuza, South Estonian, Tagin, Ugong, Wusa Nasu, Yacham-Tengsa, Zbu, Zhaba
- Grammatical_markers:
  - Exponence field is now a semicolon-separated concatenation of feature values; the values
    are now pre-sorted
  - New language descriptions added: Avá-Canoeiro, Cocama, Guajá, Guarayu, Kaiwá, Kayabí,
    Pai Tavytera, Parakanã, Tocantins Asurini, Wayampi
- Synthesis:
  - New language descriptions added: Amdo, Byangsi, Jugli, Khaling, Khroskyabs, Kulung, Mizo (Lushai),
    Palula, Rongpo, Saami (Kildin), Saami (Pite), Saami (South), Sheko, Sidaama, Thulung (Mukli),
    Tibetan (Classical), Zaiwa
  - Fixed an unfortunate typo in a variable name (VInlfCatSurveyComplete -> VInflCatSurveyComplete)
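Because the Exponence field in Grammatical_markers is now a single semicolon-separated string, consumers need to split it back into individual feature values. A minimal sketch follows; the helper name `split_exponence` and the sample values in the usage note are illustrative assumptions, not taken from the data.

```python
def split_exponence(value: str) -> list[str]:
    """Split a semicolon-separated Exponence string into its feature values.

    Surrounding whitespace is stripped and empty segments are dropped.
    The order of the input is preserved; per the release notes the values
    arrive pre-sorted, so no sorting is done here.
    """
    return [part.strip() for part in value.split(";") if part.strip()]
```

For a hypothetical value such as `"case;number"`, the helper returns `["case", "number"]`; an empty string yields an empty list.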
Initial release
Initial release of AUTOTYP data