This repo contains a Text-Fabric dataset of the Ugaritic text corpus. It is a work in progress.
This dataset is developed as part of the CACCHT project, a collaboration of Christian Canu Højgaard, Martijn Naaijer, Martin Ehrensvärd, Robert Rezetko, Oliver Glanz, and Willem van Peursen. The goal of CACCHT is to prepare and digitally publish ancient Semitic texts so that they can be used for research.
For this dataset, we cooperate with Tania Notarius (University of the Free State) and with the volunteer assistants Lynn Strietzel and Maria Simion (Polis - the Jerusalem Institute of Language and Humanities).
The following tablets of Die keilalphabetischen Texte aus Ugarit (KTU) are currently available:
- KTU 1.1-1.7
- KTU 1.14-1.22
- KTU 2.5-2.18
- KTU 2.20-2.27
- KTU 2.30-2.32
- KTU 2.34-2.44
- KTU 2.46-2.75
- KTU 2.77-2.80
- KTU 2.82-2.100
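To check whether a given tablet falls inside the coverage listed above, the ranges can be written out as data. The following is a minimal sketch, not part of the dataset: the name `is_available` is hypothetical, and it assumes that each entry such as "KTU 2.82-2.100" is an inclusive range of text numbers within one KTU series.

```python
# Ranges copied from the list above: (series, first text, last text).
# Assumption: both ends of each range are inclusive.
AVAILABLE = [
    (1, 1, 7),
    (1, 14, 22),
    (2, 5, 18),
    (2, 20, 27),
    (2, 30, 32),
    (2, 34, 44),
    (2, 46, 75),
    (2, 77, 80),
    (2, 82, 100),
]


def is_available(siglum: str) -> bool:
    """Return True if a siglum such as 'KTU 1.3' falls in a listed range.

    Hypothetical helper for illustration only.
    """
    label = siglum.strip()
    if label.upper().startswith("KTU"):
        label = label[3:].strip()
    # Split on the dot and compare integers: "1.10" is text 10 of
    # series 1, not the decimal number 1.1.
    series_text, _, number_text = label.partition(".")
    if not (series_text.isdigit() and number_text.isdigit()):
        raise ValueError(f"not a KTU siglum: {siglum!r}")
    series, number = int(series_text), int(number_text)
    return any(
        s == series and first <= number <= last
        for s, first, last in AVAILABLE
    )


print(is_available("KTU 1.3"))   # True
print(is_available("KTU 2.19"))  # False
```

The series and text numbers are compared as integers because KTU sigla are not decimals: 2.100 comes after 2.99, and 1.10 is a different text from 1.1.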
The texts are currently annotated with the following features:
- tablet: tablet title
- column: column number
- line: line number
- side: tablet side of inscription
- g_cons: a consonantal representation of each word in Latin script
- trailer: a representation of word spacing or word dividers
- language: Ugaritic
- sign: letter in Latin script
- emen: emendations of various sorts in relation to a sign (including reconstructed, missing, excised, or redundant signs/letters)
- cert: certainty of the text in relation to a sign (corresponding to the italics of KTU)
- cont: marking of line continuation between lines
- alt: alternative reading
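
As an illustration of how the word-level features fit together, a line of text can be rebuilt by joining each word's g_cons with its trailer. In Text-Fabric these values would be read through the feature API. The sketch below uses plain Python objects instead so that it runs without the dataset. The `Word` class, the `render_line` function, and the sample values are hypothetical and are not taken from the corpus. It also assumes that trailer holds the literal separator that follows the word.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """Toy stand-in for a word node carrying two of the features above."""
    g_cons: str   # consonantal form in Latin script
    trailer: str  # assumed: the literal spacing or divider after the word


def render_line(words):
    """Join each word's consonantal form with the material that follows it."""
    return "".join(w.g_cons + w.trailer for w in words)


# Illustrative values only, not copied from the dataset.
line = [Word("mlk", "."), Word("ugrt", "")]
print(render_line(line))  # mlk.ugrt
```

Storing the separator on the preceding word means the text can be rebuilt without guessing where dividers or spaces belong.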