This is the Abraxa Lexicon, a multilingual database of phonetic transcriptions for finding cognates and homophones across languages.

IkuStudies/abraxalexicon

abraxalexicon

This is the Abraxa Lexicon, a multilingual Milvus vector database of phonetic transcriptions, built to search for cognates and homophones across languages.

Much of the code in this repository consists of language-processing modules that parse source data, format it, and upload it to the database.

work in progress

Languages supported right now, as of version 1: English, Greek, Arabic, Farsi, German, French, Burmese, Dutch, Turkish.

Now that I've figured out vector mapping, support is coming quickly for Japanese, Chinese, Hebrew, Finnish, Swedish, Frisian, Swahili, Kazakh, Tamil, and Middle Egyptian.

Update 5/17/2023: the Abraxa Lexicon IPA (International Phonetic Alphabet) mapping key is finalized.

(link to key) (link to key breakdown)

and (link to how the abraxa lexicon works for deep language searches)

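Todo: make graphics showing the Abraxa workflow for user literacy. An input word is given in English (or any other supported language), Epitran converts it to phonemes, the phonemes are mapped to the key, the key is turned into an embedding, and the embedding goes into Abraxa.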
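Until the graphics exist, the workflow above can be sketched roughly in code. This is only an illustration, not the repository's implementation: `TOY_G2P` is a hypothetical stand-in for Epitran, `TOY_KEY` is a made-up stand-in for the Abraxa IPA mapping key, and the fixed embedding length is an assumption.

```python
# Hypothetical sketch of the word -> phonemes -> key -> embedding flow.
# TOY_G2P stands in for Epitran; TOY_KEY stands in for the Abraxa IPA key.
TOY_G2P = {"cat": ["k", "æ", "t"], "cot": ["k", "ɑ", "t"]}

# Each phoneme maps to a small numeric code (made-up values).
TOY_KEY = {"k": 1, "t": 2, "æ": 10, "ɑ": 11}

MAX_LEN = 8  # assumed fixed embedding length; shorter words are zero-padded


def word_to_phonemes(word):
    """Look up the phoneme list for a word (Epitran would do this step)."""
    return TOY_G2P[word.lower()]


def phonemes_to_key(phonemes):
    """Translate each phoneme into its numeric key code."""
    return [TOY_KEY[p] for p in phonemes]


def key_to_embedding(codes, length=MAX_LEN):
    """Truncate or zero-pad the codes to a fixed-length float vector."""
    padded = codes[:length] + [0] * (length - len(codes))
    return [float(c) for c in padded]


def embed(word):
    """Run the whole toy pipeline for one word."""
    return key_to_embedding(phonemes_to_key(word_to_phonemes(word)))
```

In this toy version, `embed("cat")` and `embed("cot")` differ only in the vowel position, which is the kind of near match a homophone search would be expected to surface.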

Configure search: overall homophone accuracy strength (0-100), consonant strength (0-100), vowel strength (0-100).
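One possible reading of these three settings, sketched under assumptions: the consonant and vowel strengths act as weights on mismatches, and the overall strength is the minimum score a pair needs to count as a match. `SearchConfig`, `score`, `is_match`, and the toy vowel set are hypothetical names, not part of the repository.

```python
from dataclasses import dataclass

VOWELS = set("aeiouæɑɛɪɔʊə")  # toy vowel inventory, not the Abraxa key


@dataclass
class SearchConfig:
    overall: int = 75     # minimum score (0-100) required for a match
    consonant: int = 100  # weight of consonant positions (0-100)
    vowel: int = 50       # weight of vowel positions (0-100)


def score(a, b, cfg):
    """Return a 0-100 similarity for two phoneme lists of equal length."""
    if len(a) != len(b):
        return 0.0
    total = 0.0
    penalty = 0.0
    for x, y in zip(a, b):
        # A position counts as a vowel slot only if both symbols are vowels.
        weight = cfg.vowel if x in VOWELS and y in VOWELS else cfg.consonant
        total += weight
        if x != y:
            penalty += weight
    if total == 0:
        return 100.0  # nothing is weighted, so nothing can mismatch
    return 100.0 * (1 - penalty / total)


def is_match(a, b, cfg):
    """Apply the overall strength as a threshold on the score."""
    return score(a, b, cfg) >= cfg.overall
```

With the defaults, a vowel mismatch costs half as much as a consonant mismatch, and setting `vowel=0` makes the comparison ignore vowels entirely.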

Todo: group consonant embeddings and vowel embeddings, and offer a configuration variable for string comparison strength.

Exception options, for when certain letters are to be grouped in a search: M and W, z and N, A and V, H and I, f and s, etc. These pairings come from old literature.

I will take input from philologists once search is live, to support a wide range of vector search configurations.
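A minimal sketch of how such exception groups might be applied before comparison. The pairs are the ones listed above; treating each pair as a symmetric equivalence that can be switched on per search is an assumption, and the function names are hypothetical.

```python
# Pairs taken from the list above. Treating them as symmetric,
# individually switchable equivalences is an assumption of this sketch.
EXCEPTION_PAIRS = [("M", "W"), ("z", "N"), ("A", "V"), ("H", "I"), ("f", "s")]


def build_canon(pairs, enabled):
    """Map both symbols of every enabled pair to a shared representative."""
    canon = {}
    for a, b in pairs:
        if (a, b) in enabled:
            canon[a] = a
            canon[b] = a
    return canon


def normalize(symbols, canon):
    """Replace each symbol with its group representative, if it has one."""
    return [canon.get(s, s) for s in symbols]


def same_under_exceptions(x, y, canon):
    """Compare two symbol sequences after applying the enabled groups."""
    return normalize(x, canon) == normalize(y, canon)
```

For example, enabling only `("M", "W")` makes `MA` and `WA` compare equal while `fA` and `sA` stay distinct.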

-- I just finished formatting all the JSON files and am uploading them into Milvus vector memory today.
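For readers unfamiliar with this step, here is a rough sketch of loading formatted JSON into Milvus. It is not the repository's upload script: the JSON layout (`word`, `lang`, `vector`), the collection name, and the URI are hypothetical, and it assumes a recent `pymilvus` release that provides `MilvusClient` with its default quick-setup schema (an `id` primary key plus a `vector` field).

```python
import json


def load_rows(json_text, dim):
    """Parse a JSON array of entries, keeping those with a dim-length vector."""
    rows = []
    for i, entry in enumerate(json.loads(json_text)):
        vector = entry.get("vector", [])
        if len(vector) != dim:
            continue  # skip malformed entries instead of failing the batch
        rows.append({
            "id": i,
            "vector": [float(v) for v in vector],
            "word": entry["word"],
            "lang": entry["lang"],
        })
    return rows


def upload(rows, dim, uri="http://localhost:19530", name="abraxa_demo"):
    """Insert rows into a Milvus collection (placeholder URI and name)."""
    # Imported here so the parsing step works without pymilvus installed.
    from pymilvus import MilvusClient

    client = MilvusClient(uri=uri)
    if not client.has_collection(collection_name=name):
        client.create_collection(collection_name=name, dimension=dim)
    return client.insert(collection_name=name, data=rows)
```

Keeping parsing separate from the upload means the JSON can be validated offline before anything is written to the database.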

ikustudies
