Commit

🐛 Deduplicate model names
pajowu committed Jul 7, 2023
1 parent 3acf70b commit e0f323e
Showing 2 changed files with 15 additions and 5 deletions.
10 changes: 5 additions & 5 deletions server/app/models.yml
@@ -15,7 +15,7 @@ German:
size: 1.9G
type: transcription
compressed: true
- - name: big
+ - name: big-2
url: https://alphacephei.com/vosk/models/vosk-model-de-tuda-0.6-900k.zip
description: Latest big wideband model from <a href="https://github.com/uhh-lt/kaldi-tuda-de">Tuda-DE</a>
project
@@ -108,7 +108,7 @@ Russian Other:
size: 1.5G
type: transcription
compressed: true
- - name: big
+ - name: big-2
url: https://alphacephei.com/vosk/models/vosk-model-ru-0.10.zip
description: Big narrowband Russian model for servers
size: 2.5G
@@ -235,7 +235,7 @@ Arabic:
size: 318M
type: transcription
compressed: true
- - name: big
+ - name: big-2
url: https://alphacephei.com/vosk/models/vosk-model-ar-0.22-linto-1.1.0.zip
description: Big model from <a href="https://doc.linto.ai/#/services/linstt">LINTO</a>
project
@@ -249,7 +249,7 @@ Farsi:
size: 47M
type: transcription
compressed: true
- - name: small
+ - name: small-2
url: https://alphacephei.com/vosk/models/vosk-model-small-fa-0.5.zip
description: Bigger small model for desktop application (Persian)
size: 60M
@@ -270,7 +270,7 @@ Ukrainian:
size: 73M
type: transcription
compressed: true
- - name: small
+ - name: small-2
url: https://alphacephei.com/vosk/models/vosk-model-small-uk-v3-small.zip
description: Small model from <a href="https://github.com/egorsmkv/speech-recognition-uk">Speech
Recognition for Ukrainian</a>
10 changes: 10 additions & 0 deletions server/scripts/generate_models_list.py
@@ -104,9 +104,19 @@ def print_table_from_dict_list(dict_list, columns=None):
 print_table_from_dict_list(models, columns=["lang", "name", "url"])
 
 by_language = defaultdict(list)
+names_by_language = defaultdict(set)
 for model in models:
     lang = model["lang"]
     del model["lang"]
+
+    i = 2
+    name = model["name"]
+    while name in names_by_language[lang]:
+        name = model["name"] + "-" + str(i)
+        i += 1
+    model["name"] = name
+    names_by_language[lang].add(name)
+
     by_language[lang] += [model]
 
 with open(Path(__file__).parent.parent / "app" / "models.yml", "w") as outfile:
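
The renaming loop added to generate_models_list.py can be tried in isolation. The following is a minimal sketch, not the script itself: the function name `deduplicate_model_names` and the sample data are hypothetical, and it copies each model dict instead of mutating it in place as the commit does. Within each language, the first model keeps its name and later duplicates get the suffixes -2, -3, and so on.

```python
from collections import defaultdict


def deduplicate_model_names(models):
    """Group models by language, suffixing repeated names with -2, -3, ...

    Hypothetical helper mirroring the loop in the commit; `models` is a
    list of dicts that each carry at least "lang" and "name".
    """
    by_language = defaultdict(list)
    names_by_language = defaultdict(set)
    for model in models:
        model = dict(model)       # copy so the caller's dicts stay untouched
        lang = model.pop("lang")  # same effect as reading then deleting "lang"

        base = model["name"]
        name = base
        i = 2
        # Try base, base-2, base-3, ... until the name is unused for this language.
        while name in names_by_language[lang]:
            name = f"{base}-{i}"
            i += 1
        model["name"] = name
        names_by_language[lang].add(name)

        by_language[lang].append(model)
    return dict(by_language)


# Made-up sample input, not taken from models.yml.
sample = [
    {"lang": "German", "name": "big"},
    {"lang": "German", "name": "big"},
    {"lang": "German", "name": "big"},
    {"lang": "Farsi", "name": "small"},
    {"lang": "Farsi", "name": "small"},
]
result = deduplicate_model_names(sample)
german_names = [m["name"] for m in result["German"]]
farsi_names = [m["name"] for m in result["Farsi"]]
```

Tracking used names in a per-language set keeps the uniqueness check cheap and lets the same name (for example "small") appear once under every language without being renamed.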
