S2T-Speech-To-Text

An end-to-end, full-stack application that transcribes speech to text and runs downstream tasks on the transcript, such as text similarity, text summarization, and named entity recognition.

Things to set up:

Install Docker

Install Elasticsearch

    Link for Tutorial: https://dylancastillo.co/elasticsearch-python/#what%E2%80%99s-elasticsearch

Install FastAPI

    Link for Tutorial: https://fastapi.tiangolo.com/tutorial/first-steps/

Install Whisper AI

    Link for Tutorial: https://medium.com/the-research-nest/how-to-setup-openais-whisper-model-on-windows-10-11-df001d5a350b

    Install ffmpeg

        Download the zip file from https://github.com/BtbN/FFmpeg-Builds/releases

        Extract the archive and add the path to its bin folder to the Path system environment variable
   
   Install whisper-timestamped

        https://github.com/linto-ai/whisper-timestamped

   Install PyAnnote

        https://github.com/pyannote/pyannote-audio
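
Once everything above is installed, a quick sanity check can confirm that the tools and packages are visible to Python. The script below is a minimal sketch and not part of this repository: the function names are hypothetical, and the import names (`elasticsearch`, `fastapi`, `whisper`, `whisper_timestamped`, `pyannote.audio`) are assumed from each package's usual conventions. It only checks that `docker` and `ffmpeg` resolve on the PATH and that the modules can be located; it does not start any service or load any model.

```python
import importlib.util
import shutil

# Command-line tools the setup list expects to be reachable on PATH.
REQUIRED_TOOLS = ["docker", "ffmpeg"]

# Import names are assumptions based on each package's usual convention.
REQUIRED_MODULES = [
    "elasticsearch",
    "fastapi",
    "whisper",
    "whisper_timestamped",
    "pyannote.audio",
]


def tool_available(name: str) -> bool:
    """Return True if an executable called `name` is found on PATH."""
    return shutil.which(name) is not None


def module_available(name: str) -> bool:
    """Return True if `name` can be located without fully importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package (e.g. `pyannote`) is missing.
        return False


def check_setup() -> dict:
    """Map each prerequisite to whether it was found."""
    status = {f"tool:{t}": tool_available(t) for t in REQUIRED_TOOLS}
    status.update({f"module:{m}": module_available(m) for m in REQUIRED_MODULES})
    return status


if __name__ == "__main__":
    for name, ok in check_setup().items():
        print(f"{'OK     ' if ok else 'MISSING'} {name}")
```

If `tool:ffmpeg` is reported as missing on Windows, the bin folder is most likely not on the Path variable yet; open a new terminal after editing the variable so the change takes effect.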