Voice-to-Text is a Python application that uses AI4BHARAT models, accessed through API calls, to convert speech to text. It transcribes spoken words from audio recordings into written text, which makes it useful for transcription services, voice assistants, and similar applications.
- Utilizes AI4BHARAT's pre-trained models for accurate and efficient voice-to-text conversion.
- Easy-to-use Python script for making API calls and obtaining transcriptions.
- Support for various audio formats.
- Customization options for API requests to adapt to different use cases.
- Detailed documentation to help you get started quickly.
To get started with Voice-to-Text, follow these installation steps:
- Clone the repository to your local machine:
    git clone https://github.com/your-username/voice-to-text.git
    cd voice-to-text
- Install the required Python packages using pip:
    pip install -r requirements.txt
To run the Voice-to-Text application and make a POST request via the terminal, follow these steps:
- Open your terminal or command prompt.
- Navigate to the project directory:
    cd path/to/voice-to-text
- Start the Flask application by running the Python script:
    python python_script.py
  The app will start running with the host set to '127.0.0.1' and port set to 5000.
- Open another terminal window or tab.
- Use curl to make a POST request to your Flask API endpoint with the specified parameters, including the audio file:
    curl -X POST -F "service_id=your_service_id" -F "src_lang_code=your_language_code" -F "audio_content=@path/to/your/audio_file.wav" http://localhost:5000/transcribe
  Replace the following placeholders:
  - your_service_id with the appropriate service ID from service_ids.xlsx.
  - your_language_code with the desired language code.
  - path/to/your/audio_file.wav with the path to your audio file.
- Press Enter to make the POST request.
This will send a POST request to your Flask API via the terminal, including the specified parameters and audio file. Your Flask server should process the request and return the transcript.
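The same request can also be sent from Python instead of curl. The following is a minimal sketch, not part of the project itself: it assumes the third-party `requests` package is installed (it may or may not be listed in requirements.txt), and the helper names `build_form_fields` and `transcribe_file` are hypothetical. The field names and the URL are taken from the curl command above. Because the response format is not documented here, the sketch returns the raw response body.

```python
from pathlib import Path

# Endpoint taken from the curl example; adjust it if your server differs.
TRANSCRIBE_URL = "http://localhost:5000/transcribe"


def build_form_fields(service_id, src_lang_code):
    """Return the plain form fields that the curl example sends with -F."""
    return {"service_id": service_id, "src_lang_code": src_lang_code}


def transcribe_file(audio_path, service_id, src_lang_code,
                    url=TRANSCRIBE_URL, timeout=60):
    """Upload an audio file as multipart/form-data and return the raw reply."""
    audio_path = Path(audio_path)
    if not audio_path.is_file():
        raise FileNotFoundError(f"audio file not found: {audio_path}")

    # Assumption: requests is available (pip install requests).
    import requests

    with audio_path.open("rb") as audio_file:
        response = requests.post(
            url,
            data=build_form_fields(service_id, src_lang_code),
            # "audio_content" matches the field name used in the curl command.
            files={"audio_content": (audio_path.name, audio_file)},
            timeout=timeout,
        )
    response.raise_for_status()  # surface HTTP errors instead of hiding them
    return response.text
```

With the Flask server running, calling `transcribe_file("path/to/your/audio_file.wav", "your_service_id", "your_language_code")` should return whatever the server sends back for that request.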
If you encounter any issues, please ensure that your Flask server is running on http://localhost:5000, and that you've followed the steps correctly.
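One quick way to confirm that the server is reachable is to try opening a TCP connection to it. The sketch below uses only the Python standard library; the function name `is_server_listening` is hypothetical, and the default host and port are the ones mentioned above. It only shows that something is accepting connections on that port, not that the transcription endpoint itself works.

```python
import socket


def is_server_listening(host="127.0.0.1", port=5000, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds, else False."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when nothing is accepting connections at that address.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

If `is_server_listening()` returns False, start the application first and then retry the request.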
Before using Voice-to-Text, you need to configure the API credentials. Follow these steps:
- Sign up or log in to your AI4BHARAT account.
- Generate API credentials, such as an API key or token, from your AI4BHARAT dashboard.
- Open the config.json file located in the project directory.
- Replace the placeholder values in config.json with your API credentials:
    {
      "api_key": "your_api_key_here",
      "api_url": "https://api.ai4bharat.org/asr/v0.2/recognize"
    }
  Replace "your_api_key_here" with your actual API key/token.
- Save the config.json file.
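Before starting the application, it can help to check that config.json is valid and that the placeholder has actually been replaced. The following is a small sketch using only the standard library. The key names `api_key` and `api_url` come from the example above; the function name `load_config` and the specific checks are assumptions for illustration, not the project's own loading code.

```python
import json
from pathlib import Path

# Keys shown in the config.json example.
REQUIRED_KEYS = ("api_key", "api_url")
# Placeholder value shipped in the example config.
PLACEHOLDER_KEY = "your_api_key_here"


def load_config(path="config.json"):
    """Load config.json and fail early on missing or placeholder values."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))

    missing = [key for key in REQUIRED_KEYS if not config.get(key)]
    if missing:
        raise ValueError(f"config is missing values for: {', '.join(missing)}")

    if config["api_key"] == PLACEHOLDER_KEY:
        raise ValueError("api_key still contains the placeholder value")

    return config
```

Calling `load_config()` from the project directory returns the configuration as a dictionary, or raises `ValueError` with a message that names the problem.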
- Download the Samaaja app:
    bench get-app https://github.com/fossunited/Samaaja
- Install the app on your site:
    bench --site <your-site-name-here> install-app samaaja
To locate your Samaaja app, follow these steps:
- Navigate to the frappe folder and open frappe-bench.
- Navigate to apps/samaaja.
- Open this folder in a code editor such as VS Code.
To run the Voice-to-Text application in the Samaaja interface and make a POST request via the terminal, follow these steps:
- Inside the "Samaaja" folder, navigate to the "samaaja/api" folder.
- Open the "voice_to_text" folder in your code editor.
- Locate the "frappe_script.py" file inside the "voice_to_text" folder.
- Copy the "frappe_script.py" file.
- Go back to the "api" folder and paste the copied "frappe_script.py" file there.
- Open the "frappe_script.py" file that you've just pasted into the "api" folder.
- Inside the "frappe_script.py" file, locate the following lines:
    API_KEY = "Your_API_KEY_here"
    INFERENCE_URL = "Your_INFERENCE_URL_here"
- Replace "Your_API_KEY_here" with your actual API key.
- Replace "Your_INFERENCE_URL_here" with your actual inference URL.
- Save the changes to the "frappe_script.py" file.
- Open your terminal or command prompt.
- Navigate to the bench directory:
    cd path/to/frappe-bench
- Run frappe-bench:
    bench start
- Open another terminal window or tab.
- Use curl to make a POST request to your Frappe API endpoint with the specified parameters, including the audio file URL:
    curl -X POST -F "service_id=your_service_id" -F "src_lang_code=your_language_code" -F "audio_url=http://example.com/path/to/your/audio_file.wav" http://127.0.0.1:8000/api/method/samaaja.api.frappe_script.transcribe_audio
  Replace the following placeholders:
  - your_service_id with the appropriate service ID from service_ids.xlsx.
  - your_language_code with the desired language code.
  - http://example.com/path/to/your/audio_file.wav with the URL to your audio file.
- Press Enter to make the POST request.
This will send a POST request to your Frappe API via the terminal, including the specified parameters and the audio file URL. Your Frappe server should process the request and return the transcript.
If you encounter any issues, please ensure that your Frappe server is running on http://127.0.0.1:8000, and that you've followed the steps correctly.
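The Frappe endpoint can also be called from Python. The sketch below uses only the standard library and takes the URL and field names from the curl command above; the helper names `build_request` and `transcribe_url` are hypothetical. One difference to be aware of: curl's `-F` sends multipart/form-data, while this sketch sends the same fields URL-encoded. It assumes the server reads form fields the same way in both cases, which has not been verified against this project.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Endpoint taken from the curl example.
FRAPPE_URL = (
    "http://127.0.0.1:8000/api/method/"
    "samaaja.api.frappe_script.transcribe_audio"
)


def build_request(service_id, src_lang_code, audio_url, url=FRAPPE_URL):
    """Build (but do not send) a POST request carrying the three form fields."""
    body = urlencode({
        "service_id": service_id,
        "src_lang_code": src_lang_code,
        "audio_url": audio_url,
    }).encode("utf-8")
    # Assumption: URL-encoded form data is accepted in place of multipart.
    return Request(url, data=body, method="POST")


def transcribe_url(service_id, src_lang_code, audio_url,
                   url=FRAPPE_URL, timeout=60):
    """Send the request and return the raw response body as text."""
    request = build_request(service_id, src_lang_code, audio_url, url=url)
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With `bench start` running, `transcribe_url("your_service_id", "your_language_code", "http://example.com/path/to/your/audio_file.wav")` sends the request and returns the server's reply unmodified.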
We welcome contributions to improve Voice-to-Text. To contribute, follow these steps:
- Fork the repository.
- Create a new branch for your feature or bug fix:
    git checkout -b feature/your-feature
- Make your changes, test thoroughly, and ensure proper documentation.
- Commit your changes with clear and concise messages.
- Push your changes to your fork:
    git push origin feature/your-feature
- Create a pull request to the main repository's master branch, describing your changes and their purpose.
This project is licensed under the GNU General Public License. See the LICENSE file for details.