Indic Translate Server

Overview

This project sets up an Indic translation server using Docker Compose, allowing translation between various languages including English, Kannada, Hindi, and others. It utilizes models from AI4Bharat to perform translations.

Languages Supported

Here is the list of languages supported by the IndicTrans2 models:

| Language (code) | Language (code) | Language (code) |
|---|---|---|
| Assamese (asm_Beng) | Kashmiri (Arabic) (kas_Arab) | Punjabi (pan_Guru) |
| Bengali (ben_Beng) | Kashmiri (Devanagari) (kas_Deva) | Sanskrit (san_Deva) |
| Bodo (brx_Deva) | Maithili (mai_Deva) | Santali (sat_Olck) |
| Dogri (doi_Deva) | Malayalam (mal_Mlym) | Sindhi (Arabic) (snd_Arab) |
| English (eng_Latn) | Marathi (mar_Deva) | Sindhi (Devanagari) (snd_Deva) |
| Konkani (gom_Deva) | Manipuri (Bengali) (mni_Beng) | Tamil (tam_Taml) |
| Gujarati (guj_Gujr) | Manipuri (Meitei) (mni_Mtei) | Telugu (tel_Telu) |
| Hindi (hin_Deva) | Nepali (npi_Deva) | Urdu (urd_Arab) |
| Kannada (kan_Knda) | Odia (ory_Orya) | |
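
When calling the server from a script, it can help to validate language codes before sending a request. The following is a minimal sketch: the codes are copied from the list above, while `SUPPORTED_LANGS` and `check_lang_pair` are hypothetical names for illustration and are not part of this project.

```python
# Language codes copied from the list above (language code + script code).
# SUPPORTED_LANGS and check_lang_pair are illustrative names, not project APIs.
SUPPORTED_LANGS = {
    "asm_Beng": "Assamese",
    "ben_Beng": "Bengali",
    "brx_Deva": "Bodo",
    "doi_Deva": "Dogri",
    "eng_Latn": "English",
    "gom_Deva": "Konkani",
    "guj_Gujr": "Gujarati",
    "hin_Deva": "Hindi",
    "kan_Knda": "Kannada",
    "kas_Arab": "Kashmiri (Arabic)",
    "kas_Deva": "Kashmiri (Devanagari)",
    "mai_Deva": "Maithili",
    "mal_Mlym": "Malayalam",
    "mar_Deva": "Marathi",
    "mni_Beng": "Manipuri (Bengali)",
    "mni_Mtei": "Manipuri (Meitei)",
    "npi_Deva": "Nepali",
    "ory_Orya": "Odia",
    "pan_Guru": "Punjabi",
    "san_Deva": "Sanskrit",
    "sat_Olck": "Santali",
    "snd_Arab": "Sindhi (Arabic)",
    "snd_Deva": "Sindhi (Devanagari)",
    "tam_Taml": "Tamil",
    "tel_Telu": "Telugu",
    "urd_Arab": "Urdu",
}


def check_lang_pair(src_lang: str, tgt_lang: str) -> None:
    """Raise ValueError if either code is unknown or both codes are the same."""
    for code in (src_lang, tgt_lang):
        if code not in SUPPORTED_LANGS:
            raise ValueError(f"unsupported language code: {code!r}")
    if src_lang == tgt_lang:
        raise ValueError("source and target languages must differ")
```

Calling `check_lang_pair("kan_Knda", "eng_Latn")` returns silently, while an unknown code such as `"xx_Latn"` raises `ValueError`.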

Live Server

We host a translation service for Indian languages. The service is available in two modes:

High Latency, Slow System (Available 24/7)

URL: High Latency Translation Service

Low Latency, Fast System (Available on Request)

URL: Low Latency Translation Service

How to Use the Service

With curl

You can test the service using curl commands. Below are examples for both service modes:

High Latency Service

```
curl -X 'POST' \
  'https://gaganyatri-translate-indic-server-cpu.hf.space/translate?src_lang=kan_Knda&tgt_lang=eng_Latn&device_type=cpu' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "sentences": [
    "ನಮಸ್ಕಾರ, ಹೇಗಿದ್ದೀರಾ?", "ಶುಭೋದಯ!"
  ],
  "src_lang": "kan_Knda",
  "tgt_lang": "eng_Latn"
}'
```

Low Latency Service - GPU server on demand

```
curl -X 'POST' \
  'https://gaganyatri-translate-indic-server.hf.space/translate?src_lang=kan_Knda&tgt_lang=eng_Latn&device_type=gpu' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "sentences": [
    "ನಮಸ್ಕಾರ, ಹೇಗಿದ್ದೀರಾ?", "ಶುಭೋದಯ!"
  ],
  "src_lang": "kan_Knda",
  "tgt_lang": "eng_Latn"
}'
```
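
With Python

The curl calls above can also be issued from Python. The sketch below uses only the standard library and mirrors the query parameters, headers, and JSON body of the curl examples. The function names `build_translate_request` and `translate`, and the 120-second timeout, are assumptions made for illustration; they are not part of this project.

```python
import json
import urllib.parse
import urllib.request


def build_translate_request(base_url, sentences, src_lang, tgt_lang, device_type="cpu"):
    """Build the same POST request as the curl examples, without sending it."""
    query = urllib.parse.urlencode(
        {"src_lang": src_lang, "tgt_lang": tgt_lang, "device_type": device_type}
    )
    payload = {"sentences": sentences, "src_lang": src_lang, "tgt_lang": tgt_lang}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/translate?{query}",
        data=json.dumps(payload, ensure_ascii=False).encode("utf-8"),
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


def translate(base_url, sentences, src_lang, tgt_lang, device_type="cpu", timeout=120):
    """Send the request and return the decoded JSON response body."""
    request = build_translate_request(base_url, sentences, src_lang, tgt_lang, device_type)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call against the high latency service (requires network access):
# translate(
#     "https://gaganyatri-translate-indic-server-cpu.hf.space",
#     ["ನಮಸ್ಕಾರ, ಹೇಗಿದ್ದೀರಾ?", "ಶುಭೋದಯ!"],
#     src_lang="kan_Knda",
#     tgt_lang="eng_Latn",
# )
```

Splitting request construction from sending keeps the request easy to inspect or log before any network traffic happens.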

Via Swagger UI

URL: High Latency translation Service
URL: Low Latency translation Service

Prerequisites

Docker and Docker Compose installed on your machine.
Python 3.x installed for the development environment.
Internet access to download translation models.

Running with Docker Compose

Start the server:
```
docker compose -f compose.yaml up -d
```

Setting Up the Development Environment

Create a virtual environment:
```
python -m venv venv
```
Activate the virtual environment:
```
source venv/bin/activate
```
Install dependencies:
```
pip install -r requirements.txt
```

Model Downloads for Translation

Model collection on Hugging Face: IndicTrans2

Below is a table summarizing the available models for different translation tasks:

| Task | Variant | Model Name | VRAM Size | Download Command |
|---|---|---|---|---|
| Indic to English | 200M (distilled) | indictrans2-indic-en-dist-200M | 950 MB | `huggingface-cli download ai4bharat/indictrans2-indic-en-dist-200M` |
| Indic to English | 1B (base) | indictrans2-indic-en-1B | 4.5 GB | `huggingface-cli download ai4bharat/indictrans2-indic-en-1B` |
| English to Indic | 200M (distilled) | indictrans2-en-indic-dist-200M | 950 MB | `huggingface-cli download ai4bharat/indictrans2-en-indic-dist-200M` |
| English to Indic | 1B (base) | indictrans2-en-indic-1B | 4.5 GB | `huggingface-cli download ai4bharat/indictrans2-en-indic-1B` |
| Indic to Indic | 320M (distilled) | indictrans2-indic-indic-dist-320M | 950 MB | `huggingface-cli download ai4bharat/indictrans2-indic-indic-dist-320M` |
| Indic to Indic | 1B (base) | indictrans2-indic-indic-1B | 4.5 GB | `huggingface-cli download ai4bharat/indictrans2-indic-indic-1B` |
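
Which model applies depends on the translation direction and on whether the distilled variant is wanted. As a rough sketch of that choice, the helper below maps a language pair to a repository id from the table. The `MODELS` dictionary and `pick_model` function are hypothetical, and the rule "English on one side selects the English model, otherwise Indic to Indic" is an assumption about how the table is meant to be read, not a description of the server's code.

```python
# Model names copied from the table above, keyed by (direction, distilled?).
# MODELS and pick_model are illustrative names, not part of this project.
MODELS = {
    ("indic-en", True): "indictrans2-indic-en-dist-200M",
    ("indic-en", False): "indictrans2-indic-en-1B",
    ("en-indic", True): "indictrans2-en-indic-dist-200M",
    ("en-indic", False): "indictrans2-en-indic-1B",
    ("indic-indic", True): "indictrans2-indic-indic-dist-320M",
    ("indic-indic", False): "indictrans2-indic-indic-1B",
}


def pick_model(src_lang: str, tgt_lang: str, use_distilled: bool = True) -> str:
    """Return the Hugging Face repo id matching the translation direction."""
    if src_lang == tgt_lang:
        raise ValueError("source and target languages must differ")
    if src_lang == "eng_Latn":
        direction = "en-indic"
    elif tgt_lang == "eng_Latn":
        direction = "indic-en"
    else:
        direction = "indic-indic"
    return f"ai4bharat/{MODELS[(direction, use_distilled)]}"
```

The returned id can be passed to `huggingface-cli download` as in the table, for example `pick_model("kan_Knda", "hin_Deva")` for the distilled Indic to Indic model.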

Running with FastAPI Server

You can run the server using FastAPI:

With GPU

```
python src/translate_api.py --port 7860 --host 0.0.0.0 --device cuda --use_distilled
```

With CPU only

```
python src/translate_api.py --port 7860 --host 0.0.0.0 --device cpu --use_distilled
```
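
To make the flags in these commands concrete, here is a sketch of how a script could parse them with `argparse`. It is not taken from `src/translate_api.py`: the default values, the allowed `--device` choices, and the help strings are assumptions chosen to match the commands shown here.

```python
import argparse


def parse_args(argv=None):
    """Parse the command-line flags used in the run commands (illustrative only)."""
    parser = argparse.ArgumentParser(description="Indic translation API server (sketch)")
    # Defaults below are assumptions, not the project's actual defaults.
    parser.add_argument("--host", default="0.0.0.0", help="interface to bind to")
    parser.add_argument("--port", type=int, default=7860, help="TCP port to listen on")
    parser.add_argument("--device", choices=["cuda", "cpu"], default="cpu",
                        help="run inference on the GPU (cuda) or on the CPU")
    parser.add_argument("--use_distilled", action="store_true",
                        help="load the smaller distilled models instead of the 1B models")
    return parser.parse_args(argv)
```

Under these assumptions, omitting `--use_distilled` would select the 1B base models, which need considerably more memory according to the model table.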

Evaluating Results

You can evaluate the translation results using curl commands. Here are some examples:

English to Kannada

```
curl -X 'POST' \
  'http://localhost:7860/translate?tgt_lang=kan_Knda&src_lang=eng_Latn&device_type=cuda' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "sentences": [
    "Hello, how are you?", "Good morning!"
  ],
  "src_lang": "eng_Latn",
  "tgt_lang": "kan_Knda"
}'
```

Response:

```
{
  "translations": [
    "ಹಲೋ, ಹೇಗಿದ್ದೀರಿ? ",
    "ಶುಭೋದಯ! "
  ]
}
```
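
The translations in the response above carry a trailing space. When comparing results in a script, a small helper can pair each input sentence with its translation and trim that whitespace. This is a sketch: `pair_translations` is a hypothetical name, and it assumes the server returns translations in the same order as the input sentences, as the examples in this section suggest.

```python
import json


def pair_translations(sentences, response_body):
    """Pair each input sentence with its translation, trimming stray whitespace.

    Assumes the "translations" list is in the same order as the inputs.
    """
    translations = json.loads(response_body)["translations"]
    if len(translations) != len(sentences):
        raise ValueError(
            f"expected {len(sentences)} translations, got {len(translations)}"
        )
    return [(src, tgt.strip()) for src, tgt in zip(sentences, translations)]
```

The length check guards against silently misaligned pairs if the server ever drops or merges a sentence.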

Kannada to English

```
curl -X 'POST' \
  'http://localhost:7860/translate?src_lang=kan_Knda&tgt_lang=eng_Latn&device_type=cuda' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "sentences": [
    "ನಮಸ್ಕಾರ, ಹೇಗಿದ್ದೀರಾ?", "ಶುಭೋದಯ!"
  ],
  "src_lang": "kan_Knda",
  "tgt_lang": "eng_Latn"
}'
```

Response:

```
{
  "translations": ["Hello, how are you?", "Good morning!"]
}
```

Kannada to Hindi

```
curl -X 'POST' \
  'http://localhost:7860/translate?src_lang=kan_Knda&tgt_lang=hin_Deva&device_type=cuda' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "sentences": [
    "ನಮಸ್ಕಾರ, ಹೇಗಿದ್ದೀರಾ?", "ಶುಭೋದಯ!"
  ],
  "src_lang": "kan_Knda",
  "tgt_lang": "hin_Deva"
}'
```

Response

Hindi to Kannada

CPU

```
curl -X 'POST' \
  'http://localhost:7860/translate?src_lang=kan_Knda&tgt_lang=eng_Latn&device_type=cpu' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "sentences": [
    "ನಮಸ್ಕಾರ, ಹೇಗಿದ್ದೀರಾ?", "ಶುಭೋದಯ!"
  ],
  "src_lang": "kan_Knda",
  "tgt_lang": "eng_Latn"
}'
```

Response

```
{
  "translations": [
    "Hello, how are you?",
    "Good morning!"
  ]
}
```

Build Docker Image

GPU

```
docker build -t slabstech/indic_translate_server -f Dockerfile .
```

CPU only

```
docker build -t slabstech/indic_translate_server_cpu -f Dockerfile.cpu .
```

References

IndicTrans2 Paper
AI4Bharat IndicTrans2 Model
AI4Bharat IndicTrans2 GitHub Repository
IndicTransToolkit
Extra: `pip install git+https://github.com/VarunGumma/IndicTransToolkit.git`

Contributing

We welcome contributions! Please read the CONTRIBUTING.md file for guidelines on how to contribute to this project.

You can also join the Discord group to collaborate.

License

This project is licensed under the MIT License - see the LICENSE file for details.

FAQ

Q: How do I change the source and target languages?

A: Modify the compose.yaml file to set the SRC_LANG and TGT_LANG variables as needed.

Q: How do I download the translation models?

A: Use the huggingface-cli commands provided in the Model Downloads for Translation section.

Q: How do I run the server locally?

A: Follow the instructions in the Running with FastAPI Server section.

This README provides a comprehensive guide to setting up and running the Indic Translate Server. For more details, refer to the linked resources.

Citation

```
@article{gala2023indictrans,
  title={IndicTrans2: Towards High-Quality and Accessible Machine Translation Models for all 22 Scheduled Indian Languages},
  author={Jay Gala and Pranjal A Chitale and A K Raghavan and Varun Gumma and Sumanth Doddapaneni and Aswanth Kumar M and Janki Atul Nawale and Anupama Sujatha and Ratish Puduppully and Vivek Raghavan and Pratyush Kumar and Mitesh M Khapra and Raj Dabre and Anoop Kunchukuttan},
  journal={Transactions on Machine Learning Research},
  issn={2835-8856},
  year={2023},
  url={https://openreview.net/forum?id=vfT4YuzAYA},
  note={}
}
```

