This is the model pipeline of our self-designed application, "Mlator", which aims to help manga fans overcome the language barrier and to help publishers lower the cost of translation.
Two Main Functions:
I) Drag your page into the DEMO region at the bottom, and the result will appear on the right side
II) After you register and log in, you can translate multiple pages and download all the translations at once
https://github.com/MSDS698/product-analytics-group7
This example image is from <<Q.E.D.iff-proven end-11>> Episode 1 © Motohiro Katou.
First, we train an object detection model to locate the text bubbles on the page. Thanks to Manga109 for providing a large, high-quality annotated dataset. As the following image shows, the detected areas are marked with orange bounding boxes, and the content of each box is processed in the next step.
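A standard post-processing step for a detector like this is non-maximum suppression, which keeps only the best-scoring box among heavily overlapping detections so each bubble is cropped once. This is a minimal sketch of that idea (the function names and the 0.45 threshold are illustrative, not taken from our codebase):

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def non_max_suppression(boxes, scores, thresh=0.45):
    """Return indices of boxes to keep, greedily from highest score down."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep this detection only if it does not overlap a kept one too much.
        if all(iou(boxes[i], boxes[j]) < thresh for j in keep):
            keep.append(i)
    return keep
```

Each surviving index corresponds to one bubble crop that is handed to the OCR stage.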
Next, we use a state-of-the-art OCR engine to parse each image segment identified in step 1 into machine-readable text. A few extra tricks are needed to help the engine handle vertically oriented Japanese text and stylized comic fonts.
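With Tesseract, the two key knobs are the `jpn_vert` language pack (installed in the setup commands below) and page segmentation mode 5, which treats the crop as a single block of vertically aligned text. A hedged sketch, with a cleanup helper, since Japanese has no word spaces and Tesseract tends to insert stray whitespace:

```python
def clean_ocr_text(raw):
    """Japanese uses no word spaces, so drop all whitespace Tesseract emits."""
    return "".join(raw.split())

def ocr_bubble(crop, lang="jpn_vert", psm=5):
    """Run Tesseract on one bubble crop and return the cleaned string.

    Requires the tesseract-ocr binary plus the jpn_vert traineddata;
    --psm 5 assumes a single uniform block of vertical text.
    """
    import pytesseract
    raw = pytesseract.image_to_string(crop, lang=lang, config=f"--psm {psm}")
    return clean_ocr_text(raw)
```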
All of the extracted Japanese text is then translated into English. This is a crucial stage of the pipeline, since a quality translation is what allows readers to enjoy the result.
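The setup below installs google-cloud-translate, so this stage can be driven by the v2 `Client.translate` API. A minimal sketch (the wrapper function is ours; the client is injected so it can be replaced with a stub when no GCP credentials are available):

```python
def translate_bubbles(texts, client=None, target="en"):
    """Translate a list of extracted Japanese strings into English.

    `client` is a google.cloud.translate_v2.Client, which needs GCP
    credentials; passing it in lets tests substitute a stub.
    """
    if client is None:
        from google.cloud import translate_v2
        client = translate_v2.Client()
    results = client.translate(texts, source_language="ja",
                               target_language=target)
    return [r["translatedText"] for r in results]
```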
If we simply used the bounding boxes as the background for the translated text, some boxes would leak beyond the bounds of the bubble, which makes the page uncomfortable to read. It is best to use the bubble itself as the background, which is why we need to remove the original text first.
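The U-Net produces a per-pixel mask of where the original text sits; erasing then amounts to painting those pixels with the bubble's background color. A toy grayscale sketch using plain nested lists (the real pipeline would do the same on a NumPy array from the U-Net output):

```python
def erase_text(image, mask, fill=255):
    """Replace every pixel flagged by the mask with the fill color.

    `image` and `mask` are equally sized 2-D lists; mask values are
    truthy where the segmentation model detected text.
    """
    return [
        [fill if mask[y][x] else image[y][x] for x in range(len(row))]
        for y, row in enumerate(image)
    ]
```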
Finally, the English text is broken into lines of appropriate length and resized to fit comfortably within its corresponding speech bubble. At this point, the comic is translated and ready for reading!
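One way to implement this step is a shrink-to-fit loop: wrap the text at the widest line the bubble allows, and step the font size down until the wrapped lines also fit vertically. A sketch assuming a fixed average glyph width (the `char_aspect` estimate is ours; the real app would measure text with its rendering library):

```python
import textwrap

def fit_text(text, bubble_w, bubble_h, font_px=24, min_font_px=10,
             char_aspect=0.6, line_spacing=1.2):
    """Wrap `text` and shrink the font until it fits the bubble.

    Widths/heights are in pixels; a glyph is assumed to be roughly
    char_aspect * font_px wide. Returns (lines, chosen_font_px).
    """
    lines = [text]
    while font_px >= min_font_px:
        per_line = max(1, int(bubble_w / (char_aspect * font_px)))
        lines = textwrap.wrap(text, width=per_line)
        # Accept the first size whose wrapped block fits vertically.
        if len(lines) * font_px * line_spacing <= bubble_h:
            return lines, font_px
        font_px -= 2
    return lines, min_font_px
```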
AMI: Ubuntu Server 18.04 LTS (HVM), SSD Volume Type
Instance Type: t2.medium or larger
Models: https://drive.google.com/drive/folders/1mEvrweffTBs7-wb2WyNQ8wOjoTKVWxCT?usp=sharing
Command Lines:
- install conda
wget https://repo.continuum.io/archive/Anaconda3-4.2.0-Linux-x86_64.sh
bash Anaconda3-4.2.0-Linux-x86_64.sh
export PATH=/home/ubuntu/anaconda3/bin:$PATH
conda update conda
- install git and git clone the files
sudo apt install git
git config --global credential.helper store
git clone https://github.com/MSDS698/product-analytics-group7.git
- setup environment
cd product-analytics-group7
conda env create -f environment.yml
source activate MSDS603
- setup packages
pip install scikit-image
pip install opencv-python
sudo apt update && sudo apt install -y libsm6 libxext6
pip install pytesseract
pip install --upgrade google-cloud-translate
pip install Keras==2.1.4
pip install tensorflow
- setup tesseract
sudo apt-get update
sudo apt install tesseract-ocr
sudo apt install libtesseract-dev
sudo apt install tesseract-ocr-chi-tra-vert
sudo apt install tesseract-ocr-jpn-vert
- transfer the trained models (SSD, U-Net, and OCR) to their expected paths
cd product-analytics-group7/server/
mkdir checkpoint
scp -i your.pem ssd300_all.h5 [email protected]:product-analytics-group7/server/checkpoint/
scp -i your.pem unet_8.hdf5 [email protected]:product-analytics-group7/server/checkpoint/
# if permission is denied when running the following command, scp to the Desktop and then sudo mv
scp -i your.pem jpn_vbest.traineddata [email protected]:/usr/share/tesseract-ocr/4.00/tessdata
- run the server
cd ~
python product-analytics-group7/server/server.py