Commit 0f0be1c (1 parent: a544389)
Showing 12 changed files with 559 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1,2 @@ | ||
results/ | ||
cache.db |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
.file-cache/ | ||
__pycache__/ | ||
cache.db |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,127 @@ | ||
<!-- | ||
--- | ||
title: Transcribe text from images with Surya | ||
type: guide | ||
tier: all | ||
order: 100 | ||
hide_menu: true | ||
hide_frontmatter_title: true | ||
meta_title: Surya model connection for transcribing text in images | ||
meta_description: The Surya model connection integrates the capabilities of Surya with Label Studio to assist in machine learning labeling tasks involving Optical Character Recognition (OCR). | ||
categories: | ||
- Computer Vision | ||
- Optical Character Recognition | ||
- Surya | ||
image: "/tutorials/surya.png" | ||
--- | ||
--> | ||
|
||
# Surya model connection | ||
|
||
The [Surya](https://github.com/VikParuchuri/surya) model connection integrates Surya with Label Studio to assist with labeling tasks that involve Optical Character Recognition (OCR). | ||
|
||
Its primary function is to recognize and extract text from images, a common step in machine learning workflows. Automating this step reduces the time and effort that manual transcription requires. | ||
|
||
Within Label Studio, the connection generates labels for text in images automatically, which is useful for tasks such as data annotation and document digitization. | ||
|
||
## Before you begin | ||
|
||
Before you begin, you must install the [Label Studio ML backend](https://github.com/HumanSignal/label-studio-ml-backend?tab=readme-ov-file#quickstart). | ||
|
||
This tutorial uses the [`surya` example](https://github.com/xiaoyao9184/docker-surya/tree/master/label). | ||
|
||
## Labeling configuration | ||
|
||
The Surya model connection can be used with the default labeling configuration for OCR in Label Studio. This configuration typically involves defining the types of labels to be used (e.g., text, handwriting, etc.) and the regions of the image where these labels should be applied. | ||
|
||
When setting the labeling configuration, select the **Computer Vision > Optical Character Recognition** template. This template is pre-configured for OCR tasks and includes the necessary elements for labeling text in images: | ||
|
||
```xml | ||
<View> | ||
  <Image name="image" value="$image"/> | ||
|
||
  <Labels name="label" toName="image"> | ||
    <Label value="Text" background="green"/> | ||
    <Label value="Handwriting" background="blue"/> | ||
  </Labels> | ||
|
||
  <Rectangle name="bbox" toName="image" strokeWidth="3"/> | ||
  <Polygon name="poly" toName="image" strokeWidth="3"/> | ||
|
||
  <TextArea name="transcription" toName="image" | ||
            editable="true" | ||
            perRegion="true" | ||
            required="true" | ||
            maxSubmissions="1" | ||
            rows="5" | ||
            placeholder="Recognized Text" | ||
            displayMode="region-list" | ||
  /> | ||
</View> | ||
``` | ||
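To make the role of each tag concrete, the following is a minimal sketch of how one detected text box could be turned into Label Studio results for this configuration. It is not taken from this commit's `model.py`; the helper name `bbox_to_results` and its parameters are hypothetical. It assumes the detector returns pixel coordinates, which are converted to the percentages Label Studio uses for region values, and that the three results share one region `id` so the label and transcription attach to the same rectangle.

```python
import uuid


def bbox_to_results(x, y, w, h, img_w, img_h, text, label="Text"):
    """Convert one pixel-space box and its text into Label Studio results.

    Hypothetical helper: tag names ("image", "bbox", "label",
    "transcription") match the labeling configuration above.
    """
    region_id = uuid.uuid4().hex[:10]
    # Label Studio stores region geometry as percentages of the image size.
    value = {
        "x": 100.0 * x / img_w,
        "y": 100.0 * y / img_h,
        "width": 100.0 * w / img_w,
        "height": 100.0 * h / img_h,
        "rotation": 0,
    }
    common = {
        "id": region_id,
        "to_name": "image",
        "original_width": img_w,
        "original_height": img_h,
        "image_rotation": 0,
    }
    return [
        {**common, "from_name": "bbox", "type": "rectangle", "value": dict(value)},
        {**common, "from_name": "label", "type": "labels",
         "value": {**value, "labels": [label]}},
        {**common, "from_name": "transcription", "type": "textarea",
         "value": {**value, "text": [text]}},
    ]


if __name__ == "__main__":
    for item in bbox_to_results(10, 20, 50, 40, 200, 100, "hello"):
        print(item["from_name"], item["type"], item["value"])
```

A prediction for a task would then wrap a list of such results, typically together with a model version and an optional score.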
|
||
|
||
> Warning! The current implementation of the Surya model connection does not support images uploaded directly to Label Studio. It works only with images hosted at publicly accessible URLs, so make sure every image in your tasks can be reached through a public URL. | ||
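Given that restriction, one way to prepare data is to import tasks that reference images by URL instead of uploading files. The sketch below builds such a task list; the `image` key matches `value="$image"` in the labeling configuration, while the function name `build_tasks` and the example URL are hypothetical.

```python
import json


def build_tasks(image_urls):
    """Build Label Studio import tasks that point at hosted images.

    Hypothetical helper: rejects anything that is not an HTTP(S) URL,
    such as a path to a file uploaded into Label Studio.
    """
    tasks = []
    for url in image_urls:
        if not url.startswith(("http://", "https://")):
            raise ValueError(f"not an HTTP(S) URL: {url}")
        tasks.append({"data": {"image": url}})
    return tasks


if __name__ == "__main__":
    # Placeholder URL for illustration only.
    urls = ["https://example.com/scans/page-001.png"]
    print(json.dumps(build_tasks(urls), indent=2))
```

The printed JSON can be saved to a file and imported into the project as tasks.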
|
||
## Running with Docker (recommended) | ||
|
||
1. Start the Machine Learning backend on `http://localhost:9090` with the prebuilt image: | ||
|
||
```bash | ||
cd docker/up.label@gpu-online | ||
docker-compose up | ||
``` | ||
|
||
2. Validate that the backend is running: | ||
|
||
```bash | ||
$ curl http://localhost:9090/ | ||
{"status":"UP"} | ||
``` | ||
|
||
3. Create a project in Label Studio. Then from the **Model** page in the project settings, [connect the model](https://labelstud.io/guide/ml#Connect-the-model-to-Label-Studio). The default URL is `http://localhost:9090`. | ||
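The health check in step 2 can also be scripted, for example to wait for the container before connecting the model. This is a minimal sketch using only the Python standard library; the function names are hypothetical, and it assumes the backend answers at `http://localhost:9090/` with the `{"status":"UP"}` body shown above.

```python
import json
import urllib.request


def is_up(body):
    """Return True if a response body reports the status "UP"."""
    try:
        payload = json.loads(body)
    except (ValueError, TypeError):
        return False
    return isinstance(payload, dict) and payload.get("status") == "UP"


def check_backend(url="http://localhost:9090/", timeout=5.0):
    """Query the ML backend and report whether it is healthy."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return is_up(resp.read().decode("utf-8"))
    except (OSError, ValueError):
        # Connection refused, timeout, HTTP error, or undecodable body.
        return False


if __name__ == "__main__":
    print("backend up:", check_backend())
```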
|
||
|
||
## Building from source (advanced) | ||
|
||
To build the ML backend from source, you have to clone the repository and build the Docker image: | ||
|
||
```bash | ||
docker build -t xiaoyao9184/surya:master -f ./docker/build@source/dockerfile . | ||
``` | ||
|
||
## Running without Docker (advanced) | ||
|
||
To run the ML backend without Docker, you have to clone the repository and install all dependencies using conda: | ||
|
||
```bash | ||
conda env create -f ./environment.yml | ||
``` | ||
|
||
Then you can start the ML backend: | ||
|
||
```bash | ||
conda activate surya | ||
label-studio-ml start --root-dir . label | ||
``` | ||
|
||
The Surya model connection offers several configuration options that can be set in the `docker-compose.yml` file: | ||
|
||
- `BASIC_AUTH_USER`: Specifies the basic auth user for the model server. | ||
- `BASIC_AUTH_PASS`: Specifies the basic auth password for the model server. | ||
- `LOG_LEVEL`: Sets the log level for the model server. | ||
- `WORKERS`: Specifies the number of workers for the model server. | ||
- `THREADS`: Specifies the number of threads for the model server. | ||
- `MODEL_DIR`: Specifies the model directory. | ||
- `LANG_LIST`: Specifies the list of languages to be used by the OCR model, separated by commas (default: `mn,en`). | ||
- `SCORE_THRESHOLD`: Sets the score threshold to filter out noisy results. | ||
- `LABEL_MAPPINGS_FILE`: Specifies the file with mappings from COCO labels to custom labels. | ||
- `LABEL_STUDIO_ACCESS_TOKEN`: Specifies the Label Studio access token. | ||
- `LABEL_STUDIO_HOST`: Specifies the Label Studio host. | ||
|
||
These options allow you to customize the behavior of the Surya model connection to suit your specific needs. | ||
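As an illustration of how two of these options might be consumed, the sketch below reads `LANG_LIST` and `SCORE_THRESHOLD` from the environment. It is an assumption about usage, not the code in this commit's `model.py`: the function name `read_settings` is hypothetical, the `mn,en` default is the one documented above, and since no default is documented for `SCORE_THRESHOLD` the sketch returns `None` when it is unset.

```python
import os


def read_settings(env=None):
    """Parse OCR-related options from environment variables.

    Hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    # Comma-separated language codes; blank entries are ignored.
    raw_langs = env.get("LANG_LIST", "mn,en")
    lang_list = [code.strip() for code in raw_langs.split(",") if code.strip()]
    # No documented default, so an unset threshold means "do not filter".
    raw_threshold = env.get("SCORE_THRESHOLD")
    score_threshold = float(raw_threshold) if raw_threshold else None
    return {"lang_list": lang_list, "score_threshold": score_threshold}


if __name__ == "__main__":
    print(read_settings())
```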
|
||
## Customization | ||
|
||
The ML backend can be customized by adding your own models and logic inside the `./label` directory. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,122 @@ | ||
import os | ||
import argparse | ||
import json | ||
import logging | ||
import logging.config | ||
|
||
logging.config.dictConfig({ | ||
    "version": 1, | ||
    "disable_existing_loggers": False, | ||
    "formatters": { | ||
        "standard": { | ||
            "format": "[%(asctime)s] [%(levelname)s] [%(name)s::%(funcName)s::%(lineno)d] %(message)s" | ||
        } | ||
    }, | ||
    "handlers": { | ||
        "console": { | ||
            "class": "logging.StreamHandler", | ||
            "level": os.getenv('LOG_LEVEL'), | ||
            "stream": "ext://sys.stdout", | ||
            "formatter": "standard" | ||
        } | ||
    }, | ||
    "root": { | ||
        "level": os.getenv('LOG_LEVEL'), | ||
        "handlers": [ | ||
            "console" | ||
        ], | ||
        "propagate": True | ||
    } | ||
}) | ||
|
||
from label_studio_ml.api import init_app | ||
from model import SuryaOCR | ||
|
||
|
||
_DEFAULT_CONFIG_PATH = os.path.join(os.path.dirname(__file__), 'config.json') | ||
|
||
|
||
def get_kwargs_from_config(config_path=_DEFAULT_CONFIG_PATH): | ||
    if not os.path.exists(config_path): | ||
        return dict() | ||
    with open(config_path) as f: | ||
        config = json.load(f) | ||
    assert isinstance(config, dict) | ||
    return config | ||
|
||
|
||
if __name__ == "__main__": | ||
    parser = argparse.ArgumentParser(description='Label studio') | ||
    parser.add_argument( | ||
        '-p', '--port', dest='port', type=int, default=9090, | ||
        help='Server port') | ||
    parser.add_argument( | ||
        '--host', dest='host', type=str, default='0.0.0.0', | ||
        help='Server host') | ||
    # Split on the first '=' only, so that values may themselves contain '='. | ||
    parser.add_argument( | ||
        '--kwargs', '--with', dest='kwargs', metavar='KEY=VAL', nargs='+', type=lambda kv: kv.split('=', 1), | ||
        help='Additional LabelStudioMLBase model initialization kwargs') | ||
    parser.add_argument( | ||
        '-d', '--debug', dest='debug', action='store_true', | ||
        help='Switch debug mode') | ||
    parser.add_argument( | ||
        '--log-level', dest='log_level', choices=['DEBUG', 'INFO', 'WARNING', 'ERROR'], default=None, | ||
        help='Logging level') | ||
    parser.add_argument( | ||
        '--model-dir', dest='model_dir', default=os.path.dirname(__file__), | ||
        help='Directory where models are stored (relative to the project directory)') | ||
    parser.add_argument( | ||
        '--check', dest='check', action='store_true', | ||
        help='Validate model instance before launching server') | ||
    parser.add_argument('--basic-auth-user', | ||
                        default=os.environ.get('ML_SERVER_BASIC_AUTH_USER', None), | ||
                        help='Basic auth user') | ||
|
||
    parser.add_argument('--basic-auth-pass', | ||
                        default=os.environ.get('ML_SERVER_BASIC_AUTH_PASS', None), | ||
                        help='Basic auth pass') | ||
|
||
    args = parser.parse_args() | ||
|
||
    # setup logging level | ||
    if args.log_level: | ||
        logging.root.setLevel(args.log_level) | ||
|
||
    def isfloat(value): | ||
        try: | ||
            float(value) | ||
            return True | ||
        except ValueError: | ||
            return False | ||
|
||
    def parse_kwargs(): | ||
        # Coerce KEY=VAL strings to int, bool, or float where possible. | ||
        param = dict() | ||
        for k, v in args.kwargs: | ||
            if v.isdigit(): | ||
                param[k] = int(v) | ||
            elif v == 'True' or v == 'true': | ||
                param[k] = True | ||
            elif v == 'False' or v == 'false': | ||
                param[k] = False | ||
            elif isfloat(v): | ||
                param[k] = float(v) | ||
            else: | ||
                param[k] = v | ||
        return param | ||
|
||
    kwargs = get_kwargs_from_config() | ||
|
||
    if args.kwargs: | ||
        kwargs.update(parse_kwargs()) | ||
|
||
    if args.check: | ||
        print('Check "' + SuryaOCR.__name__ + '" instance creation..') | ||
        model = SuryaOCR(**kwargs) | ||
|
||
    app = init_app(model_class=SuryaOCR, basic_auth_user=args.basic_auth_user, basic_auth_pass=args.basic_auth_pass) | ||
|
||
    app.run(host=args.host, port=args.port, debug=args.debug) | ||
|
||
else: | ||
    # for uWSGI use | ||
    app = init_app(model_class=SuryaOCR) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
{} |