Support ml backend for label-studio
xiaoyao9184 committed Jan 4, 2025
1 parent a544389 commit 0f0be1c
Showing 12 changed files with 559 additions and 0 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -1 +1,2 @@
results/
cache.db
3 changes: 3 additions & 0 deletions label/.gitignore
@@ -0,0 +1,3 @@
.file-cache/
__pycache__/
cache.db
127 changes: 127 additions & 0 deletions label/README.md
@@ -0,0 +1,127 @@
<!--
---
title: Transcribe text from images with Surya
type: guide
tier: all
order: 100
hide_menu: true
hide_frontmatter_title: true
meta_title: Surya model connection for transcribing text in images
meta_description: The Surya model connection integrates the capabilities of Surya with Label Studio to assist in machine learning labeling tasks involving Optical Character Recognition (OCR).
categories:
- Computer Vision
- Optical Character Recognition
- Surya
image: "/tutorials/surya.png"
---
-->

# Surya model connection

The [Surya](https://github.com/VikParuchuri/surya) model connection integrates Surya with Label Studio to assist with labeling tasks that involve Optical Character Recognition (OCR).

The connection recognizes and extracts text from images, a common step in machine learning workflows. Automating this step reduces the time and effort that manual text extraction requires.

In Label Studio, the connection generates labels for text in images automatically, which is useful for tasks such as data annotation and document digitization.

## Before you begin

You must first install the [Label Studio ML backend](https://github.com/HumanSignal/label-studio-ml-backend?tab=readme-ov-file#quickstart).

This tutorial uses the [`surya` example](https://github.com/xiaoyao9184/docker-surya/tree/master/label).

## Labeling configuration

The Surya model connection can be used with the default labeling configuration for OCR in Label Studio. This configuration typically involves defining the types of labels to be used (e.g., text, handwriting, etc.) and the regions of the image where these labels should be applied.

When setting the labeling configuration, select the **Computer Vision > Optical Character Recognition** template. It is pre-configured for OCR tasks and includes the elements needed to label text in images:

```xml
<View>
  <Image name="image" value="$image"/>

  <Labels name="label" toName="image">
    <Label value="Text" background="green"/>
    <Label value="Handwriting" background="blue"/>
  </Labels>

  <Rectangle name="bbox" toName="image" strokeWidth="3"/>
  <Polygon name="poly" toName="image" strokeWidth="3"/>

  <TextArea name="transcription" toName="image"
            editable="true"
            perRegion="true"
            required="true"
            maxSubmissions="1"
            rows="5"
            placeholder="Recognized Text"
            displayMode="region-list"
  />
</View>
```


> Warning! The current implementation of the Surya model connection does not support images uploaded directly to Label Studio. It works only with images hosted publicly on the internet, so make sure every image is reachable via a public URL.
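
Because the backend needs publicly reachable images, tasks are typically imported as JSON that references each image by URL. The following is a minimal sketch of building such an import file. The helper name `build_tasks` and the example URL are hypothetical; the `image` key is chosen to match `value="$image"` in the labeling configuration above.

```python
import json


def build_tasks(image_urls):
    """Wrap each public image URL in a Label Studio task dict.

    The "image" key matches value="$image" in the labeling configuration.
    """
    tasks = []
    for url in image_urls:
        # Reject anything that is not an http(s) URL, such as a local upload path.
        if not url.startswith(("http://", "https://")):
            raise ValueError(f"not a public URL: {url}")
        tasks.append({"data": {"image": url}})
    return tasks


if __name__ == "__main__":
    # Hypothetical URL; replace it with your own hosted images.
    urls = ["https://example.com/scans/page-001.png"]
    print(json.dumps(build_tasks(urls), indent=2))
```

The printed JSON can be saved to a file and imported into the project. The URL check only inspects the scheme, so it does not prove that the image is actually reachable from the backend.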

## Running with Docker (recommended)

1. Start the Machine Learning backend on `http://localhost:9090` with the prebuilt image:

```bash
cd docker/up.label@gpu-online
docker-compose up
```

2. Validate that the backend is running:

```bash
$ curl http://localhost:9090/
{"status":"UP"}
```

3. Create a project in Label Studio. Then from the **Model** page in the project settings, [connect the model](https://labelstud.io/guide/ml#Connect-the-model-to-Label-Studio). The default URL is `http://localhost:9090`.
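
The health check from step 2 can also be scripted, for example to wait for the backend before connecting it in Label Studio. This is a sketch using only the Python standard library. The function names `is_up` and `check_backend` are hypothetical, and the sketch assumes the response body is the `{"status":"UP"}` JSON shown above.

```python
import json
import urllib.request


def is_up(payload):
    """Return True if a health response body reports status UP."""
    try:
        return json.loads(payload).get("status") == "UP"
    except (ValueError, AttributeError):
        # Not valid JSON, or valid JSON that is not an object.
        return False


def check_backend(url="http://localhost:9090/", timeout=5.0):
    """Fetch the health endpoint and report whether the backend is up."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return is_up(resp.read())
    except OSError:
        # Covers refused connections, timeouts, and HTTP errors.
        return False


if __name__ == "__main__":
    print("backend up:", check_backend())
```

Keeping the parsing in `is_up` separate from the network call makes the check easy to test without a running server.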


## Building from source (advanced)

To build the ML backend from source, you have to clone the repository and build the Docker image:

```bash
docker build -t xiaoyao9184/surya:master -f ./docker/build@source/dockerfile .
```

## Running without Docker (advanced)

To run the ML backend without Docker, you have to clone the repository and install all dependencies using conda:

```bash
conda env create -f ./environment.yml
```

Then you can start the ML backend:

```bash
conda activate surya
label-studio-ml start --root-dir . label
```

## Configuration

The Surya model connection offers several configuration options that can be set in the `docker-compose.yml` file:

- `BASIC_AUTH_USER`: Specifies the basic auth user for the model server.
- `BASIC_AUTH_PASS`: Specifies the basic auth password for the model server.
- `LOG_LEVEL`: Sets the log level for the model server.
- `WORKERS`: Specifies the number of workers for the model server.
- `THREADS`: Specifies the number of threads for the model server.
- `MODEL_DIR`: Specifies the model directory.
- `LANG_LIST`: Specifies the list of languages to be used by the OCR model, separated by commas (default: `mn,en`).
- `SCORE_THRESHOLD`: Sets the score threshold to filter out noisy results.
- `LABEL_MAPPINGS_FILE`: Specifies the file with mappings from COCO labels to custom labels.
- `LABEL_STUDIO_ACCESS_TOKEN`: Specifies the Label Studio access token.
- `LABEL_STUDIO_HOST`: Specifies the Label Studio host.

These options allow you to customize the behavior of the Surya model connection to suit your specific needs.
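
To illustrate how options such as `LANG_LIST` and `SCORE_THRESHOLD` might be consumed, here is a minimal sketch that parses them from environment variables. It is not taken from the backend's `model.py`. The function names are hypothetical, and treating an unset `SCORE_THRESHOLD` as "no filtering" is an assumption.

```python
import os


def load_settings(env=None):
    """Read a few backend options from an environment mapping."""
    env = os.environ if env is None else env

    # LANG_LIST is comma-separated; the README gives `mn,en` as the default.
    langs = [code.strip() for code in env.get("LANG_LIST", "mn,en").split(",")]
    langs = [code for code in langs if code]

    # Assumption: an unset or empty SCORE_THRESHOLD means "do not filter".
    raw = env.get("SCORE_THRESHOLD", "")
    threshold = float(raw) if raw.strip() else None

    return {"lang_list": langs, "score_threshold": threshold}


def keep_result(score, threshold):
    """Return True if a result with this score passes the threshold."""
    return threshold is None or score >= threshold


if __name__ == "__main__":
    settings = load_settings({"LANG_LIST": "en,de", "SCORE_THRESHOLD": "0.5"})
    print(settings)
    print(keep_result(0.42, settings["score_threshold"]))
```

Passing the environment as a plain dict keeps the parsing testable. In a container the same values would come from the `environment` section of `docker-compose.yml`.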

## Customization

The ML backend can be customized by adding your own models and logic inside the `./label` directory.
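
Custom logic ultimately has to return predictions in the format Label Studio expects for the labeling configuration above. The sketch below converts one recognized text line into result items for the `bbox`, `label`, and `transcription` controls. The helper `line_to_regions` and its pixel-based `bbox` input are hypothetical and do not necessarily match what this backend's `model.py` does. Label Studio expects region coordinates as percentages of the image size.

```python
import uuid


def line_to_regions(bbox, text, image_width, image_height, label="Text"):
    """Convert one recognized line into Label Studio result items.

    bbox is (x_min, y_min, x_max, y_max) in pixels.
    """
    x_min, y_min, x_max, y_max = bbox
    # Label Studio stores region geometry as percentages of the image size.
    value = {
        "x": 100.0 * x_min / image_width,
        "y": 100.0 * y_min / image_height,
        "width": 100.0 * (x_max - x_min) / image_width,
        "height": 100.0 * (y_max - y_min) / image_height,
        "rotation": 0,
    }
    # All items describing the same region share one id.
    common = {
        "id": uuid.uuid4().hex[:10],
        "to_name": "image",
        "original_width": image_width,
        "original_height": image_height,
    }
    return [
        {**common, "from_name": "bbox", "type": "rectangle", "value": dict(value)},
        {**common, "from_name": "label", "type": "labels",
         "value": {**value, "labels": [label]}},
        {**common, "from_name": "transcription", "type": "textarea",
         "value": {**value, "text": [text]}},
    ]


if __name__ == "__main__":
    for item in line_to_regions((10, 20, 110, 70), "hello", 200, 100):
        print(item["from_name"], item["value"])
```

The `from_name` values mirror the `name` attributes of the `Rectangle`, `Labels`, and `TextArea` tags, so a different labeling configuration would need different names.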
122 changes: 122 additions & 0 deletions label/_wsgi.py
@@ -0,0 +1,122 @@
import os
import argparse
import json
import logging
import logging.config

logging.config.dictConfig({
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "standard": {
            "format": "[%(asctime)s] [%(levelname)s] [%(name)s::%(funcName)s::%(lineno)d] %(message)s"
        }
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "level": os.getenv('LOG_LEVEL'),
            "stream": "ext://sys.stdout",
            "formatter": "standard"
        }
    },
    "root": {
        "level": os.getenv('LOG_LEVEL'),
        "handlers": [
            "console"
        ],
        "propagate": True
    }
})

from label_studio_ml.api import init_app
from model import SuryaOCR


_DEFAULT_CONFIG_PATH = os.path.join(os.path.dirname(__file__), 'config.json')


def get_kwargs_from_config(config_path=_DEFAULT_CONFIG_PATH):
    if not os.path.exists(config_path):
        return dict()
    with open(config_path) as f:
        config = json.load(f)
    assert isinstance(config, dict)
    return config


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Label studio')
    parser.add_argument(
        '-p', '--port', dest='port', type=int, default=9090,
        help='Server port')
    parser.add_argument(
        '--host', dest='host', type=str, default='0.0.0.0',
        help='Server host')
    parser.add_argument(
        '--kwargs', '--with', dest='kwargs', metavar='KEY=VAL', nargs='+', type=lambda kv: kv.split('='),
        help='Additional LabelStudioMLBase model initialization kwargs')
    parser.add_argument(
        '-d', '--debug', dest='debug', action='store_true',
        help='Switch debug mode')
    parser.add_argument(
        '--log-level', dest='log_level', choices=['DEBUG', 'INFO', 'WARNING', 'ERROR'], default=None,
        help='Logging level')
    parser.add_argument(
        '--model-dir', dest='model_dir', default=os.path.dirname(__file__),
        help='Directory where models are stored (relative to the project directory)')
    parser.add_argument(
        '--check', dest='check', action='store_true',
        help='Validate model instance before launching server')
    parser.add_argument('--basic-auth-user',
                        default=os.environ.get('ML_SERVER_BASIC_AUTH_USER', None),
                        help='Basic auth user')

    parser.add_argument('--basic-auth-pass',
                        default=os.environ.get('ML_SERVER_BASIC_AUTH_PASS', None),
                        help='Basic auth pass')

    args = parser.parse_args()

    # setup logging level
    if args.log_level:
        logging.root.setLevel(args.log_level)

    def isfloat(value):
        try:
            float(value)
            return True
        except ValueError:
            return False

    def parse_kwargs():
        param = dict()
        for k, v in args.kwargs:
            if v.isdigit():
                param[k] = int(v)
            elif v == 'True' or v == 'true':
                param[k] = True
            elif v == 'False' or v == 'false':
                param[k] = False
            elif isfloat(v):
                param[k] = float(v)
            else:
                param[k] = v
        return param

    kwargs = get_kwargs_from_config()

    if args.kwargs:
        kwargs.update(parse_kwargs())

    if args.check:
        print('Check "' + SuryaOCR.__name__ + '" instance creation..')
        model = SuryaOCR(**kwargs)

    app = init_app(model_class=SuryaOCR, basic_auth_user=args.basic_auth_user, basic_auth_pass=args.basic_auth_pass)

    app.run(host=args.host, port=args.port, debug=args.debug)

else:
    # for uWSGI use
    app = init_app(model_class=SuryaOCR)
1 change: 1 addition & 0 deletions label/label_mappings.json
@@ -0,0 +1 @@
{}