
Commit f19a92d

Authored Dec 3, 2024
Merge pull request #5 from groundlight/dev
Added upload and evaluation scripts with simple instructions
2 parents 31e58af + b5e685a · commit f19a92d

8 files changed (+1745 −2 lines)
 

‎.gitignore

+166
@@ -0,0 +1,166 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock

# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/latest/usage/project/#working-with-version-control
.pdm.toml
.pdm-python
.pdm-build/

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# misc
.DS_Store
*.pem

# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/

‎README.md

+91-2
@@ -1,2 +1,91 @@
-# model-evaluation-tool
-A simple tool for evaluating the performance of your Groundlight ML model
# Model Evaluation Tool
A simple tool for evaluating the performance of your Groundlight binary ML model.

This script provides a simple way for users to run an independent evaluation of the ML model's performance. Note that this is not the recommended way of using our service, since it evaluates only the ML model and not the combined performance of our ML + escalation system. However, the balanced accuracy reported by `evaluate.py` should fall within the bounds of the Projected ML Accuracy shown on our website, provided the train and evaluation datasets you supply are well randomized.

## Installation

The dependencies for this script can be installed either with poetry (recommended) or from `requirements.txt`.

Using poetry:

```bash
poetry install
```

Using `requirements.txt`:

```bash
pip install -r requirements.txt
```

## Usage

### Setting Up Your Account

To train an ML model, first create a binary detector on the [Online Dashboard](https://dashboard.groundlight.ai/).

You will also need to create an API token to start uploading images to the account. You can go [here](https://dashboard.groundlight.ai/reef/my-account/api-tokens) to create one.

After you have created your API token, export it as an environment variable in your terminal:

```bash
export GROUNDLIGHT_API_TOKEN="YOUR_API_TOKEN"
```

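With the token exported, the Groundlight client used by `train.py` and `evaluate.py` picks it up from the environment automatically. A minimal sanity-check sketch (the `whoami()` call is assumed from the SDK and is optional):

```python
from groundlight import Groundlight

# Groundlight() reads GROUNDLIGHT_API_TOKEN from the environment,
# so the token does not need to be passed explicitly.
gl = Groundlight()

# Assumed SDK helper: prints the account associated with the token.
print(gl.whoami())
```
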
### Formatting Dataset

This script assumes your custom image dataset is structured in the following format:

```bash
└── dataset
    ├── dataset.csv
    └── images
        ├── 1.jpg
        ├── 10.jpg
        ├── 11.jpg
        ├── 12.jpg
        ├── 13.jpg
        ├── 14.jpg
```

The `dataset.csv` file should have two columns: image_name and label (YES/NO), for example:

```bash
1.jpg,YES
11.jpg,NO
12.jpg,YES
13.jpg,YES
14.jpg,NO
```

The corresponding image files should be placed inside the `images` folder.

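Before training or evaluating, it can help to confirm that `dataset.csv` and the `images` folder agree. The sketch below is not part of this commit; `dataset` is an example path, and the CSV is read the same way `train.py` and `evaluate.py` read it:

```python
import os

import pandas as pd

dataset_dir = "dataset"  # example path; point this at your dataset folder

# Read the CSV the same way train.py / evaluate.py do.
df = pd.read_csv(os.path.join(dataset_dir, "dataset.csv"))
images = set(os.listdir(os.path.join(dataset_dir, "images")))

for image_name, label in df.values:
    if image_name not in images:
        print(f"Missing image file: {image_name}")
    if label not in ("YES", "NO"):
        print(f"Invalid label for {image_name}: {label}")
```
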
### Training the Detector

To train the ML model for a detector, run the script `train.py` with the following arguments:

```bash
poetry run python train.py --detector-name NAME_OF_THE_DETECTOR --detector-query QUERY_OF_THE_DETECTOR --dataset PATH_TO_DATASET_TRAIN_FOLDER
```

Optionally, set the `--delay` argument to avoid exceeding your account's throttling limit.

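For example, a hypothetical invocation (the detector name, query, and dataset path below are placeholders, not values from this repository) might look like:

```bash
poetry run python train.py \
    --detector-name "dog-detector" \
    --detector-query "Is there a dog in the image?" \
    --dataset ./dataset_train \
    --delay 0.5
```
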
### Evaluating the Detector

To evaluate the ML model performance for a detector, run the script `evaluate.py` with the following arguments:

```bash
poetry run python evaluate.py --detector-id YOUR_DETECTOR_ID --dataset PATH_TO_DATASET_TEST_FOLDER
```

Optionally, set the `--delay` argument to avoid exceeding your account's throttling limit.

The evaluation script will output the following information:

```
Number of Images Processed
Average Confidence
Balanced Accuracy
Precision
Recall
```
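For reference, the last three metrics are derived from the confusion counts (true/false positives and negatives) that `evaluate.py` accumulates. A minimal sketch of the formulas it uses, assuming both YES and NO examples are present so no denominator is zero:

```python
def summarize(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Metrics as computed by evaluate.py from confusion counts."""
    return {
        "balanced_accuracy": (tp / (tp + fn) + tn / (tn + fp)) / 2,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }
```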

‎evaluate.py

+106
@@ -0,0 +1,106 @@
#!/usr/bin/env python3
"""
A script to evaluate the accuracy of a detector on a given dataset.
It will upload the images to the detector and compare the predicted labels with the ground truth labels.
You can specify the delay between uploads.
"""

import argparse
import os
import PIL
import time
import PIL.Image
import pandas as pd
import logging

from groundlight import Groundlight, Detector, BinaryClassificationResult
from tqdm.auto import tqdm

logger = logging.getLogger(__name__)
logging.basicConfig(level=logging.INFO)


def upload_image(gl: Groundlight, detector: Detector, image: PIL.Image.Image) -> BinaryClassificationResult:
    """
    Upload an image to a detector and return the ML prediction.

    Args:
        gl: The Groundlight object.
        detector: The detector to upload to.
        image: The image to upload.
    Returns:
        The binary classification result, which includes the predicted label (YES/NO) and the confidence.
    """

    # If the image is not already a JPEG, convert it to RGB so it can be encoded as JPEG
    if image.format != "JPEG":
        image = image.convert("RGB")

    # Use ask_ml to upload the image and then return the result
    iq = gl.ask_ml(detector=detector, image=image)
    return iq.result


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Evaluate the accuracy of a detector on a given dataset.")
    parser.add_argument("--detector-id", type=str, required=True, help="The ID of the detector to evaluate.")
    parser.add_argument("--dataset", type=str, required=True, help="The folder containing the dataset.csv and images folder")
    parser.add_argument("--delay", type=float, required=False, default=0.1, help="The delay between uploads.")
    args = parser.parse_args()

    gl = Groundlight()
    detector = gl.get_detector(args.detector_id)

    # Load the dataset from the CSV file and images from the images folder
    # The CSV file should have two columns: image_name and label (YES/NO)

    dataset = pd.read_csv(os.path.join(args.dataset, "dataset.csv"))
    images = os.listdir(os.path.join(args.dataset, "images"))

    logger.info(f"Evaluating {len(dataset)} images on detector {detector.name} with delay {args.delay}.")

    # Record the confusion-matrix counts (TP, TN, FP, FN)
    # for calculating balanced accuracy, precision, and recall
    true_positives = 0
    true_negatives = 0
    false_positives = 0
    false_negatives = 0
    total_processed = 0
    average_confidence = 0

    for image_name, label in tqdm(dataset.values):
        if image_name not in images:
            logger.warning(f"Image {image_name} not found in images folder.")
            continue

        if label not in ["YES", "NO"]:
            logger.warning(f"Invalid label {label} for image {image_name}. Skipping.")
            continue

        image = PIL.Image.open(os.path.join(args.dataset, "images", image_name))
        result = upload_image(gl=gl, detector=detector, image=image)

        if result.label == "YES" and label == "YES":
            true_positives += 1
        elif result.label == "NO" and label == "NO":
            true_negatives += 1
        elif result.label == "YES" and label == "NO":
            false_positives += 1
        elif result.label == "NO" and label == "YES":
            false_negatives += 1

        average_confidence += result.confidence
        total_processed += 1

        time.sleep(args.delay)

    # Calculate the balanced accuracy, precision, and recall
    balanced_accuracy = (true_positives / (true_positives + false_negatives) + true_negatives / (true_negatives + false_positives)) / 2
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)

    logger.info(f"Processed {total_processed} images.")
    logger.info(f"Average Confidence: {average_confidence / total_processed:.2f}")
    logger.info(f"Balanced Accuracy: {balanced_accuracy:.2f}")
    logger.info(f"Precision: {precision:.2f}")
    logger.info(f"Recall: {recall:.2f}")

‎poetry.lock

+834
Some generated files are not rendered by default.

‎poetry.toml

+2
@@ -0,0 +1,2 @@
[virtualenvs]
in-project = true

‎pyproject.toml

+18
@@ -0,0 +1,18 @@
[tool.poetry]
name = "model-evaluation-tool"
version = "0.1.0"
description = "Simple script for sending labeled data to Groundlight and running evaluation tests"
authors = ["Harry Tung <harry@groundlight.ai>"]
license = "MIT"
readme = "README.md"

[tool.poetry.dependencies]
python = "^3.11"
groundlight = "^0.19.0"
pandas = "^2.2.3"
tqdm = "^4.67.1"


[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"

‎requirements.txt

+455
Large diffs are not rendered by default.

‎train.py

+73
@@ -0,0 +1,73 @@
#!/usr/bin/env python3
"""
A script to upload frames with labels to a detector in a controlled manner.
You can specify the delay between uploads.
"""

import argparse
import os
import PIL
import time
import PIL.Image
import pandas as pd
import logging

from groundlight import Groundlight, Detector
from tqdm.auto import tqdm

logger = logging.getLogger(__name__)
logging.basicConfig(level=logging.INFO)


def upload_image(gl: Groundlight, detector: Detector, image: PIL.Image.Image, label: str) -> None:
    """
    Upload an image with a label to a detector.

    Args:
        gl: The Groundlight object.
        detector: The detector to upload to.
        image: The image to upload.
        label: The label to upload.
    """

    # If the image is not already a JPEG, convert it to RGB so it can be encoded as JPEG
    if image.format != "JPEG":
        image = image.convert("RGB")

    if label not in ["YES", "NO"]:
        raise ValueError(f"Invalid label: {label}, must be 'YES' or 'NO'.")

    # Use ask_async to upload the image and then add the label to the image query
    iq = gl.ask_async(detector=detector, image=image)
    gl.add_label(image_query=iq, label=label)


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Upload images with labels to a detector.")
    parser.add_argument("--detector-name", type=str, required=True, help="The name of the detector.")
    parser.add_argument("--detector-query", type=str, required=True, help="The query of the detector.")
    parser.add_argument("--dataset", type=str, required=True, help="The folder containing the dataset.csv and images folder")
    parser.add_argument("--delay", type=float, required=False, default=0.1, help="The delay between uploads.")
    args = parser.parse_args()

    gl = Groundlight()
    detector = gl.get_or_create_detector(name=args.detector_name, query=args.detector_query)

    # Load the dataset from the CSV file and images from the images folder
    # The CSV file should have two columns: image_name and label (YES/NO)

    dataset = pd.read_csv(os.path.join(args.dataset, "dataset.csv"))
    images = os.listdir(os.path.join(args.dataset, "images"))

    logger.info(f"Uploading {len(dataset)} images to detector {detector.name} with delay {args.delay}.")

    for image_name, label in tqdm(dataset.values):
        if image_name not in images:
            logger.warning(f"Image {image_name} not found in images folder.")
            continue

        image = PIL.Image.open(os.path.join(args.dataset, "images", image_name))
        upload_image(gl=gl, detector=detector, image=image, label=label)
        time.sleep(args.delay)

    logger.info("Upload complete. Please wait around 10 minutes for the detector to retrain.")
