YOLOv8 classification #59

Eldies · 2024-09-19T09:42:11Z

Summary

Adds support for the YOLOv8 classification format.

How to test

Checklist

I submit my changes into the develop branch
I have added description of my changes into CHANGELOG
I have updated the documentation accordingly
I have added tests to cover my changes
I have linked related issues

License

I submit my code changes under the same MIT License that covers the project.
Feel free to contact the maintainers if that's a concern.
I have updated the license header for each file (see an example below)

# Copyright (C) 2022 CVAT.ai Corporation
#
# SPDX-License-Identifier: MIT

Summary by CodeRabbit

New Features
- Introduced support for YOLOv8 Classification format, enhancing dataset management capabilities.
- Added comprehensive documentation for importing and exporting YOLOv8 Classification datasets.
- New classes for handling YOLOv8 Classification data formats, including converters, importers, and extractors.
Bug Fixes
- Improved error handling and item retrieval in dataset extraction processes.
Documentation
- Updated changelog and user manual to include details on YOLOv8 Classification format and its usage.

coderabbitai · 2024-09-19T09:42:18Z

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

The pull request introduces support for the YOLOv8 Classification format across multiple components of the Datumaro framework. Key changes include the addition of new classes for importing, exporting, and extracting datasets in the YOLOv8 Classification format. The changelog has been updated to reflect these enhancements, and comprehensive documentation has been created to guide users in utilizing the new format. Additionally, unit tests have been implemented to ensure functionality and robustness of the new features.

Changes

| Files | Change Summary |
|---|---|
| `CHANGELOG.md` | Updated to include support for the YOLOv8 Classification format and enhancements to the `env.detect_dataset()` function. |
| `datumaro/plugins/yolo_format/converter.py` | Introduced `YOLOv8ClassificationConverter` class for exporting datasets in the YOLOv8 classification format, including methods for media type validation and exporting media files. |
| `datumaro/plugins/yolo_format/extractor.py` | Added `YoloBaseExtractor` class for improved structure and code reuse, with `YoloExtractor` now subclassing it. Introduced `YOLOv8ClassificationExtractor` for handling classification tasks. |
| `datumaro/plugins/yolo_format/format.py` | Added `YOLOv8ClassificationFormat` class with a constant for handling unlabeled images. |
| `datumaro/plugins/yolo_format/importer.py` | Introduced `YOLOv8ClassificationImporter` class with a `find_sources` method to recognize and handle classification data. |
| `site/content/en/docs/formats/yolo_v8_classification.md` | Created documentation for the YOLOv8 Classification dataset format, detailing import/export processes and expected dataset structure. |
| `site/content/en/docs/user-manual/supported_formats.md` | Added a new section for "Classification" formats, including links to format specifications, dataset examples, and documentation. |
| `tests/unit/data_formats/test_yolo_format.py` | Added tests for the new converter, importer, and extractor classes related to YOLOv8 classification, covering various scenarios to ensure functionality and integrity during conversion and import processes. |

Poem

In the meadow where data plays,
New formats bloom in sunny rays.
YOLOv8 now takes its flight,
Classification shines so bright!
With tests and docs, we hop along,
In Datumaro, we all belong! 🐇✨

coderabbitai

Actionable comments posted: 5

Outside diff range and nitpick comments (8)

site/content/en/docs/formats/yolo_v8_classification.md (5)
7-14: Clearly specifies the supported annotation type and attributes.

The section provides a helpful link to the format specification and clearly states that the format supports Label annotations without any attributes.

Please fix the grammatical number in this sentence:
-Format doesn't support any attributes for annotations objects.
+Format doesn't support any attributes for annotation objects.
Tools

LanguageTool

[uncategorized] ~14-~14: The grammatical number of this noun doesn’t look right. Consider replacing it.
Context: ...rmat doesn't support any attributes for annotations objects. ## Import YOLOv8 classificat...

(AI_EN_LECTOR_REPLACEMENT_NOUN_NUMBER)

17-52: Provides clear instructions for importing datasets.

The section offers helpful guidance on importing YOLOv8 Classification datasets using both the command line and Python API, along with the expected directory structure.

Please make the following changes:
-A Datumaro project with a ImageNet dataset can be created
+A Datumaro project with an ImageNet dataset can be created
-For successful importing of YOLOv8 Classification dataset the input directory with dataset
+For successful importing of the YOLOv8 Classification dataset, the input directory with the dataset
-should has the following structure:
+should have the following structure:
Tools

LanguageTool

[uncategorized] ~19-~19: Use the indefinite article “an” before nouns that start with a vowel sound.
Context: ...cation dataset A Datumaro project with a ImageNet dataset can be created in the ...

(AI_EN_LECTOR_REPLACEMENT_DETERMINER_A_AN)

[uncategorized] ~35-~35: You might be missing the article “the” here.
Context: ...tion') ``` For successful importing of YOLOv8 Classification dataset the input direct...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)

[uncategorized] ~35-~35: A comma might be missing here.
Context: ...sful importing of YOLOv8 Classification dataset the input directory with dataset should...

(AI_EN_LECTOR_MISSING_PUNCTUATION_COMMA)

[uncategorized] ~36-~36: This verb does not appear to agree with the subject. Consider using a different form.
Context: ...the input directory with dataset should has the following structure: ```bash datas...

(AI_EN_LECTOR_REPLACEMENT_VERB_AGREEMENT)

54-83: Provides clear instructions for exporting datasets.

The section offers helpful guidance on exporting YOLOv8 Classification datasets to other formats supported by Datumaro, using both the command line and Python API. The note about extra export options for some formats, along with the link to format-specific documentation, is useful.

Please add a comma in this sentence:
-For particular format see the
+For a particular format, see the
Tools

LanguageTool

[uncategorized] ~72-~72: You might be missing the article “the” here.
Context: ...our YOLOv8 Classification dataset using Python API ```python import datumaro as dm d...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)

[uncategorized] ~82-~82: Possible missing comma found.
Context: ...ve extra export options. For particular format see the > docs to get...

(AI_HYDRA_LEO_MISSING_COMMA)

85-99: Provides clear instructions for converting datasets to YOLOv8 Classification format.

The section offers helpful guidance on converting datasets containing Label annotations to the YOLOv8 Classification format using Datumaro, with examples for both the command line and a Datumaro project.

Please add a comma before "and" in this sentence:
-If your dataset contains `Label` for images and you want to convert this
+If your dataset contains `Label` for images, and you want to convert this
Tools

LanguageTool

[uncategorized] ~87-~87: Use a comma before ‘and’ if it connects two independent clauses (unless they are closely connected and short).
Context: ...your dataset contains Label for images and you want to convert this dataset into t...

(COMMA_COMPOUND_SENTENCE)

101-105: Clearly lists and explains extra export options.

The section provides a clear list of extra options for exporting to YOLOv8 Classification formats, along with brief explanations of each option.

Please rephrase these sentences to fix the grammatical issues:
-- `--save-media` allow to export dataset with saving media files
+- `--save-media` allows exporting the dataset with media files
  (by default `False`)
-- `--save-dataset-meta` - allow to export dataset with saving dataset meta
+- `--save-dataset-meta` allows exporting the dataset with dataset metadata
  file (by default `False`)
Tools

LanguageTool

[grammar] ~102-~102: Did you mean “exporting”? Or maybe you should add a pronoun? In active voice, ‘allow’ + ‘to’ takes an object, usually a pronoun.
Context: ...ication formats: - --save-media allow to export dataset with saving media files (by d...

(ALLOW_TO)

[grammar] ~104-~104: Did you mean “exporting”? Or maybe you should add a pronoun? In active voice, ‘allow’ + ‘to’ takes an object, usually a pronoun.
Context: ...False) - --save-dataset-meta` - allow to export dataset with saving dataset meta file...

(ALLOW_TO)

datumaro/plugins/yolo_format/converter.py (1)

409-443: The apply method looks good overall!

The method follows a clear structure:

Validates the media type.

Creates necessary directories.

Saves dataset metadata if required.

Iterates through subsets and items.

Calls _export_media_for_label based on item annotations.

A few minor suggestions:

Consider adding a comment to explain the purpose of the DEFAULT_SUBSET_NAME check at line 425.

The assert statement at line 429 could be replaced with a more informative error message if the condition is not met.
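The second suggestion can be illustrated with a minimal sketch. The condition checked by the real assert is not shown in this thread, so the check below (an item must have a media path) is only a placeholder, and `MediaPathError` and `checked_media_path` are hypothetical names that do not exist in Datumaro.

```python
from typing import Optional


class MediaPathError(ValueError):
    """Hypothetical error type used only in this sketch."""


def checked_media_path(item_id: str, path: Optional[str]) -> str:
    """Return the media path, or raise an error that names the offending item.

    A bare `assert path` is stripped under `python -O` and reports no context;
    an explicit exception stays active and can say which item failed.
    """
    if not path:
        raise MediaPathError(f"Item '{item_id}' has no media path, so it cannot be exported")
    return path
```

Unlike an assert, the explicit exception can also be caught and handled by callers that want to skip bad items instead of aborting the export.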

tests/unit/data_formats/test_yolo_format.py (2)

1001-1015: Add docstrings to empty overridden test methods

The methods test_export_rotated_bbox, test_cant_save_with_reserved_subset_name, test_inplace_save_writes_only_updated_data, test_can_load_dataset_with_exact_image_info, and test_can_save_and_load_without_path_prefix are overridden with empty bodies. Providing docstrings explaining why these methods are intentionally left empty will improve code readability and help other developers understand their purpose.
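
A minimal sketch of what such a docstring could look like. The class names and the `annotation_kinds` attribute are hypothetical stand-ins, not the real test classes, and the stated reason (label-only annotations) is an assumption based on the format documentation discussed above.

```python
import unittest


class BaseFormatTest(unittest.TestCase):
    # Hypothetical stand-in for a shared detection-oriented base test class.
    annotation_kinds = ("bbox",)

    def test_export_rotated_bbox(self):
        self.assertIn("bbox", self.annotation_kinds)


class ClassificationFormatTest(BaseFormatTest):
    annotation_kinds = ("label",)

    def test_export_rotated_bbox(self):
        """Not applicable: the classification format stores labels only, no boxes."""
```

A body consisting of just a docstring is valid Python, so no `pass` is needed, and the override still shows up as a passing test in the report.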

978-1000: Consistent handling of overridden methods with different arguments

The method test_can_save_and_load_image_with_arbitrary_extension overrides a base class method with a different signature but does not disable the pylint warning for arguments-differ. For consistency and to avoid linting issues, consider adding # pylint: disable=arguments-differ or aligning the method signature with the base class.
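
As a rough illustration of the first option, the sketch below overrides a method with an extra required parameter and silences the resulting pylint warning inline. `BaseCase`, `ClassificationCase`, and `check_extension` are hypothetical names, not code from the PR.

```python
class BaseCase:
    def check_extension(self, extension: str) -> str:
        return f"image{extension}"


class ClassificationCase(BaseCase):
    # The extra required parameter changes the signature, which pylint reports
    # as arguments-differ (W0221) unless it is disabled for this line.
    def check_extension(self, extension: str, label: str) -> str:  # pylint: disable=arguments-differ
        return f"{label}/{super().check_extension(extension)}"
```

Aligning the signature with the base class avoids the need for the comment, at the cost of passing the extra value some other way.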

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 393cb66 and 5a38e5e.

Files ignored due to path filters (3)

tests/assets/yolo_dataset/yolov8_classification/train/label_0/2.jpg is excluded by !**/*.jpg
tests/assets/yolo_dataset/yolov8_classification/train/label_0/subfolder/1.jpg is excluded by !**/*.jpg
tests/assets/yolo_dataset/yolov8_classification/train/label_1/3.jpg is excluded by !**/*.jpg
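
The asset paths above suggest a `<subset>/<label>/<image>` layout, with images optionally nested in subfolders of a label directory. The following is a minimal standard-library sketch of walking such a tree. It is not Datumaro's extractor: `iter_classification_items` is a hypothetical helper, and it assumes `.jpg` images only.

```python
from pathlib import Path
from typing import Iterator, Tuple


def iter_classification_items(root: Path) -> Iterator[Tuple[str, str, str]]:
    """Yield (subset, label, image path relative to the label directory)."""
    for subset_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for label_dir in sorted(p for p in subset_dir.iterdir() if p.is_dir()):
            # rglob also finds images nested in subfolders of a label directory.
            for image in sorted(label_dir.rglob("*.jpg")):
                yield subset_dir.name, label_dir.name, image.relative_to(label_dir).as_posix()
```

For the three assets listed above, this would yield `2.jpg` and `subfolder/1.jpg` under `label_0`, and `3.jpg` under `label_1`, all in the `train` subset.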

Files selected for processing (8)

CHANGELOG.md (1 hunks)
datumaro/plugins/yolo_format/converter.py (2 hunks)
datumaro/plugins/yolo_format/extractor.py (9 hunks)
datumaro/plugins/yolo_format/format.py (1 hunks)
datumaro/plugins/yolo_format/importer.py (3 hunks)
site/content/en/docs/formats/yolo_v8_classification.md (1 hunks)
site/content/en/docs/user-manual/supported_formats.md (1 hunks)
tests/unit/data_formats/test_yolo_format.py (7 hunks)

Additional comments not posted (6)

datumaro/plugins/yolo_format/format.py (1)

31-32: LGTM!

The addition of the YOLOv8ClassificationFormat class and the IMAGE_DIR_NO_LABEL constant enhances the structure of the code by providing support for the YOLOv8 classification format and a mechanism to handle images without labels.

datumaro/plugins/yolo_format/importer.py (1)

114-123: LGTM!

The new YOLOv8ClassificationImporter class looks good:

It correctly extends the Importer class.

The find_sources method implementation is logically sound and handles the cases appropriately.

The changes enhance the functionality of the YOLO format importer to support classification data.

Great job!

site/content/en/docs/user-manual/supported_formats.md (1)

181-184: LGTM!

The new section for the YOLOv8 classification format is well-structured and consistent with the rest of the document. The provided links will be helpful for users to understand and utilize this format effectively.

CHANGELOG.md (1)

50-51: LGTM!

The changelog entry is well-formatted, descriptive, and consistent with the feature addition mentioned in the summary. Great job!

datumaro/plugins/yolo_format/extractor.py (1)

691-701: Verify Python version compatibility due to use of assignment expressions

The code uses assignment expressions (the walrus operator :=) in the comprehension at line 699, which requires Python 3.8 or newer. If the project needs to support earlier Python versions, this will cause syntax errors. Please verify the minimum required Python version for the project and adjust accordingly.

To confirm the project's Python version compatibility, run the following script:

Verification successful

Python version compatibility verified and consistent with codebase

The project's setup.py file specifies a minimum Python version of 3.8, which is appropriate for the use of assignment expressions (walrus operator). This requirement is consistent with the code in question and the usage of assignment expressions throughout the codebase, particularly in the yolo_format plugin.

Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Check for the minimum required Python version in project files
# Expected result: Python version should be >= 3.8 due to usage of assignment expressions

# Searching for 'python_requires' in 'setup.py' or 'pyproject.toml'
grep -r "python_requires" setup.py pyproject.toml || echo "python_requires not specified."

# If 'python_requires' is not specified, check for assignment expressions usage
grep -r ".*:=.*" datumaro/ || echo "No assignment expressions found."

Length of output: 973
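
For readers unfamiliar with the construct under discussion, here is a small sketch of an assignment expression inside a comprehension. It is not the code from extractor.py: `image_stem`, `index_images`, and `IMAGE_EXTENSIONS` are hypothetical names chosen for illustration. The syntax requires Python 3.8 or newer.

```python
from pathlib import PurePosixPath
from typing import Dict, Iterable, Optional

IMAGE_EXTENSIONS = {".jpg", ".png"}


def image_stem(path: str) -> Optional[str]:
    """Return the file stem for image paths, or None for anything else."""
    p = PurePosixPath(path)
    return p.stem if p.suffix.lower() in IMAGE_EXTENSIONS else None


def index_images(paths: Iterable[str]) -> Dict[str, str]:
    # The walrus operator binds the helper's result once per path, so the
    # filter and the dictionary key reuse it instead of calling the helper twice.
    return {stem: path for path in paths if (stem := image_stem(path)) is not None}
```

On Python 3.7 and earlier this module fails at import time with a SyntaxError, which is why the minimum version declared in setup.py matters.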
tests/unit/data_formats/test_yolo_format.py (1)

1804-1820: Verify the use of extractor name in dataset export

In the YOLOv8ClassificationExtractorTest.test_can_parse method, the dataset is exported using YOLOv8ClassificationExtractor.NAME. Since this is an extractor test, ensure that using the extractor's name for export is intended and functions correctly. Typically, a converter's name is used when exporting datasets.

Bobronium · 2024-09-23T12:07:38Z

Could you resolve the coderabbitai comments by either applying the suggestions or explaining why they are not applicable?

Bobronium · 2024-09-27T09:55:26Z

Black is failing, but otherwise LGTM!

Bobronium · 2024-10-03T13:27:25Z

datumaro/plugins/yolo_format/converter.py

+
+
+class YOLOv8ClassificationConverter(Converter):
+    # https://github.com/AlexeyAB/darknet#how-to-train-to-detect-your-custom-objects


The URL seems to be unrelated to YOLOv8. I think it would make sense to add a similar one for v8.

Removed the URL.
There is already a link to the official YOLOv8 docs in the Datumaro docs, so I don't see the point of adding a link to some unofficial guide here (also, I did not manage to find anything like this).

sonarcloud · 2024-10-04T09:26:51Z

Quality Gate passed

Issues
1 New issue
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarCloud

yolov8 classification

4c0d79b

changelog entry

5a38e5e

coderabbitai bot reviewed Sep 19, 2024

support export/import without media

08fef26

Eldies force-pushed the dl/yolo8-classification branch 3 times, most recently from 7e52b98 to 07734fe Compare September 24, 2024 11:38

coderabbitai suggestion

5495d24

Eldies force-pushed the dl/yolo8-classification branch from 07734fe to 5495d24 Compare September 24, 2024 11:48

Eldies added 8 commits September 25, 2024 12:21

small changes

30016eb

renaming a method according to what it does

d827926

method for extracting image name from path

cf627b9

using relative paths

9bcddf2

fixing exporting

84c3bb7

changing custom extension to save initial ids

d35ee96

update docs

d270b8f

do not create empty folders for labels when save_media=false

e1f3c7e

Bobronium reviewed Sep 26, 2024

using {} instead of OrderedDict

8aaf915

Bobronium approved these changes Sep 27, 2024

Eldies added 3 commits September 27, 2024 14:03

black fixes

55f3922

fixing sonarcloud issues

82ff677

removing subfolders on export

4fdbcc6

Bobronium reviewed Oct 3, 2024

removed a link

d64de2e

Bobronium approved these changes Oct 4, 2024

Eldies merged commit e612d1b into develop Oct 4, 2024
20 checks passed
