Ultralytics logo

🚀 Introduction

Welcome to the Ultralytics xview-yolov3 repository! This project provides the necessary code and instructions to train the powerful Ultralytics YOLOv3 object detection model on the challenging xView dataset. The primary goal is to support participants in the xView Challenge, which focuses on advancing the state-of-the-art in detecting objects within satellite imagery, a critical application of computer vision in remote sensing.

Ultralytics Actions Ultralytics Discord Ultralytics Forums Ultralytics Reddit

xView dataset example detections

📦 Requirements

To successfully run this project, ensure your environment meets the following prerequisites:

  • Python: Version 3.6 or later. You can download Python from the official Python website.
  • Dependencies: Install the required packages using pip. It's recommended to use a virtual environment.
pip3 install -U -r requirements.txt

Key dependencies include:

  • numpy: Essential for numerical operations in Python.
  • scipy: Provides algorithms for scientific and technical computing.
  • torch: The core PyTorch library for deep learning.
  • opencv-python: The OpenCV library for computer vision tasks.
  • h5py: Enables interaction with data stored in HDF5 format.
  • tqdm: A utility for displaying progress bars in loops and command-line interfaces.
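
After installation, a quick sanity check along the lines of the snippet below (an illustration, not a script shipped with this repository) confirms that the core dependencies import correctly and reports whether PyTorch can see a CUDA-capable GPU:

# Illustrative sanity check, not part of this repository.
import cv2
import h5py
import numpy as np
import scipy
import torch

for name, module in [("numpy", np), ("scipy", scipy), ("torch", torch),
                     ("opencv-python", cv2), ("h5py", h5py)]:
    print(f"{name}: {module.__version__}")

# Training is far faster on a GPU; report what PyTorch can see.
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))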

📥 Download Data

Begin by downloading the necessary xView dataset files. You can obtain the data directly from the xView Challenge data download page. Ensure you have sufficient storage space, as satellite imagery datasets can be quite large.

πŸ‹οΈβ€β™‚οΈ Training

Training the YOLOv3 model on the xView dataset involves preprocessing the data and then running the training script.

Preprocessing Steps

Before initiating the training process, we perform several preprocessing steps on the target labels to enhance model performance:

  1. Outlier Removal: Outliers in the dataset are identified and removed using sigma-rejection to clean the data.
  2. Anchor Generation: A new set of 30 k-means anchors is generated, tailored specifically to the c60_a30symmetric.cfg configuration file, using the MATLAB script utils/analysis.m. These anchors help the model better predict bounding boxes across the range of sizes and aspect ratios present in the xView dataset (a Python sketch of both preprocessing steps follows the anchor plot below).

k-means anchors plot
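
The repository performs this preprocessing with the MATLAB script noted above; the Python sketch below is only an illustration of the two ideas, assuming the xView labels have already been converted to an (N, 2) array of box widths and heights in pixels: boxes lying more than k standard deviations from the mean are rejected, and k-means then clusters the survivors into 30 anchors.

# Illustrative sketch only; the repository uses utils/analysis.m for this step.
import numpy as np
from scipy.cluster.vq import kmeans

def sigma_reject(wh, k=3.0):
    # Drop boxes whose width or height lies more than k sigma from the mean.
    mean, std = wh.mean(axis=0), wh.std(axis=0)
    keep = np.all(np.abs(wh - mean) <= k * std, axis=1)
    return wh[keep]

def kmeans_anchors(wh, n_anchors=30):
    # Cluster (width, height) pairs into n_anchors anchors, sorted by area.
    centroids, _ = kmeans(wh.astype(float), n_anchors)
    return centroids[np.argsort(centroids.prod(axis=1))]

wh = np.random.uniform(4, 300, size=(1000, 2))  # placeholder (width, height) data
anchors = kmeans_anchors(sigma_reject(wh), n_anchors=30)
print(anchors.round(1))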

Starting the Training

Once the xView data is downloaded and placed in the expected directory, you can start training by executing the train.py script. You will need to configure the path to your xView data within the script:

  • Modify line 41 for local machine execution.
  • Modify line 43 if you are training in a cloud environment such as Google Colab or Kaggle (a hypothetical illustration of this edit follows the command below).
python train.py
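
The exact variable names and surrounding logic depend on the version of train.py you have checked out; the lines below are only a hypothetical illustration of the kind of edit involved, pointing a path variable at your local copy of the data:

# Hypothetical illustration; open train.py and adjust the data path on the lines noted above.
train_path = '/path/to/xview/train_images/'    # line 41: local machine
# train_path = '/content/xview/'               # line 43: cloud environment (e.g. Colab, Kaggle)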

Resuming Training

If your training session is interrupted, you can easily resume training from the last saved checkpoint. Use the --resume flag as shown below:

python train.py --resume 1

The script will automatically load the weights from the latest.pt file located in the checkpoints/ directory and continue the training process.

Training Details

During each training epoch, 8 randomly sampled 608x608-pixel chips are extracted from each full-resolution image in the dataset. On hardware such as an NVIDIA GTX 1080 Ti, you can typically complete around 100 epochs per day.
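
As a rough sketch of what this chipping amounts to (the actual sampling logic lives in datasets.py), each draw simply crops a random 608x608 window from the full-resolution image:

# Illustrative sketch of random 608x608 chip sampling; datasets.py implements the real pipeline.
import numpy as np

def random_chip(image, chip_size=608):
    # Return a random chip_size x chip_size crop from an H x W x C image
    # (assumes the image is at least chip_size pixels in each dimension).
    h, w = image.shape[:2]
    y0 = np.random.randint(0, h - chip_size + 1)
    x0 = np.random.randint(0, w - chip_size + 1)
    return image[y0:y0 + chip_size, x0:x0 + chip_size]

# e.g. 8 chips per image per epoch, as described above:
# chips = [random_chip(full_res_image) for _ in range(8)]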

Be mindful of overfitting, which can become a significant issue after approximately 200 epochs. Monitoring validation metrics is crucial. The best observed validation mean Average Precision (mAP) in experiments was 0.16 after 300 epochs (roughly 3 days of training), corresponding to a training mAP of 0.30.

Monitor the training progress by observing the loss plots for bounding box regression, objectness, and class confidence. These plots should ideally show decreasing trends, similar to the example below:

xView training loss plot

Image Augmentation 📸

To improve model robustness and generalization, the datasets.py script uses OpenCV to apply a set of augmentations to the full-resolution input images during training. The specific augmentations and their parameters are:

Augmentation      Description
Translation       +/- 1% (vertical and horizontal)
Rotation          +/- 20 degrees
Shear             +/- 3 degrees (vertical and horizontal)
Scale             +/- 30%
Reflection        50% probability (vertical and horizontal)
HSV Saturation    +/- 50%
HSV Intensity     +/- 50%

Note: Augmentation is applied only during the training phase. During inference or validation, the original, unaugmented images are used. The corresponding bounding box coordinates are automatically adjusted to match the transformations applied to the images. Explore more augmentation techniques with Albumentations.
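
As a rough sketch of how augmentations like these can be composed with OpenCV (the repository's actual implementation and parameter handling live in datasets.py), the snippet below applies an approximate affine warp for translation, rotation, shear, and scale, plus random reflection and HSV jitter; in the real pipeline the label coordinates must be transformed with the same matrix:

# Illustrative OpenCV augmentation sketch; see datasets.py for the actual implementation.
import cv2
import numpy as np

def random_affine(img, degrees=20, translate=0.01, scale=0.30, shear=3):
    # Rotation/scale matrix about the image center, with a small random
    # translation and an approximate shear term added on top.
    h, w = img.shape[:2]
    angle = np.random.uniform(-degrees, degrees)
    s = 1 + np.random.uniform(-scale, scale)
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, s)
    M[0, 2] += np.random.uniform(-translate, translate) * w  # horizontal shift
    M[1, 2] += np.random.uniform(-translate, translate) * h  # vertical shift
    M[0, 1] += np.tan(np.radians(np.random.uniform(-shear, shear)))  # approximate shear
    return cv2.warpAffine(img, M, (w, h), borderValue=(128, 128, 128))

def random_flip(img):
    # Horizontal and vertical reflection, each with 50% probability.
    if np.random.rand() < 0.5:
        img = img[:, ::-1]
    if np.random.rand() < 0.5:
        img = img[::-1, :]
    return np.ascontiguousarray(img)

def random_hsv(img, sat_gain=0.5, val_gain=0.5):
    # Jitter saturation and intensity by +/- 50% in HSV space.
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV).astype(np.float32)
    hsv[..., 1] *= 1 + np.random.uniform(-sat_gain, sat_gain)
    hsv[..., 2] *= 1 + np.random.uniform(-val_gain, val_gain)
    return cv2.cvtColor(np.clip(hsv, 0, 255).astype(np.uint8), cv2.COLOR_HSV2BGR)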

πŸ” Inference

After training completes, the model checkpoints (.pt files) containing the learned weights are saved in the checkpoints/ directory. You can use the detect.py script to perform inference on new or existing xView images using your trained model.

For example, to run detection on the image 5.tif from the training set using the best performing weights (best.pt), you would run:

python detect.py --weights checkpoints/best.pt --source path/to/5.tif

The script will process the image, detect objects, draw bounding boxes, and save the output image. An example output might look like this:

Example inference output on xView image

πŸ“ Citation

If you find this repository, the associated tools, or the xView dataset useful in your research or work, please consider citing the relevant sources:

DOI

For the xView dataset itself, please refer to the citation guidelines provided on the xView Challenge website.

👥 Contribute

🤝 We thrive on community contributions! Open-source projects like this benefit greatly from your input. Whether it's fixing bugs, adding features, or improving documentation, your help is valuable. Please see our Contributing Guide for more details on how to get started.

We also invite you to share your feedback through our Survey. Your insights help shape the future of Ultralytics projects.

A huge thank you 🙏 to all our contributors for making our community vibrant and innovative!

Ultralytics open-source contributors

📜 License

Ultralytics offers two licensing options to accommodate different needs:

  • AGPL-3.0 License: Ideal for students, researchers, and enthusiasts, this OSI-approved open-source license promotes open collaboration and knowledge sharing. See the LICENSE file for full details.
  • Enterprise License: Designed for commercial use, this license allows integration of Ultralytics software and AI models into commercial products and services without the open-source requirements of AGPL-3.0. If your project requires an Enterprise License, please contact us via Ultralytics Licensing.

📬 Contact

For bug reports, feature requests, or suggestions, please use the GitHub Issues page. For general questions, discussions, and community interaction, join our Discord server!


Ultralytics GitHub · Ultralytics LinkedIn · Ultralytics Twitter · Ultralytics YouTube · Ultralytics TikTok · Ultralytics BiliBili · Ultralytics Discord