This repository contains the code and data needed to reproduce the results of the paper Deep Neural Networks for Active Wave Breaking Classification, published in Scientific Reports.
Deep Neural Networks for Active Wave Breaking Classification
- 1. Dependencies
- 2. Data
- 3. Training
- 4. Model Performance
- 5. Using a Pre-trained Neural Network
- 6. Model Interpretation
- 7. Active Wave Breaking Segmentation
- 8. Wave Tracking
- 9. Wave Breaking Statistics
- Gallery
- Standard Variable Names
- Disclaimer
Using conda:
- Windows
conda env create -f environment_win.yml
- Linux
conda env create -f environment_linux.yml
Package list
# install GIT
conda install git
# create a new environment
conda create --name tf python=3
# activate your new environment
conda activate tf
# If you have an NVIDIA GPU installed and properly configured
pip install --upgrade pip
pip install tensorflow
# extras
pip install -q git+https://github.com/tensorflow/examples.git
pip install tensorflow_addons
# conda packages
conda install -y numba natsort pandas scikit-learn scikit-image matplotlib seaborn netCDF4 xarray ipython tqdm
pip install h5netcdf
# Extra thresholding methods
pip install pythreshold
# fitting circles to data
pip install miniball
# parallel computations
pip install pebble
- Refer to Manual Data Preparation.
Dataset | Link | Alternative link |
---|---|---|
Train (10k) | - | |
Train (20k) | - | |
Test (1k) | - | |
Black Sea (200k) | Upcoming | - |
La Jument 2019 (10k) | Upcoming | - |
Note: The training dataset used here is a smaller version (10k) of the published dataset so that it can run on Google Colab. The 20k dataset takes over 6 hours to train, and Google Colab will disconnect your session before training finishes.
The data must be in a folder that has two sub-folders named "0" and "1".
For example:
train
├───0
├───1
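Before training, it can help to confirm that the folder layout is in place and to see how balanced the two classes are. The sketch below is not part of this repository: the helper name `count_class_images` and the set of accepted image extensions are assumptions made for illustration.

```python
from pathlib import Path

# Assumed image extensions; adjust to match your data.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def count_class_images(root, classes=("0", "1")):
    """Count image files inside each class sub-folder of `root`."""
    root = Path(root)
    counts = {}
    for label in classes:
        folder = root / label
        if not folder.is_dir():
            raise FileNotFoundError(f"Missing class sub-folder: {folder}")
        counts[label] = sum(
            1
            for f in folder.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
        )
    return counts
```

Calling `count_class_images("train")` returns a dictionary with one count per class folder and raises an error if either sub-folder is missing.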
There are 5 backbones implemented: VGG16, ResNet50V2, InceptionResNetV2, MobileNetV2 and EfficientNet.
Note that the weights of these pre-trained models are reset and trained from scratch here. The pre-trained models have no knowledge of the present data and, consequently, transfer learning does not work well.
Example
python train.py --data "train/" --backbone "VGG16" --model "vgg_test" --logdir "logs/" --random-state 11 --validation-size 0.2 --learning-rate 0.00001 --epochs 200 --batch-size 64 --dropout 0.5 --input-size 256 256
Arguments:
- `--data` Input train data path.
- `--model` Model name.
- `--backbone` Which backbone to use. See above.
- `--random-state` Random seed for reproducibility. Default is 11.
- `--validation-size` Size of the validation dataset. Default is 0.2.
- `--epochs` Number of epochs (iterations) to train the model. Default is 200.
- `--batch-size` Number of images to process in each step. Decrease if running into memory issues. Default is 64.
- `--dropout` Dropout rate. Default is 0.5.
- `--input-size` Image input size. Decrease if running into memory issues. Default is 256x256px.
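The documented options and defaults can be summarized as a command-line parser. The sketch below is a hypothetical reconstruction from the argument list above, not the parser used by the training script. The function name `build_parser` is an assumption, and the `--logdir` and `--learning-rate` flags that appear in the example command are left out because their defaults are not documented here.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented training options."""
    parser = argparse.ArgumentParser(description="Sketch of the training options.")
    parser.add_argument("--data", required=True, help="Input train data path.")
    parser.add_argument("--model", required=True, help="Model name.")
    parser.add_argument("--backbone", required=True, help="Which backbone to use.")
    parser.add_argument("--random-state", type=int, default=11,
                        help="Random seed for reproducibility.")
    parser.add_argument("--validation-size", type=float, default=0.2,
                        help="Size of the validation dataset.")
    parser.add_argument("--epochs", type=int, default=200,
                        help="Number of epochs to train the model.")
    parser.add_argument("--batch-size", type=int, default=64,
                        help="Number of images to process in each step.")
    parser.add_argument("--dropout", type=float, default=0.5,
                        help="Dropout rate.")
    parser.add_argument("--input-size", type=int, nargs=2, default=[256, 256],
                        help="Image input size in pixels.")
    return parser


# Only the required options are given, so every other value is a default.
args = build_parser().parse_args(
    ["--data", "train/", "--model", "vgg_test", "--backbone", "VGG16"]
)
print(args.batch_size, args.input_size)
```

Note that argparse converts dashes to underscores, so `--batch-size` is read back as `args.batch_size`.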
The neural network looks something like this:
Please use the links below to download pre-trained models:
Scientific Reports (20K dataset)
Model | Link | Alternative link |
---|---|---|
VGG16 | - | |
ResNet50V2 | - | |
InceptionResNetV2 | - | |
MobileNet | - | |
EfficientNet | - | |
La Jument (10K dataset)
Model | Link | Alternative link |
---|---|---|
VGG16 | Upcoming | - |
ResNet50V2 | Upcoming | - |
InceptionResNetV2 | Upcoming | - |
MobileNet | Upcoming | - |
EfficientNet | Upcoming | - |
Note: Work in progress.
To evaluate a pre-trained model on test data, use the test script.
Example:
python test.py --data "path/to/test/data/" --model "VGG16.h5" --threshold 0.5 --output "path/to/results.csv"
Arguments:
- `--data` Input test data. Use the same structure as when training.
- `--model` Pre-trained model.
- `--threshold` Threshold for binary classification. Default is 0.5.
- `--output` Path to save the results.
The classification report will be printed on the screen. For example:
precision recall f1-score support
0.0 0.88 0.99 0.94 1025
1.0 0.87 0.23 0.36 175
accuracy 0.88 1200
macro avg 0.88 0.61 0.65 1200
weighted avg 0.88 0.88 0.85 1200
To summarize the model metrics, run:
python metrics.py --data "path/to/data/" --model "VGG16.h5" --threshold 0.5 --output "path/to/metrics.csv"
The arguments are the same as above.
The results look something like this:
VGG16 | Binary_Accuracy | True_Positives | False_Positives | True_Negatives | False_Negatives | Precision | Recall | AUC |
---|---|---|---|---|---|---|---|---|
Train | 0.89 | 771.00 | 248.00 | 5680.00 | 521.00 | 0.76 | 0.60 | 0.92 |
Validation | 0.87 | 100.00 | 19.00 | 1463.00 | 222.00 | 0.84 | 0.31 | 0.90 |
Test | 0.88 | 40.00 | 6.00 | 1019.00 | 135.00 | 0.87 | 0.23 | 0.82 |
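The accuracy, precision and recall columns follow directly from the four confusion-matrix counts. The sketch below recomputes them for the Test row of the table above using the standard definitions. The function name `summarize` is an assumption for illustration, and AUC is not included because it cannot be derived from counts at a single threshold.

```python
def summarize(tp, fp, tn, fn):
    """Compute accuracy, precision and recall from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Counts taken from the Test row of the table above.
test_row = summarize(tp=40, fp=6, tn=1019, fn=135)
print({name: round(value, 2) for name, value in test_row.items()})
# → {'accuracy': 0.88, 'precision': 0.87, 'recall': 0.23}
```

The low recall next to a high accuracy reflects the class imbalance: most samples are negatives, so a model can miss many positives and still score well on accuracy.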
To plot the training curves and a confusion matrix, do:
python plot_history_and_confusion_matrix.py --history "path/to/history.csv" --results "path/to/results.csv" --output "figure.png"
Arguments:
- `--history` Training history. Comes from `train_wave_breaking_classifier_v2.py`.
- `--results` Classification results from the test data. Comes from `test_wave_breaking_classifier.py`.
- `--output` Figure name.
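The tables below summarize the results presented in the paper. The Train and Test tables are sorted by AUC in descending order; the Validation table uses the same model order as the Test table.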
Train
Model | Accuracy | TP | FP | TN | FN | Precision | Recall | AUC |
---|---|---|---|---|---|---|---|---|
ResNet50V2 | 0.97 | 1414 | 198 | 13978 | 280 | 0.877 | 0.835 | 0.989 |
VGG16 | 0.93 | 855 | 273 | 13911 | 831 | 0.758 | 0.507 | 0.943 |
InceptionResNetV2 | 0.927 | 886 | 359 | 13823 | 802 | 0.712 | 0.525 | 0.932 |
EfficientNet | 0.772 | 1403 | 3346 | 10920 | 297 | 0.295 | 0.825 | 0.874 |
MobileNet | 0.904 | 436 | 268 | 13916 | 1250 | 0.619 | 0.259 | 0.848 |
Validation
Model | Accuracy | TP | FP | TN | FN | Precision | Recall | AUC |
---|---|---|---|---|---|---|---|---|
VGG16 | 0.932 | 221 | 65 | 3478 | 204 | 0.773 | 0.52 | 0.946 |
ResNet50V2 | 0.919 | 197 | 97 | 3450 | 224 | 0.67 | 0.468 | 0.873 |
InceptionResNetV2 | 0.921 | 190 | 81 | 3466 | 231 | 0.701 | 0.451 | 0.93 |
EfficientNet | 0.809 | 353 | 687 | 2856 | 72 | 0.339 | 0.831 | 0.897 |
MobileNet | 0.908 | 123 | 64 | 3479 | 302 | 0.658 | 0.289 | 0.878 |
Test
Model | Accuracy | TP | FP | TN | FN | Precision | Recall | AUC |
---|---|---|---|---|---|---|---|---|
VGG16 | 0.876 | 106 | 80 | 945 | 69 | 0.57 | 0.606 | 0.855 |
ResNet50V2 | 0.881 | 95 | 63 | 962 | 80 | 0.601 | 0.543 | 0.843 |
InceptionResNetV2 | 0.882 | 91 | 57 | 968 | 84 | 0.615 | 0.52 | 0.839 |
EfficientNet | 0.873 | 88 | 65 | 960 | 87 | 0.575 | 0.503 | 0.827 |
MobileNet | 0.875 | 30 | 5 | 1020 | 145 | 0.857 | 0.171 | 0.768 |
Create a dataset either manually or with the provided tools, then use the predict script. The data structure is as follows:
pred
├───images
├───img_00001.png
├───img_00002.png
├───...
├───img_0000X.png
Example:
python predict.py --data "pred/" --model "VGG16.h5" --threshold 0.5 --output "results.csv"
Arguments:
- `--data` Input test data.
- `--model` Pre-trained model.
- `--threshold` Threshold for binary classification. Default is 0.5.
- `--output` A CSV file with the classification results.
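The threshold turns the probability produced by the network into a binary label. The sketch below illustrates the idea under two assumptions that are not confirmed by this repository: class 1 is taken to mean active wave breaking, and a probability is labeled 1 only when it is strictly greater than the threshold. The function name `classify` is hypothetical.

```python
def classify(probabilities, threshold=0.5):
    """Label a probability 1 when it exceeds the threshold, otherwise 0."""
    # Assumption: strict comparison; the script may treat ties differently.
    return [1 if p > threshold else 0 for p in probabilities]


probabilities = [0.12, 0.48, 0.51, 0.97]
print(classify(probabilities))                 # → [0, 0, 1, 1]
print(classify(probabilities, threshold=0.9))  # → [0, 0, 0, 1]
```

Raising the threshold generally trades recall for precision, since fewer samples are labeled as positive.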
Use the predict from naive candidates script with the results from the naive wave breaking detector and a pre-trained neural network to obtain only active wave breaking instances. This script runs on CPU but can be much faster on GPU.
Example:
python predict_from_naive_candidates.py --debug --input "naive_results.csv" --model "path/to/model.h5" --frames "path/to/frames/folder/" --region-of-interest "region_of_interest.csv" --output "robust_results.csv" --temporary-path "tmp" --frames-to-plot 1000 --threshold 0.5
Arguments:
- `--debug` Runs in debug mode and will save output plots.
- `-i [--input]` Input data obtained from `naive_wave_breaking_detector`.
- `-m [--model]` Pre-trained TensorFlow model.
- `-o [--output]` Output file name (see below for explanation).
- `-frames [--frames]` Input path with images.
- `--region-of-interest` File with region of interest. Use minimum bounding geometry to generate a valid input file.
- `--temporary-path` Output path for debug plots.
- `--frames-to-process` Number of frames to process.
- `--from-frame` Start frame.
- `--regex` Regular expression to find input frames. Default is `"[0-9]{6,}"`.
- `--threshold` Threshold for activation in the last (sigmoid) layer of the model. Default is 0.5.

Note: The input data must have at least the following entries: `ic`, `jc`, `ir`, and `frame`.
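The default regular expression `[0-9]{6,}` matches a run of six or more digits. The sketch below shows one way such a pattern could be used to pick a frame number out of a file name and to order frames. How the script applies the pattern internally is an assumption, and the file names and the helper `frame_number` are hypothetical.

```python
import re

# Default pattern from the argument list: six or more consecutive digits.
FRAME_PATTERN = re.compile(r"[0-9]{6,}")


def frame_number(filename):
    """Return the first run of six or more digits as an integer, or None."""
    match = FRAME_PATTERN.search(filename)
    return int(match.group()) if match else None


# Hypothetical file names, for illustration only.
names = ["frame_000012.png", "frame_000003.png", "notes.txt"]
frames = [name for name in names if frame_number(name) is not None]
ordered = sorted(frames, key=frame_number)
print(ordered)  # → ['frame_000003.png', 'frame_000012.png']
```

A name with fewer than six digits, such as `img_00001.png`, is not matched by the default pattern, so a custom `--regex` would be needed for such files.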
The output of this script is a comma-separated values (CSV) file. It looks exactly like the output of the naive wave breaking detector, with an extra column that holds the classification results.
Plot the results of the wave breaking detection algorithms. Can handle outputs of any algorithm, as long as the input data is correct. Ideally, the results from `cluster.py` are used as input.
Example:
python plot_wave_breaking_detection_results.py --input "clustered_events.csv" --output "path/to/output/" --frames "path/to/frames/" --region-of-interest "path/to/roi.csv" --frames-to-plot 1000
Arguments:
- `-i [--input]` Input CSV file.
- `-o [--output]` Output path.
- `-frames-path` Path with frames.
- `--region-of-interest` File with region of interest. Use minimum bounding geometry to generate a valid input file.
- `--frames-to-plot` Number of frames to plot.
- `--from-frame` Start frame.
- `--regex` Regular expression to find input frames. Default is `"[0-9]{6,}"`.
Note: The input data must have at least the following entries: `ic`, `jc`, `ir`, `frame`, and `wave_breaking_event`.
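A quick check of the CSV header can catch a missing column before plotting. The sketch below uses only the standard library. The helper name `missing_columns` is an assumption and is not part of this repository.

```python
import csv
import io

# Column names required by the note above.
REQUIRED_COLUMNS = ("ic", "jc", "ir", "frame", "wave_breaking_event")


def missing_columns(csv_file, required=REQUIRED_COLUMNS):
    """Return the required column names absent from the CSV header."""
    reader = csv.DictReader(csv_file)
    header = set(reader.fieldnames or [])
    return sorted(set(required) - header)


# In-memory example with made-up values; a real file would be opened with open().
sample = io.StringIO("ic,jc,ir,frame\n120,340,15,0\n")
print(missing_columns(sample))  # → ['wave_breaking_event']
```

An empty list means that every required column is present.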
Use `interpret.py` to apply Grad-CAM to data samples. Organize your data as follows:
gradcam
├───images
├───img_00001.png
├───img_00002.png
├───...
├───img_0000X.png
Example:
python interpret.py --data "path/to/gradcam" --model "VGG16.h5" -o "path/to/output"
Arguments:
- `--data` Input image data path.
- `-o [--output]` Output path.
- `--model` Pre-trained VGG16 model.
Note: This script will only work with VGG16 models.
The neural networks developed here can also be used for image segmentation. Please refer to Image Segmentation.
Please refer to Wave Tracking.
Please refer to Wave Breaking Statistics.
La Jument:
Black Sea:
Aqua Alta:
Grad-CAM (Black Sea):
Image Segmentation (La Jument):
The following variables are standard across this repository and scripts that output these quantities should use these names. If a given script has extra output variables, these are documented in each script.
Variables
Variable | Description |
---|---|
`x` | x-coordinate in metric coordinates. |
`y` | y-coordinate in metric coordinates. |
`z` | z-coordinate in metric coordinates. |
`time` | date and time. Use a format that `pandas.to_datetime()` can understand. |
`frame` | sequential number. |
`i` | pixel coordinate in pixel units. Use Matplotlib coordinate system. |
`j` | pixel coordinate in pixel units. Use Matplotlib coordinate system. |
`ic` | center of a circle or ellipse in pixel coordinates. |
`jc` | center of a circle or ellipse in pixel coordinates. |
`xc` | center of a circle or ellipse in metric coordinates. |
`yc` | center of a circle or ellipse in metric coordinates. |
`ir` | radius in the i-direction. |
`jr` | radius in the j-direction. |
`xr` | radius in the x-direction. |
`yr` | radius in the y-direction. |
`theta_ij` | counter-clockwise angle of rotation of an ellipse with respect to the i-axis. |
`theta_xy` | counter-clockwise angle of rotation of an ellipse with respect to the x-axis. |
`wave_breaking_event` | unique wave breaking event id. |
`vx` | velocity in the x-direction in m/s. |
`vy` | velocity in the y-direction in m/s. |
`vi` | velocity in the i-direction in pixels/frame. |
`vj` | velocity in the j-direction in pixels/frame. |
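As an illustration of how the ellipse variables fit together, the sketch below tests whether a pixel lies inside an ellipse described by `ic`, `jc`, `ir`, `jr` and `theta_ij`. It rests on assumptions that this document does not confirm: the angle is taken to be in degrees, and the rotation is applied as a plain rotation of the (i, j) offsets. The function name `inside_ellipse` is hypothetical.

```python
import math


def inside_ellipse(i, j, ic, jc, ir, jr, theta_ij=0.0):
    """Return True if pixel (i, j) lies inside or on the ellipse."""
    # Assumption: theta_ij is given in degrees.
    angle = math.radians(theta_ij)
    di, dj = i - ic, j - jc
    # Rotate the offset into the ellipse's own axes.
    u = di * math.cos(angle) + dj * math.sin(angle)
    v = -di * math.sin(angle) + dj * math.cos(angle)
    return (u / ir) ** 2 + (v / jr) ** 2 <= 1.0


# Ellipse centered at (10, 10), with radius 5 along i and 2 along j.
print(inside_ellipse(14, 10, ic=10, jc=10, ir=5, jr=2))  # → True
print(inside_ellipse(10, 14, ic=10, jc=10, ir=5, jr=2))  # → False
```

With `theta_ij=90` the long axis swaps direction, so the second point falls inside the ellipse instead.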
There is no warranty for the program, to the extent permitted by applicable law. Except when otherwise stated in writing, the copyright holders and/or other parties provide the program “as is” without warranty of any kind, either expressed or implied, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose. The entire risk as to the quality and performance of the program is with you. Should the program prove defective, you assume the cost of all necessary servicing, repair or correction.