This is a fork of the original implementation at https://github.com/munhouiani/Deep-Packet.
- SMOTE implementation: `create_train_test_set.py`
- Train/test data reporting: `data_reports.py`
- Test and collect metrics to evaluate model performance: `test_cnn.py`
- Precision-Recall curves: `ml/metrics.py`
## Create an environment via conda

For Linux (CUDA 11.6):

```bash
conda env create -f env_linux_cuda116.yaml
```
## Download the pre-processed dataset

Download the pre-processed small dataset, create a directory called `processed_small`, and extract the contents of the downloaded archive into it:

```bash
mkdir processed_small
tar -xvzf processed_small.tar.gz -C processed_small
```
## Create the train and test set

With undersampling:

```bash
python create_train_test_set.py --source ~/datasets/processed_small --train ~/datasets/undersampled_train_split --test ~/datasets/test_split --class_balancing under_sampling
```
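The `under_sampling` option balances classes by discarding samples from the larger ones. As a minimal sketch of the idea in plain NumPy (not the repository's implementation, which operates on the parquet dataset), every class can be randomly downsampled to the size of the rarest class:

```python
import numpy as np

def undersample(X, y, seed=0):
    """Randomly keep n_min samples per class, where n_min is the
    size of the rarest class, so all classes end up balanced."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    n_min = counts.min()
    keep = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=n_min, replace=False)
        for c in classes
    ])
    return X[keep], y[keep]
```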
With SMOTE combined with undersampling, the following parameters were used:

- Minority classes (c): 2
- Nearest neighbors (k): 5
- Amount of SMOTE (n): 1, 2, 3, 4, 5

For example, with c=2, n=2, k=5:

```bash
python create_train_test_set.py --source ~/datasets/processed_small --train ~/datasets/smote_c2_n2_k5_train_split --test ~/datasets/test_split --class_balancing SMOTE+under_sampling -c 2 -n 2 -k 5 -t app --skip_test 1
```
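SMOTE (Chawla et al.) creates synthetic minority samples by interpolating between a minority point and one of its `k` nearest minority-class neighbours, `n` times per point. A minimal NumPy sketch of the technique — not the code in `create_train_test_set.py`:

```python
import numpy as np

def smote(X, n=2, k=5, seed=0):
    """Generate n synthetic samples per minority point by interpolating
    toward one of its k nearest minority neighbours (SMOTE)."""
    rng = np.random.default_rng(seed)
    # pairwise distances within the minority class
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)           # a point is not its own neighbour
    nn = np.argsort(d, axis=1)[:, :k]     # k nearest neighbours per point
    synthetic = []
    for i in range(len(X)):
        for _ in range(n):
            j = rng.choice(nn[i])         # pick a random neighbour
            gap = rng.random()            # interpolation factor in [0, 1)
            synthetic.append(X[i] + gap * (X[j] - X[i]))
    return np.asarray(synthetic)
```

Each synthetic point lies on the line segment between a real sample and one of its neighbours, which is why the new samples stay inside the minority-class region instead of being plain duplicates.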
## Train the model

Application classification:

```bash
python train_cnn.py -d ~/datasets/smote_c2_n1_k5_train_split/application_classification/train.parquet -m model/application_classification.cnn.model.smote.c2n1k5 -t app
```
## Test the model

Application classification:

```bash
python test_cnn.py -d ~/datasets/test_split/application_classification/test.parquet -m model/application_classification.cnn.model.smote.c2n1k5 -t app -p c2n1k5
```
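The Precision-Recall curves are produced by `ml/metrics.py`. The computation behind a PR curve can be sketched one-vs-rest in plain NumPy (the repository's code may differ): sort samples by descending score and accumulate true and false positives at each threshold.

```python
import numpy as np

def pr_curve(scores, y_true):
    """One-vs-rest precision/recall at every descending score threshold.
    scores: per-sample score for the positive class; y_true: 0/1 labels."""
    order = np.argsort(-scores)
    y = y_true[order]
    tp = np.cumsum(y)        # true positives if we cut after each sample
    fp = np.cumsum(1 - y)    # false positives at the same cut
    precision = tp / (tp + fp)
    recall = tp / y.sum()
    return precision, recall
```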
## Data reports

Plot the class distribution of the application-classification test set:

```bash
python data_reports.py -p /path/to/datasets/test_split/application_classification/test.parquet -t app -o app_test_data_dist.png
```
## Preprocessing

Pre-process raw pcap files:

```bash
python preprocessing.py -s /path/to/pcap_files -t /path/to/datasets/processed_new
```
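At its core, Deep Packet-style preprocessing maps each packet to a fixed-length, normalised byte vector (the paper uses 1500 bytes). A sketch of that single step, assuming the 1500-byte length; the actual `preprocessing.py` additionally handles pcap parsing, header masking, and packet filtering:

```python
import numpy as np

PACKET_LEN = 1500  # fixed input length from the Deep Packet paper (assumption)

def packet_to_vector(payload: bytes) -> np.ndarray:
    """Truncate or zero-pad the packet bytes to PACKET_LEN and
    scale each byte to [0, 1] so it can feed the CNN."""
    buf = np.frombuffer(payload[:PACKET_LEN], dtype=np.uint8)
    vec = np.zeros(PACKET_LEN, dtype=np.float32)
    vec[: len(buf)] = buf / 255.0
    return vec
```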