Generalizing intrusion detection for heterogeneous networks: A stacked-unsupervised federated learning approach

This repository accompanies our paper describing a stacked-unsupervised federated learning (FL) approach that generalizes a flow-based network intrusion detection system (NIDS) across a cross-silo configuration. The proposed approach combines a deep autoencoder with an energy-based flow classifier (EFC) in an ensemble learning task.

Our approach outperforms traditional local learning and naive cross-evaluation (training in one context and testing on data from another network). Notably, it performs well on non-IID data silos. Together with an informative feature in an ensemble architecture for unsupervised learning, we argue that the proposed FL-based NIDS is a feasible approach for generalization across heterogeneous networks.

Reproducing this work

Install the requirements to reproduce this work:

Tested with Python 3.9.11

$ python -m venv venv
$ source venv/bin/activate
(venv) $ pip install --upgrade pip
(venv) $ pip install Cython
(venv) $ pip install -r requirements.txt

Choose one of the experiments: the full datasets* (run_full.sh), the reduced datasets (run_reduced.sh), or the sampled datasets (run_sampled.sh). For instance, to run the reduced datasets:

(venv) $ chmod +x run_reduced.sh
(venv) $ ./run_reduced.sh

* The full datasets are not part of this repository; see the instructions inside the full_datasets folder on how to download them.

Some possible configurations

To simulate other federated learning aggregation strategies, change server.py according to the Flower documentation.
To run the autoencoder without the EFC, remove the --with-EFC argument from the shell script files.
To select between the benign-only threshold and the benign-and-attack thresholds for the autoencoder, edit client.py: assigning test_eval the result of the distance_calc method uses both thresholds, while assigning it the comparison of the losses against threshold_benign uses only the benign threshold.
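The two threshold options can be illustrated with a minimal sketch. It is not the code in client.py: the function names, the example loss values, the threshold values, and the nearest-threshold interpretation of the two-threshold case are all assumptions made for illustration. Check distance_calc in client.py for the actual rule.

```python
# Hypothetical sketch of the two decision rules; names and values are
# illustrative assumptions, not taken from client.py.

def benign_only_rule(losses, threshold_benign):
    """Flag a flow as an attack (1) when its reconstruction loss
    exceeds the benign threshold, otherwise benign (0)."""
    return [1 if loss > threshold_benign else 0 for loss in losses]


def nearest_threshold_rule(losses, threshold_benign, threshold_attack):
    """Assumed two-threshold rule: label each flow by whichever
    threshold its reconstruction loss is closer to (0 = benign, 1 = attack)."""
    return [
        1 if abs(loss - threshold_attack) < abs(loss - threshold_benign) else 0
        for loss in losses
    ]


# Made-up autoencoder reconstruction losses for four flows.
losses = [0.02, 0.05, 0.30, 0.90]

print(benign_only_rule(losses, threshold_benign=0.10))
print(nearest_threshold_rule(losses, threshold_benign=0.10, threshold_attack=0.80))
```

In this toy example the third flow (loss 0.30) is flagged by the benign-only rule but not by the two-threshold rule, which shows how the choice can shift borderline flows between classes.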

Content of this repository

.
├── baselines.py	 --> calculate the baselines over sampled datasets
├── baselines_reduced.py --> calculate the baselines over reduced datasets
├── client.py		 --> the source code for federated learning clients
├── error_analysis	 --> data used for error analysis
├── Error Analysis.ipynb --> notebook with error analysis
├── full_datasets	 --> reference for downloading the full datasets
├── README.md		 --> this README
├── reduced_datasets	 --> the reduced datasets (*.csv.gz)
├── requirements.txt	 --> requirements of libraries and specific versions
├── run_full.sh		 --> to execute the proposed method over full datasets
├── run_reduced.sh	 --> to execute the proposed method over reduced datasets
├── run_sampled.sh	 --> to execute the proposed method over sampled datasets
├── sampled_datasets	 --> the sampled datasets (*.csv.gz)
├── server.py		 --> the source code for federated learning server
└── utils
    ├── generate_reduced_datasets.py	--> generate reduced datasets
    ├── load_data.py			--> code for loading datasets
    └── model.py			--> code for autoencoder

Cite this

@article{10.1016/j.cose.2023.103106,
title = {Generalizing intrusion detection for heterogeneous networks: A stacked-unsupervised federated learning approach},
journal = {Computers \& Security},
pages = {103106},
year = {2023},
issn = {0167-4048},
doi = {https://doi.org/10.1016/j.cose.2023.103106},
url = {https://www.sciencedirect.com/science/article/pii/S0167404823000160},
author = {Gustavo {de Carvalho Bertoli} and Lourenço Alves {Pereira Junior} and Osamu Saotome and Aldri Luiz {dos Santos}},
keywords = {Network Intrusion Detection, Generalization, Unsupervised Learning, Federated Learning, Network flows}
}
