Malicious traffic detection

Overview

A tool for network traffic analysis, DDoS mitigation, and malicious behavior detection, built on NFStream, XGBoost, and integrated threat intelligence.

The graph shows the number of detected threats over time, with the x-axis in seconds and the y-axis showing the number of detections.

Features

Network traffic analysis

The tool provides comprehensive network traffic analysis capabilities through PCAP processing and network interface monitoring using NFStream.

Summary dashboard showing aggregated flow statistics grouped by source IP addresses, including total bytes transferred, packet counts, and flow duration
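The per-source aggregation shown in the dashboard can be sketched roughly as follows. This is an illustration, not the tool's actual implementation: the helper names `aggregate_by_source` and `flows_from_pcap` are hypothetical, while `src_ip`, `bidirectional_bytes`, `bidirectional_packets`, and `bidirectional_duration_ms` are flow attributes exposed by NFStream.

```python
from collections import defaultdict
from types import SimpleNamespace


def aggregate_by_source(flows):
    """Sum flow count, bytes, packets, and duration per source IP."""
    stats = defaultdict(lambda: {"flows": 0, "bytes": 0, "packets": 0, "duration_ms": 0})
    for flow in flows:
        entry = stats[flow.src_ip]
        entry["flows"] += 1
        entry["bytes"] += flow.bidirectional_bytes
        entry["packets"] += flow.bidirectional_packets
        entry["duration_ms"] += flow.bidirectional_duration_ms
    return dict(stats)


def flows_from_pcap(path):
    """Return an iterable of flows read from a PCAP file (requires nfstream)."""
    from nfstream import NFStreamer  # lazy import: the helper above has no dependencies

    return NFStreamer(source=path)


# Stand-in flows using the same attribute names, so the sketch runs without a capture file.
sample_flows = [
    SimpleNamespace(src_ip="10.0.0.1", bidirectional_bytes=1200,
                    bidirectional_packets=10, bidirectional_duration_ms=50),
    SimpleNamespace(src_ip="10.0.0.1", bidirectional_bytes=800,
                    bidirectional_packets=6, bidirectional_duration_ms=30),
    SimpleNamespace(src_ip="10.0.0.2", bidirectional_bytes=300,
                    bidirectional_packets=2, bidirectional_duration_ms=5),
]
summary = aggregate_by_source(sample_flows)
```

With a real capture, `aggregate_by_source(flows_from_pcap("path/to/file.pcap"))` would produce the same structure.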

Flow statistics and pattern detection

The tool aggregates and analyzes flow statistics to reveal network behavior patterns. For threat detection, it supports several methods: custom NFStream plugins, Sigma rules, and machine learning-based classification.

Detection results using Sigma rules, showing triggered alerts with their associated network flows and timestamps
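To give a feel for how a Sigma-style rule maps onto flow records, here is a deliberately simplified matcher. It only handles a single `selection` block with `condition: selection` (fields are ANDed, list values are alternatives) and ignores Sigma modifiers; it is not the engine the tool uses. The rule, its field names, and the sample flows are hypothetical.

```python
def matches_selection(selection, flow):
    """True if every field in the selection matches; list values are alternatives."""
    for field, expected in selection.items():
        allowed = expected if isinstance(expected, list) else [expected]
        if flow.get(field) not in allowed:
            return False
    return True


# Hypothetical rule, shown as the dict a YAML parser would produce.
rule = {
    "title": "Telnet or SSH connection",
    "detection": {
        "selection": {"dst_port": [22, 23], "protocol": 6},
        "condition": "selection",
    },
}

# Hypothetical flow records keyed by field name.
flows = [
    {"src_ip": "198.51.100.5", "dst_port": 23, "protocol": 6},
    {"src_ip": "198.51.100.9", "dst_port": 443, "protocol": 6},
    {"src_ip": "198.51.100.7", "dst_port": 22, "protocol": 17},
]

alerts = [flow for flow in flows
          if matches_selection(rule["detection"]["selection"], flow)]
```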

Machine learning classification

The generated report includes the model's prediction for each flow, expressed as a confidence score.

Model predictions with confidence scores for each analyzed network flow
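One plausible way to turn a model output into a labeled prediction with a confidence score is sketched below. It assumes the model outputs the probability that a flow is malicious (as an XGBoost model trained with the `binary:logistic` objective does) and that a 0.5 threshold is used; both are assumptions, and `to_prediction` and `score_flows` are hypothetical helper names.

```python
def to_prediction(probability, threshold=0.5):
    """Map P(malicious) to a label plus the confidence in that label."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    malicious = probability >= threshold
    confidence = probability if malicious else 1.0 - probability
    return {"label": "malicious" if malicious else "benign",
            "confidence": round(confidence, 4)}


def score_flows(model_json, feature_rows):
    """Return P(malicious) per row from a saved model (requires xgboost and numpy).

    Assumes the columns of feature_rows are in the order used at training time.
    """
    import numpy as np
    import xgboost as xgb

    booster = xgb.Booster()
    booster.load_model(model_json)
    data = xgb.DMatrix(np.asarray(feature_rows, dtype=float),
                       feature_names=booster.feature_names)
    return booster.predict(data)


high = to_prediction(0.9)
low = to_prediction(0.2)
```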

Enrichment

The system enriches detections with additional threat intelligence data from ip-api.com for IP geolocation and GreyNoise for known malicious activity.

Geographical and network information for detected IP addresses, including country, ISP, and ASN details
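A geolocation lookup against ip-api.com could look roughly like this. The sketch uses the public JSON endpoint and its `fields` parameter; the helper names and the choice to skip non-global addresses are assumptions, not the tool's actual enrichment code.

```python
import ipaddress
import json
from urllib.request import urlopen

FIELDS = "status,message,country,isp,as,query"


def build_url(ip):
    """Return the ip-api.com JSON URL for an IP, or None for non-global addresses."""
    address = ipaddress.ip_address(ip)  # raises ValueError on malformed input
    if not address.is_global:
        return None  # private, loopback, and reserved ranges cannot be geolocated
    return f"http://ip-api.com/json/{address}?fields={FIELDS}"


def parse_response(payload):
    """Keep country, ISP, and ASN; return None if the lookup failed."""
    if payload.get("status") != "success":
        return None
    return {
        "ip": payload.get("query"),
        "country": payload.get("country"),
        "isp": payload.get("isp"),
        "asn": payload.get("as"),
    }


def lookup(ip, timeout=5):
    """Query ip-api.com for one address (performs a network request)."""
    url = build_url(ip)
    if url is None:
        return None
    with urlopen(url, timeout=timeout) as response:
        return parse_response(json.load(response))


# Example payload in the shape the API returns, used here instead of a live request.
example = parse_response({
    "status": "success", "country": "United States",
    "isp": "Google LLC", "as": "AS15169 Google LLC", "query": "8.8.8.8",
})
```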

Geographic visualization

For geographic analysis, the tool generates world maps showing the origins of suspicious IP addresses.

Interactive world map highlighting countries of origin for detected suspicious IP addresses, with color intensity indicating detection frequency
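The color intensity on the map can be thought of as a per-country detection count scaled to the busiest country. The sketch below shows one way to compute it; linear scaling and the `country` key on each detection are assumptions, and the map rendering itself is left out.

```python
from collections import Counter


def country_intensity(detections):
    """Return {country: intensity in (0, 1]} scaled to the most frequent country."""
    counts = Counter(d["country"] for d in detections if d.get("country"))
    if not counts:
        return {}
    peak = max(counts.values())
    return {country: count / peak for country, count in counts.items()}


# Hypothetical enriched detections.
detections = [
    {"ip": "198.51.100.5", "country": "Poland"},
    {"ip": "198.51.100.6", "country": "Poland"},
    {"ip": "198.51.100.7", "country": "Germany"},
    {"ip": "10.0.0.4", "country": None},  # no geolocation available, skipped
]
intensity = country_intensity(detections)
```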

Installation

Install dependencies and synchronize the environment:

uv sync

Usage

Analyze network traffic

Run the analyzer on a PCAP file or network interface:

uv run -m mtd path/to/file.pcap

Usage: python -m mtd [OPTIONS] SOURCE

Arguments:
  source      TEXT  Input source such as a PCAP file or network interface  [required]

Options:
  --plugins PATH          Directories or files to load plugins from  [default: None]
  --sigma-paths PATH      Directories or files to load Sigma rules from  [default: None]
  --model-path PATH       Path to the model directory containing model.json and metadata.json  [default: None]
  --default-plugins TEXT  Specify which default plugins to load  [default: Sigma, GeoIP, ML, GreyNoise]
  --output PATH           Output file to write detections to  [default: None]
  --greynoise-api-key     GreyNoise API key  [default: None]
  --draw-map/--no-draw-map
                          Whether to plot detections on a map  [default: no-draw-map]
  --install-completion    Install shell completion for the CLI.
  --show-completion       Show completion for the current shell.
  --help                  Show this message and exit.
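When the analyzer needs to be driven from another script, the command line can be assembled programmatically. The sketch below only uses options listed in the help output above; the wrapper functions and the example paths `rules/` and `detections.json` are hypothetical.

```python
import subprocess


def build_command(source, sigma_paths=None, model_path=None, output=None, draw_map=False):
    """Build the argument list for `uv run -m mtd` from documented options."""
    command = ["uv", "run", "-m", "mtd", source]
    if sigma_paths:
        command += ["--sigma-paths", sigma_paths]
    if model_path:
        command += ["--model-path", model_path]
    if output:
        command += ["--output", output]
    command.append("--draw-map" if draw_map else "--no-draw-map")
    return command


def run_analysis(source, **options):
    """Run the analyzer and raise CalledProcessError on a non-zero exit code."""
    return subprocess.run(build_command(source, **options), check=True)


command = build_command(
    "path/to/file.pcap",
    sigma_paths="rules/",
    output="detections.json",
    draw_map=True,
)
```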

Plugin system

The tool uses NFStream's plugin system for detection rules. Plugins can be loaded from:

- The plugins/ directory in the main application folder
- Custom paths specified via the --plugins CLI option
- Installed Python modules prefixed with mtd_
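A custom detection plugin built on NFStream's `NFPlugin` base class might look like the sketch below. `on_init`, `on_update`, `on_expire`, and `flow.udps` are part of NFStream's plugin interface; the `SynCounter` class, its `syn_threshold` parameter, and the way it marks a flow as suspicious are hypothetical, since the conventions this tool expects from its plugins are not described here. The fallback stub only exists so the example runs without nfstream installed.

```python
from types import SimpleNamespace

try:
    from nfstream import NFPlugin
except ImportError:
    class NFPlugin:  # minimal stand-in: stores keyword arguments as attributes
        def __init__(self, **kwargs):
            for key, value in kwargs.items():
                setattr(self, key, value)


class SynCounter(NFPlugin):
    """Count TCP SYN packets per flow and flag flows that reach a threshold."""

    def on_init(self, packet, flow):
        # Called for the first packet of a flow.
        flow.udps.syn_count = 1 if packet.syn else 0
        flow.udps.syn_suspect = False

    def on_update(self, packet, flow):
        # Called for every later packet of the flow.
        if packet.syn:
            flow.udps.syn_count += 1

    def on_expire(self, flow):
        # Called once when the flow ends.
        flow.udps.syn_suspect = flow.udps.syn_count >= self.syn_threshold


# Simulate one flow with stand-in packet and flow objects.
plugin = SynCounter(syn_threshold=3)
flow = SimpleNamespace(udps=SimpleNamespace())
plugin.on_init(SimpleNamespace(syn=True), flow)
for syn_flag in (True, False, True):
    plugin.on_update(SimpleNamespace(syn=syn_flag), flow)
plugin.on_expire(flow)
```

With nfstream installed, the plugin would be attached with `NFStreamer(source="path/to/file.pcap", udps=SynCounter(syn_threshold=3))`.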

Data preparation

Converting PCAP to flow data (CSV)

Process malicious traffic (label=1):

find data/malicious -name '*.pcap' | xargs -n 1 uv run python pcap2flow.py 1

Process benign traffic (label=0):

find data/benign -name '*.pcap' | xargs -n 1 uv run python pcap2flow.py 0
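Conceptually, the conversion step writes one CSV row per flow and appends the label passed on the command line. The sketch below illustrates that idea only; it is not the contents of pcap2flow.py. The column selection is hypothetical (the names are NFStream flow attributes), and only the 'Label' column name comes from the training documentation.

```python
import csv
import io
from types import SimpleNamespace

# Hypothetical subset of flow attributes to export.
FIELDS = ["src_ip", "dst_ip", "dst_port", "protocol",
          "bidirectional_packets", "bidirectional_bytes"]


def write_labeled_flows(flows, label, handle):
    """Write flows as CSV rows with a trailing Label column; return the row count."""
    if label not in (0, 1):
        raise ValueError("label must be 0 (benign) or 1 (malicious)")
    writer = csv.DictWriter(handle, fieldnames=FIELDS + ["Label"])
    writer.writeheader()
    count = 0
    for flow in flows:
        row = {name: getattr(flow, name) for name in FIELDS}
        row["Label"] = label
        writer.writerow(row)
        count += 1
    return count


# Stand-in flow; in practice the flows would come from NFStreamer(source=pcap_path).
sample = [SimpleNamespace(src_ip="10.0.0.1", dst_ip="10.0.0.2", dst_port=80,
                          protocol=6, bidirectional_packets=12,
                          bidirectional_bytes=3400)]
buffer = io.StringIO()
written = write_labeled_flows(sample, 1, buffer)
lines = buffer.getvalue().splitlines()
```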

Data preprocessing

Prepare data for model training:

uv run -m scripts.preprocess_data

Machine learning

The system implements a machine learning pipeline using XGBoost for traffic classification. The model demonstrates high accuracy in distinguishing between benign and malicious network flows.

Example decision tree from the XGBoost model ensemble, showing the decision paths for traffic classification

Initial model training

Train a new model:

uv run -m scripts.train

Available options:

uv run -m scripts.train --help

options:
  -h, --help            show this help message and exit
  --input INPUT         Path to input CSV file with training data
  --raw-data-dir RAW_DATA_DIR
                        Directory containing raw data files (default: data/raw)
  --processed-data-dir PROCESSED_DATA_DIR
                        Directory for processed data files (default: data/processed)
  --models-dir MODELS_DIR
                        Directory for saving model artifacts (default: models)
  --test-size TEST_SIZE
                        Proportion of data for testing (default: 0.2)
  --random-state RANDOM_STATE
                        Random state for reproducibility (default: 42)
  --target-benign-ratio TARGET_BENIGN_RATIO
                        Target ratio of benign traffic in the dataset (default: 0.7)
  --min-class-ratio MIN_CLASS_RATIO
                        Minimum acceptable ratio for any class (default: 0.1)
  --model-name MODEL_NAME
                        Name of the model (default: xgboost_binary)
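The `--target-benign-ratio` and `--min-class-ratio` options suggest a class-balancing step before training. One possible interpretation, sketched below, downsamples whichever class is over-represented until benign samples make up roughly the target share, then rejects the dataset if either class falls below the minimum ratio. This is an assumption about the options' meaning, not the training script's actual logic, and `balance` is a hypothetical helper.

```python
import random


def balance(benign, malicious, target_benign_ratio=0.7,
            min_class_ratio=0.1, random_state=42):
    """Downsample the over-represented class toward the target benign ratio."""
    if not benign or not malicious:
        raise ValueError("both benign and malicious samples are required")
    if not 0.0 < target_benign_ratio < 1.0:
        raise ValueError("target_benign_ratio must be between 0 and 1")
    rng = random.Random(random_state)  # seeded for reproducibility
    total = len(benign) + len(malicious)
    if len(benign) / total > target_benign_ratio:
        # Too many benign samples: keep only enough to reach the target share.
        keep = int(target_benign_ratio * len(malicious) / (1 - target_benign_ratio))
        benign = rng.sample(benign, min(len(benign), max(keep, 1)))
    else:
        # Too many malicious samples: keep only enough to reach the target share.
        keep = int(len(benign) * (1 - target_benign_ratio) / target_benign_ratio)
        malicious = rng.sample(malicious, min(len(malicious), max(keep, 1)))
    total = len(benign) + len(malicious)
    if min(len(benign), len(malicious)) / total < min_class_ratio:
        raise ValueError("a class falls below the minimum class ratio")
    return benign, malicious


# 100 benign and 10 malicious placeholder samples.
kept_benign, kept_malicious = balance(list(range(100)), list(range(100, 110)))
benign_share = len(kept_benign) / (len(kept_benign) + len(kept_malicious))
```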

Model retraining

Update an existing model with new data:

uv run -m scripts.retrain \
  --model-path models/development/xgboost_20241222_225105_v1 \
  --input data/processed/combined_flows.csv

Retraining options:

uv run -m scripts.retrain --help

options:
  -h, --help            show this help message and exit
  --model-path MODEL_PATH
                        Path to the existing model directory
  --input INPUT [INPUT ...]
                        Path(s) to input CSV file(s) with a 'Label' column (0 or 1)
  --test-size TEST_SIZE
                        Proportion of data for testing (default: 0.2)
  --random-state RANDOM_STATE
                        Random state for reproducibility (default: 42)

Notes

- Input CSVs for training must include a binary 'Label' column (0 = benign, 1 = malicious).
- Training data must contain both benign and malicious samples.
- Model versions increment automatically (v1 → v2, etc.).
- Training artifacts are saved in version-specific directories.
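The notes above can be checked before starting a training run. The sketch below validates the 'Label' column of a CSV and derives the next version suffix from a model directory name. Both helpers are hypothetical, and the assumption that only the trailing `_vN` suffix changes between versions is a guess based on the example path `xgboost_20241222_225105_v1`.

```python
import csv
import io
import re


def check_labels(csv_text):
    """Verify a binary 'Label' column with both classes present; return class counts."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames is None or "Label" not in reader.fieldnames:
        raise ValueError("missing 'Label' column")
    counts = {"0": 0, "1": 0}
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        value = (row["Label"] or "").strip()
        if value not in counts:
            raise ValueError(f"line {line_no}: Label must be 0 or 1, got {value!r}")
        counts[value] += 1
    if 0 in counts.values():
        raise ValueError("training data must contain both benign and malicious samples")
    return {"benign": counts["0"], "malicious": counts["1"]}


def next_version(directory_name):
    """Increment a trailing _vN suffix, e.g. ..._v1 becomes ..._v2."""
    match = re.search(r"_v(\d+)$", directory_name)
    if match is None:
        raise ValueError("directory name has no _vN suffix")
    return f"{directory_name[:match.start()]}_v{int(match.group(1)) + 1}"


class_counts = check_labels("bidirectional_bytes,Label\n1200,0\n90000,1\n300,0\n")
upgraded = next_version("xgboost_20241222_225105_v1")
```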
