An advanced solution for network traffic analysis, DDoS mitigation, and malicious behavior detection, leveraging NFStream, XGBoost, and integrated threat intelligence.

hackerman70000/Malicious-traffic-detection

Malicious traffic detection

Overview

An advanced solution for network traffic analysis, DDoS mitigation, and malicious behavior detection, leveraging NFStream, XGBoost, and integrated threat intelligence.

Real-time visualization of detected threats over time. The graph shows the evolution of detected threats over time, with the x-axis representing seconds and the y-axis showing the number of detections.

Features

Network traffic analysis

The tool provides comprehensive network traffic analysis capabilities through PCAP processing and network interface monitoring using NFStream.

Network flow statistics summary. Summary dashboard showing aggregated flow statistics grouped by source IP address, including total bytes transferred, packet counts, and flow duration.
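
As a rough illustration of the NFStream side, the sketch below reads flows from a PCAP file or a network interface with `NFStreamer` and reduces each flow to a few fields. The helper names `flow_summary` and `iter_flow_summaries` and the choice of fields are assumptions made for this example, not the tool's actual code. The `nfstream` package must be installed for `iter_flow_summaries` to run.

```python
def flow_summary(flow):
    """Reduce a flow object to a small dict (the field choice is illustrative)."""
    return {
        "src_ip": flow.src_ip,
        "dst_ip": flow.dst_ip,
        "dst_port": flow.dst_port,
        "packets": flow.bidirectional_packets,
        "bytes": flow.bidirectional_bytes,
        "duration_ms": flow.bidirectional_duration_ms,
    }


def iter_flow_summaries(source):
    """Yield a summary for every flow read from a PCAP path or interface name."""
    # Imported lazily so flow_summary can be used without nfstream installed.
    from nfstream import NFStreamer

    for flow in NFStreamer(source=source):
        yield flow_summary(flow)
```

With NFStream installed, `for row in iter_flow_summaries("path/to/file.pcap"): print(row)` would print one dict per flow.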

Flow statistics and pattern detection

It performs flow statistics aggregation and analysis to understand network behavior patterns. For threat detection, the system supports multiple detection methods including custom NFStream plugins, Sigma rules, and machine learning-based classification.

Results from Sigma rule detections. Detection results using Sigma rules, showing triggered alerts with their associated network flows and timestamps.
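
The flow statistics aggregation mentioned in this section can be pictured as a group-by over source IP addresses. The following is a minimal sketch under stated assumptions: the `Flow` record and its field names are hypothetical stand-ins for real NFStream flows, and the tool's own aggregation may differ.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Flow:
    """Hypothetical flow record; real NFStream flows carry many more fields."""
    src_ip: str
    total_bytes: int
    packets: int
    duration_ms: int


def aggregate_by_source(flows):
    """Sum bytes, packets and duration, and count flows, per source IP."""
    stats = defaultdict(
        lambda: {"flows": 0, "bytes": 0, "packets": 0, "duration_ms": 0}
    )
    for flow in flows:
        entry = stats[flow.src_ip]
        entry["flows"] += 1
        entry["bytes"] += flow.total_bytes
        entry["packets"] += flow.packets
        entry["duration_ms"] += flow.duration_ms
    return dict(stats)


flows = [
    Flow("10.0.0.5", 1200, 10, 300),
    Flow("10.0.0.5", 800, 6, 150),
    Flow("192.0.2.7", 64, 1, 0),
]
summary = aggregate_by_source(flows)
print(summary["10.0.0.5"])
```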

Machine learning classification

The generated report includes the model's predictions as confidence scores.

XGBoost model output showing detection confidence. Model predictions with confidence scores for each analyzed network flow.

Enrichment

The system enriches detections with additional threat intelligence data from ip-api.com for IP geolocation and GreyNoise for known malicious activity.

GeoIP enrichment data. Geographical and network information for detected IP addresses, including country, ISP, and ASN details.
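
To give an idea of what the GeoIP lookup involves, here is a hedged sketch against the public ip-api.com JSON endpoint. The function names and the selected fields are assumptions for illustration, and the tool's own enrichment plugin may be structured differently. Only `lookup` performs a network request.

```python
import json
import urllib.request

# Fields requested from ip-api.com; "as" holds the AS number and name.
FIELDS = "status,message,country,isp,as,query"


def build_url(ip):
    """Build an ip-api.com JSON request URL limited to the fields we need."""
    return f"http://ip-api.com/json/{ip}?fields={FIELDS}"


def parse_geoip(payload):
    """Map an ip-api.com response dict to enrichment fields, or None on failure."""
    if payload.get("status") != "success":
        return None
    return {
        "ip": payload.get("query"),
        "country": payload.get("country"),
        "isp": payload.get("isp"),
        "asn": payload.get("as"),
    }


def lookup(ip, timeout=5.0):
    """Query ip-api.com for one address (performs a network request)."""
    with urllib.request.urlopen(build_url(ip), timeout=timeout) as response:
        return parse_geoip(json.load(response))
```

Keeping URL construction and response parsing separate from the request makes the mapping easy to check offline.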

Geographic visualization

For geographic analysis, the tool generates world maps showing the origins of suspicious IP addresses.

World map visualization of threat origins. Interactive world map highlighting countries of origin for detected suspicious IP addresses, with color intensity indicating detection frequency.
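
Since color intensity reflects detection frequency, the input to such a map is essentially a per-country count. A minimal sketch, assuming each detection is a dict with an optional `country` key (a hypothetical shape, not the tool's actual data model):

```python
from collections import Counter


def detections_per_country(detections):
    """Count detections per country, skipping entries without GeoIP data."""
    return Counter(d["country"] for d in detections if d.get("country"))
```

The resulting counts could then be passed to whatever plotting library draws the map.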

Installation

Install dependencies and synchronize the environment:

uv sync

Usage

Analyze network traffic

Run the analyzer on a PCAP file or network interface:

uv run -m mtd path/to/file.pcap

Available options:

Usage: python -m mtd [OPTIONS] SOURCE

Arguments:
  source      TEXT  Input source such as a PCAP file or network interface  [required]

Options:
  --plugins PATH          Directories or files to load plugins from  [default: None]
  --sigma-paths PATH      Directories or files to load Sigma rules from  [default: None]
  --model-path PATH       Path to the model directory containing model.json and metadata.json  [default: None]
  --default-plugins TEXT  Specify which default plugins to load  [default: Sigma, GeoIP, ML, GreyNoise]
  --output PATH           Output file to write detections to  [default: None]
  --greynoise-api-key     GreyNoise API key  [default: None]
  --draw-map/--no-draw-map
                          Whether to plot detections on a map  [default: no-draw-map]
  --install-completion    Install shell completion for the CLI.
  --show-completion       Show completion for the current shell.
  --help                  Show this message and exit.
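
The `--model-path` option expects a directory containing `model.json` and `metadata.json`. A small pre-flight check along these lines can catch a wrong path before a run. This is a sketch, and the helper name `missing_model_files` is hypothetical.

```python
from pathlib import Path

# File names taken from the --model-path help text above.
REQUIRED_FILES = ("model.json", "metadata.json")


def missing_model_files(model_dir):
    """Return the required files that are absent from model_dir."""
    directory = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (directory / name).is_file()]
```

An empty list means the directory has the layout the CLI describes.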

Plugin system

The tool uses NFStream's plugin system for detection rules. Plugins can be loaded from:

  1. plugins/ directory in the main application folder
  2. Custom paths specified via --plugins CLI option
  3. Installed Python modules prefixed with mtd_
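
For orientation, an NFStream plugin is a subclass of `NFPlugin` that overrides hooks such as `on_init` and `on_update` and stores its results on `flow.udps`. The sketch below is a hypothetical example (the `SynCounter` class, its `syn_limit` parameter, and the attribute names are invented here). How this tool expects plugins to report detections is not documented above, so treat the `suspicious` flag as an assumption.

```python
try:
    from nfstream import NFPlugin
except ImportError:
    # Fallback so the sketch can be exercised without nfstream installed.
    class NFPlugin:
        def __init__(self, **kwargs):
            for key, value in kwargs.items():
                setattr(self, key, value)


class SynCounter(NFPlugin):
    """Hypothetical plugin: count SYN-only packets and flag noisy flows."""

    def on_init(self, packet, flow):
        # Called with the first packet of a new flow.
        flow.udps.syn_only_packets = 0
        flow.udps.suspicious = False
        self.on_update(packet, flow)

    def on_update(self, packet, flow):
        # Called with each later packet of the flow (and by on_init above).
        if packet.syn and not packet.ack:
            flow.udps.syn_only_packets += 1
        limit = getattr(self, "syn_limit", 3)
        flow.udps.suspicious = flow.udps.syn_only_packets > limit
```

With NFStream installed, such a plugin can be attached via `NFStreamer(source=..., udps=SynCounter(syn_limit=3))`.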

Data preparation

Converting PCAP to flow data (CSV)

Process malicious traffic (label=1):

find data/malicious -name '*.pcap' -print0 | xargs -0 -n 1 uv run python pcap2flow.py 1

Process benign traffic (label=0):

find data/benign -name '*.pcap' -print0 | xargs -0 -n 1 uv run python pcap2flow.py 0
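
The internals of `pcap2flow.py` are not shown here, but the labeling step it performs can be sketched as writing flow records to CSV with a binary `Label` column appended. The function name and the flow dict shape below are assumptions for illustration only.

```python
import csv


def write_labeled_flows(flows, label, out):
    """Write flow dicts as CSV rows, appending a binary 'Label' column."""
    if label not in (0, 1):
        raise ValueError("label must be 0 (benign) or 1 (malicious)")
    flows = list(flows)
    if not flows:
        return 0
    # Column order follows the keys of the first flow, with Label last.
    fieldnames = list(flows[0]) + ["Label"]
    writer = csv.DictWriter(out, fieldnames=fieldnames)
    writer.writeheader()
    for flow in flows:
        writer.writerow({**flow, "Label": label})
    return len(flows)
```

Passing an open file (or an `io.StringIO` buffer) as `out` keeps the function independent of where the CSV ends up.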

Data preprocessing

Prepare data for model training:

uv run -m scripts.preprocess_data

Machine learning

The system implements a machine learning pipeline using XGBoost for traffic classification. The model demonstrates high accuracy in distinguishing between benign and malicious network flows.

XGBoost decision tree visualization. Example decision tree from the XGBoost model ensemble, showing the decision paths for traffic classification.

Initial model training

Train a new model:

uv run -m scripts.train

Available options:

uv run -m scripts.train --help

options:
  -h, --help            show this help message and exit
  --input INPUT         Path to input CSV file with training data
  --raw-data-dir RAW_DATA_DIR
                        Directory containing raw data files (default: data/raw)
  --processed-data-dir PROCESSED_DATA_DIR
                        Directory for processed data files (default: data/processed)
  --models-dir MODELS_DIR
                        Directory for saving model artifacts (default: models)
  --test-size TEST_SIZE
                        Proportion of data for testing (default: 0.2)
  --random-state RANDOM_STATE
                        Random state for reproducibility (default: 42)
  --target-benign-ratio TARGET_BENIGN_RATIO
                        Target ratio of benign traffic in the dataset (default: 0.7)
  --min-class-ratio MIN_CLASS_RATIO
                        Minimum acceptable ratio for any class (default: 0.1)
  --model-name MODEL_NAME
                        Name of the model (default: xgboost_binary)
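
To make `--target-benign-ratio`, `--test-size`, and `--random-state` more concrete, the sketch below shows one plausible reading of them: randomly undersample benign rows toward the target ratio, then hold out a test share. This is an assumption about the options' intent, not the script's actual balancing logic, and `balance_and_split` is a hypothetical name.

```python
import random


def balance_and_split(benign, malicious, target_benign_ratio=0.7,
                      test_size=0.2, random_state=42):
    """Undersample benign rows toward the target ratio, then split train/test."""
    rng = random.Random(random_state)
    # benign / (benign + malicious) = r  implies  benign = r / (1 - r) * malicious
    wanted = round(target_benign_ratio / (1 - target_benign_ratio) * len(malicious))
    kept_benign = rng.sample(benign, min(wanted, len(benign)))
    # Pair every row with its label: 0 = benign, 1 = malicious.
    rows = [(row, 0) for row in kept_benign] + [(row, 1) for row in malicious]
    rng.shuffle(rows)
    n_test = round(len(rows) * test_size)
    return rows[n_test:], rows[:n_test]
```

With 100 benign and 30 malicious rows and the defaults, 70 benign rows are kept, giving 80 training rows and 20 test rows. Fixing `random_state` makes the sampling and the split repeatable.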

Model retraining

Update an existing model with new data:

uv run -m scripts.retrain \
  --model-path models/development/xgboost_20241222_225105_v1 \
  --input data/processed/combined_flows.csv

Retraining options:

uv run -m scripts.retrain --help

options:
  -h, --help            show this help message and exit
  --model-path MODEL_PATH
                        Path to the existing model directory
  --input INPUT [INPUT ...]
                        Path(s) to input CSV file(s) with a 'Label' column (0 or 1)
  --test-size TEST_SIZE
                        Proportion of data for testing (default: 0.2)
  --random-state RANDOM_STATE
                        Random state for reproducibility (default: 42)

Notes

  • Input CSVs for training must include a binary 'Label' column (0 = benign, 1 = malicious)
  • Training data must contain both benign and malicious samples
  • Model versions increment automatically (v1 → v2, etc.)
  • Training artifacts are saved in version-specific directories
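
The notes above can be expressed as two small checks: one for the label requirements and one for the version bump. This is a hedged sketch. The function names are hypothetical, and the `_vN` suffix pattern is inferred from the example path `xgboost_20241222_225105_v1` rather than from the scripts themselves.

```python
import re


def validate_labels(labels):
    """Raise ValueError unless labels are binary and contain both classes."""
    seen = set(labels)
    unexpected = seen - {0, 1}
    if unexpected:
        raise ValueError(f"unexpected label values: {unexpected}")
    if seen != {0, 1}:
        raise ValueError("training data must contain both benign (0) and malicious (1) rows")


def next_version(name):
    """Return the name with its trailing _vN suffix incremented."""
    match = re.search(r"_v(\d+)$", name)
    if match is None:
        raise ValueError(f"no version suffix in {name!r}")
    return f"{name[:match.start()]}_v{int(match.group(1)) + 1}"
```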

Datasets

References
