The prototype source code of the paper:
Realtime Robust Malicious Traffic Detection via Frequency Domain Analysis
Chuanpu Fu, Qi Li, Meng Shen, Ke Xu.
ACM Conference on Computer and Communications Security (CCS 2021)
@inproceedings{CCS21-Whisper,
author = {Chuanpu Fu and
Qi Li and
Meng Shen and
Ke Xu},
title = {Realtime Robust Malicious Traffic Detection via Frequency Domain Analysis},
booktitle = {{CCS} '21: 2021 {ACM} {SIGSAC} Conference on Computer and Communications
Security, Virtual Event, Republic of Korea, November 15 - 19, 2021},
pages = {3431--3446},
publisher = {{ACM}},
year = {2021},
}
Malicious traffic detection systems are designed to identify malicious traffic on the forwarding path. As a promising security paradigm, machine learning (ML) has been leveraged to address zero-day attacks. Due to an improper trade-off between feature scale and efficiency, existing methods cannot achieve robust and realtime detection. We present frequency domain features, which reduce the scale of traditional per-packet features and avoid the information loss of flow-level features. Finally, in this repo we present the Whisper prototype, an end-to-end detector for a 10 Gb scale network.
For more details, please refer to our paper in ACM CCS 2021.
Feel free to contact me if something goes wrong.
Before installing the software, please check your hardware platform against the testbed setup in the paper. Here are some recommendations:
- Ensure all your NICs and CPUs support Intel DPDK. Find the versions using lspci and /proc/cpuinfo, and check the lists in DPDK Support.
- Check the connectivity of fiber and laser modules using ICMP echo and static routing. Note that direct connections are preferred to prevent errors.
- To adapt to the packet rate of the MAWI datasets, ensure the NICs support at least 10 Gbps throughput. Measuring the throughput using iperf3 is recommended.
- At least 10 GB of memory is needed for the DPDK huge pages, and the server for the Whisper main modules needs at least 17 cores.
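The checks in the list above can be partly scripted. The following is a minimal sketch, not part of the Whisper repo: the function names are hypothetical, the thresholds are copied from the recommendations, and the 2048 kB huge page size is an assumption that should be verified in /proc/meminfo on your machine.

```shell
#!/usr/bin/env bash
# Pre-flight hardware check (a sketch; thresholds come from the list above).
MIN_CORES=17     # cores needed by the Whisper main modules
MIN_MEM_GB=10    # memory needed for the DPDK huge pages

# meets_minimum VALUE MINIMUM: succeeds when the integer VALUE >= MINIMUM.
meets_minimum() {
    [ "$1" -ge "$2" ]
}

# hugepages_for_gb GB [PAGE_KB]: number of huge pages that cover GB gibibytes.
# The 2048 kB default page size is an assumption, not taken from the paper.
hugepages_for_gb() {
    local gb=$1 page_kb=${2:-2048}
    echo $(( gb * 1024 * 1024 / page_kb ))
}

# report LABEL VALUE MINIMUM: print one OK/WARN line for a measured value.
report() {
    if meets_minimum "$2" "$3"; then
        echo "OK   $1: $2 (need >= $3)"
    else
        echo "WARN $1: $2 (need >= $3)"
    fi
}

# List the NICs so their models can be compared with the DPDK support list.
if command -v lspci >/dev/null 2>&1; then
    lspci | grep -i 'ethernet' || echo "no Ethernet controller listed by lspci"
fi

# Print the CPU model for the same comparison.
if [ -r /proc/cpuinfo ]; then
    grep -m1 'model name' /proc/cpuinfo || true
fi

if command -v nproc >/dev/null 2>&1; then
    report "CPU cores" "$(nproc)" "$MIN_CORES"
fi

if [ -r /proc/meminfo ]; then
    mem_gb=$(awk '/^MemTotal:/ { printf "%d", $2 / 1024 / 1024 }' /proc/meminfo)
    report "memory (GB)" "$mem_gb" "$MIN_MEM_GB"
fi

echo "huge pages for ${MIN_MEM_GB} GB at 2048 kB each: $(hugepages_for_gb "$MIN_MEM_GB")"
```

The script only reads system information and changes nothing. Throughput still has to be measured between two hosts, for example by running iperf3 -s on one side and iperf3 -c with the server address on the other.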
- Install compile toolchain.
  The prototype was tested on Ubuntu 18.04 and 20.04. It is compiled with cmake + ninja + gcc. Please find the correct versions and install the toolchain using apt-get.
- Install DPDK.
  Whisper uses DPDK for high-speed packet parsing. Therefore, please refer to the DPDK Official Guide and install the libraries. Note that compatibility with DPDK 21 is unknown; the version listed in the paper is preferred.
- Install LibPcap++.
  Whisper uses LibPcap++, which encapsulates DPDK, to reduce the size of the source code. Make sure the LibPcap++ version is compatible with the DPDK version. Note that LibPcap++ with DPDK support can only be obtained by compiling from source. Here is the official guide for LibPcap++ Installation.
- Install PyTorch C++.
  Whisper uses PyTorch C++ to implement matrix and sequence transformations. Download the official release from PyTorch Release. The CPU-only build is sufficient; make sure you select the version with cxx11 ABI support.
- Install mlpack.
  Whisper uses mlpack for unsupervised learning. Please use the correct commands for the C++ stable version in mlpack Installation.
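As a quick sanity check for the toolchain step above, the sketch below reports which build tools are missing from PATH. The helper name missing_tools is hypothetical, and the apt-get package names in the hint are the usual Ubuntu ones rather than something this repo prescribes.

```shell
#!/usr/bin/env bash
# missing_tools TOOL...: print each TOOL that is not found on PATH, one per line.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Build tools named in the list above (g++ is the C++ front end of gcc).
missing=$(missing_tools cmake ninja gcc g++)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing build tools:" $missing
    echo "On Ubuntu these usually come from: sudo apt-get install cmake ninja-build build-essential"
else
    echo "Toolchain found: $(cmake --version | head -n1), ninja $(ninja --version), gcc $(gcc -dumpversion)"
fi
```

The script does not install anything; it only prints a hint. DPDK, LibPcap++, PyTorch C++ and mlpack still need to be installed by following their own guides.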
First, check that the path of the downloaded PyTorch C++ library is configured correctly in CMakeLists.txt. Then compile the prototype source code.
mkdir build && cd $_
cmake -G Ninja ..
ninja
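Before running the build, it may help to confirm that the PyTorch C++ directory is really an unpacked libtorch release. The sketch below is an illustration only: the helper has_torch_config and the LIBTORCH_DIR variable are hypothetical, /home/libtorch is just an example location, and how CMakeLists.txt in this repo refers to the path is not assumed here.

```shell
#!/usr/bin/env bash
# has_torch_config DIR: succeeds when DIR contains the CMake package file
# that ships with libtorch releases (share/cmake/Torch/TorchConfig.cmake).
has_torch_config() {
    [ -f "$1/share/cmake/Torch/TorchConfig.cmake" ]
}

# Example location only; override it by exporting LIBTORCH_DIR.
LIBTORCH_DIR=${LIBTORCH_DIR:-/home/libtorch}

if has_torch_config "$LIBTORCH_DIR"; then
    echo "libtorch found in $LIBTORCH_DIR"
else
    echo "TorchConfig.cmake not found under $LIBTORCH_DIR; check the path in CMakeLists.txt"
fi
```

If the check fails, the download is probably unpacked elsewhere or incomplete, and the path in CMakeLists.txt should be adjusted accordingly.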
- Strange link stage warnings. After compiling, we got the warning from ld shown below, but ninja generated the binary successfully. What is the impact of this anomaly?
/usr/bin/ld: /home/libtorch/lib/libtorch_cpu.so: .dynsym local symbol at index 149 (>= sh_info of 2)
Answer: The link stage warning is caused by a mismatch between the compiler versions used to build PyTorch and Whisper. You can look for a closer version, but in my experience the warning has no side effects.
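To see how far apart the two compiler versions are, something like the following sketch can be used. It is an assumption-laden illustration: the helper names are hypothetical, the library path is the one that appears in the warning above, and the .comment section may be absent from a given build, in which case readelf reports that nothing was dumped.

```shell
#!/usr/bin/env bash
# major_version VERSION: print the part of VERSION before the first dot.
major_version() {
    echo "${1%%.*}"
}

# same_major A B: succeeds when both version strings share a major version.
same_major() {
    [ "$(major_version "$1")" = "$(major_version "$2")" ]
}

# Path taken from the ld warning above; override it via LIBTORCH_SO.
LIBTORCH_SO=${LIBTORCH_SO:-/home/libtorch/lib/libtorch_cpu.so}

# Version of the local compiler used to build Whisper.
if command -v gcc >/dev/null 2>&1; then
    echo "local gcc: $(gcc -dumpversion)"
fi

# The .comment section, when present, usually names the compiler that
# produced the shared library.
if command -v readelf >/dev/null 2>&1 && [ -r "$LIBTORCH_SO" ]; then
    readelf -p .comment "$LIBTORCH_SO" || true
fi
```

Once a version has been read from the readelf output by hand, it can be compared with the local one, for example same_major "$(gcc -dumpversion)" 9.3.0, where 9.3.0 is only a placeholder value.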
- On the feasibility of deploying Whisper in the cloud.
Answer: I have tried to deploy it on AWS EC2 and other commercial clouds. I eventually succeeded after considerable effort, but still could not reach the throughput measured on the physical testbed due to the performance limitations of virtual network interfaces. Therefore, I do not recommend the deployment in a multi-tenant network because the . If you have any advice, please contact us.