This repo contains the code for reproducing the experiments of the paper AutoPenBench: Benchmarking Generative Agents for Penetration Testing.
If you use this code in your research, please cite the following paper:
```bibtex
@misc{gioacchini2024autopenbench,
      title={AutoPenBench: Benchmarking Generative Agents for Penetration Testing},
      author={Luca Gioacchini and Marco Mellia and Idilio Drago and Alexander Delsanto and Giuseppe Siracusano and Roberto Bifulco},
      year={2024},
      eprint={2410.03225},
      archivePrefix={arXiv},
      primaryClass={cs.CR},
      url={https://arxiv.org/abs/2410.03225},
}
```
Note: if you need the AutoPenBench source code, visit the official repository.
- First, ensure that you have `cmake` installed on your local machine. Open a terminal and run

  ```bash
  cmake --version
  ```

  If you need to install it, open a terminal and run

  ```bash
  sudo apt update
  sudo apt install cmake
  ```
- Now create and activate a virtual environment

  ```bash
  python3 -m venv .venv
  source .venv/bin/activate
  ```
- Install the required libraries, the benchmarking framework AgentQuest and the AutoPenBench benchmark by running

  ```bash
  ./setup.sh
  ```
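  If the script is not marked as executable on your machine (whether it is depends on how you obtained the repository), set the execute bit first:

  ```bash
  # Only needed if the execute bit is not already set on setup.sh
  chmod +x setup.sh
  ./setup.sh
  ```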
- To reproduce the experiments of the paper, once the installation is complete and the virtual environment is activated, open a terminal and type

  ```bash
  python3 experiments/EXPERIMENT_FOLDER/EXPERIMENT_FILE
  ```

  where `EXPERIMENT_FOLDER` identifies the type of experiment (e.g. `00-model_selection` or `01-autonomous_agent`) and `EXPERIMENT_FILE` identifies the experiment to run (e.g. `gpt-4o-2024-08-06` for `00-model_selection` or `access_control` for `01-autonomous_agent`). A concrete invocation is sketched below.
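  For instance, assuming the experiment scripts are plain Python files named after the model or task (the exact file name and `.py` extension are an assumption here), a run could look like:

  ```bash
  # Hypothetical example: run the GPT-4o model-selection experiment.
  # Check the experiments/00-model_selection/ folder for the actual file names.
  python3 experiments/00-model_selection/gpt-4o-2024-08-06.py
  ```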
- the `experiments` folder contains the source code of the experiments of the paper
- the `logs` folder contains the raw experimental results, which are processed in this notebook
- the `genai` folder contains the well-documented source code of the generative agent cognitive architectures we designed and implemented in the paper
- the `examples` folder contains a couple of notebooks showing how to make the autonomous and assisted agents interact with the benchmark (see the sketch below for one way to open them)
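A minimal way to browse the example notebooks, assuming Jupyter is available in the active virtual environment (this is an assumption; it may need to be installed separately), is:

```bash
# Launch Jupyter from the repository root and open the notebooks in examples/
jupyter notebook examples/
```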