This repo contains the code for the experiments of the paper "AutoPenBench: Benchmarking Generative Agents for Penetration Testing".

lucagioacchini/genai-pentest-paper

Benchmarking Generative Agents for Penetration Testing

This repo contains the code for reproducing the experiments of the paper AutoPenBench: Benchmarking Generative Agents for Penetration Testing.

If you use this code in your research, please cite the following paper:

@misc{gioacchini2024autopenbench,
      title={AutoPenBench: Benchmarking Generative Agents for Penetration Testing}, 
      author={Luca Gioacchini and Marco Mellia and Idilio Drago and Alexander Delsanto and Giuseppe Siracusano and Roberto Bifulco},
      year={2024},
      eprint={2410.03225},
      archivePrefix={arXiv},
      primaryClass={cs.CR},
      url={https://arxiv.org/abs/2410.03225}, 
}

Note: if you need the AutoPenBench source code, visit the official repository.

How to Reproduce the Experiments

  1. First, ensure that cmake is installed on your local machine. Open a terminal and run
     cmake --version

     If you need to install it, run

     sudo apt update
     sudo apt install cmake
  2. Create and activate a virtual environment
     python3 -m venv .venv
     source .venv/bin/activate
  3. Install the required libraries, the benchmarking framework AgentQuest and the AutoPenBench benchmark by running
     ./setup.sh
  4. To reproduce the experiments of the paper, open a terminal once the installation is complete and the virtual environment is activated, then type:
     python3 experiments/EXPERIMENT_FOLDER/EXPERIMENT_FILE

where EXPERIMENT_FOLDER identifies the type of experiment (e.g. 00-model_selection or 01-autonomous_agent) and EXPERIMENT_FILE identifies the experiment to run (e.g. gpt-4o-2024-08-06 for 00-model_selection or access_control for 01-autonomous_agent).
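
If you run several experiments in a row, the command above can be wrapped in a small helper. The following is a minimal sketch, not part of the repository: the function names build_command and run_experiment are hypothetical, and it assumes the script is launched from the repository root with the virtual environment already activated.

```python
import subprocess
from pathlib import PurePosixPath

# Assumption: the working directory is the repository root, so the
# `experiments` folder is reachable through a relative path.
EXPERIMENTS_ROOT = PurePosixPath("experiments")


def build_command(experiment_folder: str, experiment_file: str) -> list[str]:
    """Return the argument list for one experiment.

    Mirrors `python3 experiments/EXPERIMENT_FOLDER/EXPERIMENT_FILE`.
    Both arguments are passed through unchanged; nothing is appended
    to the file name.
    """
    target = EXPERIMENTS_ROOT / experiment_folder / experiment_file
    return ["python3", str(target)]


def run_experiment(experiment_folder: str, experiment_file: str) -> int:
    """Run one experiment (hypothetical helper) and return its exit code."""
    command = build_command(experiment_folder, experiment_file)
    return subprocess.run(command).returncode
```

For example, build_command("01-autonomous_agent", "access_control") yields the same command you would otherwise type by hand for that experiment, and run_experiment executes it and reports the exit code.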

Structure of This Repo

  • the experiments folder contains the source code of the experiments of the paper
  • the logs folder contains the raw experimental results, which are processed in this notebook
  • the genai folder contains the well-documented source code of the generative agents' cognitive architectures we designed and implemented in the paper
  • the examples folder contains a couple of notebooks showing how to make the autonomous and assisted agents interact with the benchmark