Benchmarking Generative Agents for Penetration Testing

This repo contains the code for reproducing the experiments of the paper AutoPenBench: Benchmarking Generative Agents for Penetration Testing.

If you use this code in your research, please cite the following paper:

@misc{gioacchini2024autopenbench,
      title={AutoPenBench: Benchmarking Generative Agents for Penetration Testing}, 
      author={Luca Gioacchini and Marco Mellia and Idilio Drago and Alexander Delsanto and Giuseppe Siracusano and Roberto Bifulco},
      year={2024},
      eprint={2410.03225},
      archivePrefix={arXiv},
      primaryClass={cs.CR},
      url={https://arxiv.org/abs/2410.03225}, 
}

Note: if you need the AutoPenBench source code, visit the official repository.

How to Reproduce the Experiments

First, ensure that cmake is installed on your local machine. Open a terminal and run:

cmake --version

If you need to install it (the commands below assume a Debian/Ubuntu system with apt), run:

sudo apt update
sudo apt install cmake

Now create and activate a virtual environment:

python3 -m venv .venv
source .venv/bin/activate
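
Before running the setup script, it can help to confirm that both prerequisites above are in place. The following is a minimal sketch, not part of this repo: the helper names `in_virtualenv` and `preflight` are hypothetical, and the check only looks for `cmake` on the PATH and for an active virtual environment.

```python
import shutil
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a virtual environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def preflight(tools=("cmake",)):
    """Collect human-readable problems; an empty list means all checks passed."""
    problems = []
    for tool in tools:
        # shutil.which returns None when the executable is not on the PATH.
        if shutil.which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    if not in_virtualenv():
        problems.append("no virtual environment is active")
    return problems


if __name__ == "__main__":
    issues = preflight()
    for issue in issues:
        print("WARNING:", issue)
    if not issues:
        print("Environment looks ready for ./setup.sh")
```

Run it with the same `python3` you intend to use for the experiments, so that the virtual environment check reflects the interpreter that will actually be used.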

Install the required libraries, the benchmarking framework AgentQuest, and the AutoPenBench benchmark by running:

./setup.sh

To reproduce the experiments of the paper, make sure the installation has completed and the virtual environment is active, then open a terminal and run:

python3 experiments/EXPERIMENT_FOLDER/EXPERIMENT_FILE

where EXPERIMENT_FOLDER identifies the type of experiment (e.g. 00-model_selection or 01-autonomous_agent) and EXPERIMENT_FILE identifies the experiment to run (e.g. gpt-4o-2024-08-06 for 00-model_selection or access_control for 01-autonomous_agent).
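
To see which EXPERIMENT_FOLDER and EXPERIMENT_FILE combinations are available, the folder can be listed programmatically. The sketch below is illustrative only and assumes the layout described above (one sub-folder per experiment type under `experiments`, each holding the experiment files); `list_experiments` and `build_command` are hypothetical helper names, not functions shipped with this repo.

```python
import sys
from pathlib import Path


def list_experiments(root="experiments"):
    """Map each experiment folder name to the sorted file names inside it."""
    root = Path(root)
    found = {}
    if not root.is_dir():
        # Nothing to list, e.g. when run from outside the repo root.
        return found
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        found[folder.name] = sorted(
            f.name for f in folder.iterdir() if f.is_file()
        )
    return found


def build_command(folder, file, root="experiments"):
    """Return an argv list equivalent to the command shown above."""
    # sys.executable is the interpreter running this script, so an active
    # virtual environment is respected without hard-coding "python3".
    return [sys.executable, str(Path(root) / folder / file)]


if __name__ == "__main__":
    # Print one ready-to-run command line per experiment file found.
    for folder, files in list_experiments().items():
        for name in files:
            print(" ".join(build_command(folder, name)))
```

Run from the repo root, this prints one command per experiment file. The returned argv list can also be passed to `subprocess.run` to launch a single experiment.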

Structure of This Repo

The experiments folder contains the source code of the experiments of the paper.
The logs folder contains the raw experimental results, which are processed in this notebook.
The genai folder contains the well-documented source code of the cognitive architectures of the generative agents we designed and implemented in the paper.
The examples folder contains a couple of notebooks showing how to make the autonomous and assisted agents interact with the benchmark.
