This is the code release for the paper "N-Agent Ad Hoc Teamwork," published at NeurIPS 2024. If you find the code or paper useful for your work, please cite:
```bibtex
@inproceedings{wang2024naht,
  title={N-Agent Ad Hoc Teamwork},
  author={Wang, Caroline and Rahman, Arrasy and Durugkar, Ishan and Liebman, Elad and Stone, Peter},
  booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
  year={2024}
}
```
This codebase was built on top of the EPyMARL codebase.
The following additions and modifications were made by us:
- Custom implementation of IPPO, MAPPO, and POAM
- Support for training/evaluating MARL/AHT algorithms in an open environment (i.e. with uncontrolled teammates).
- Minor modifications to agent architectures to enable fair comparisons between the newly added methods and existing methods
- Minor modifications to the order in which config files are loaded (values in the alg configs override default values)
This section covers installation instructions, configuring repo-wide user variables, and downloading uncontrolled agent policies.
- We recommend creating a conda environment. As of Nov. 2024, these installation instructions were verified with Python 3.10 and PyTorch 2.5. Note that `torch_scatter` is not required to reproduce the experiments in our paper, but is necessary to import the PAC method (inherited from EPyMARL).

```bash
conda create -n <my_env> python=3.10
conda install pytorch pytorch-cuda=12.1 -c pytorch -c nvidia
pip install torch-scatter -f https://data.pyg.org/whl/torch-2.5.0+cu121.html
```
- Install environment and repository package requirements via

```bash
pip install -r requirements.txt
pip install -r env_requirements.txt
```

The `env_requirements.txt` file will install the following environments, used in our experiments:
- SMAC
- Our fork of MPE
- Our fork of Matrix Games
- Install StarCraft II: please see the SMAC codebase for instructions to install the StarCraft II game.
Configuring results directories: modify `src/config/user_info.yaml` with your preferred base path for results, and your preferred directory where uncontrolled agent policies will be stored. By default, results will be written to the folder `./naht_results`, and uncontrolled agents should be stored at `./unctrl_agents`.
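For concreteness, here is a minimal sketch of what `src/config/user_info.yaml` might contain. The `base_unctrl_agents` key is referenced later in this README; the results key name (`base_results_path` below) is an assumption, so check it against the shipped config.

```yaml
# Hypothetical sketch of src/config/user_info.yaml.
# base_unctrl_agents is the key referenced elsewhere in this README;
# base_results_path is an assumed name -- verify against the shipped config.
base_results_path: ./naht_results    # where experiment results are written
base_unctrl_agents: ./unctrl_agents  # where uncontrolled agent policies live
```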
To run the NAHT experiments, you will need uncontrolled agent policies. You can either generate them using this codebase (see Instructions to Run CMARL Experiments) or you can download the agents used in the paper from this Google Drive link.
An example bash script showing how to train MARL agents on the `5v6` domain may be found in the top-level folder of this codebase, called `train_agent.sh`.
To run a particular algorithm on a particular task, the following changes to the listed files are necessary (a hypothetical invocation is sketched after this list):

- `train_agent.sh`:
  - Change the `--env-config` to one of the supported environment configs under `src/config/envs/`.
  - Change the `--config` to the default config corresponding to the selected environment and task. Supported env/task combinations and corresponding hyperparameters may be found under `src/config/default`.
  - Change the `--alg-config` argument to the name of the appropriate algorithm config file, found under `config/algs/<env_name>`. The hyperparameters used for the paper are the values contained within these configs.
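For illustration only, the edited flags might look like the following; the entry-point script and the concrete config names are assumptions, so copy the actual command from `train_agent.sh`.

```bash
# Hypothetical example -- flag names follow the list above; the entry point
# (src/main.py) and the concrete config values are placeholder assumptions.
python src/main.py --env-config=sc2 --config=sc2_5v6 --alg-config=mappo
```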
An example bash script corresponding to running NAHT experiments on the `5v6` domain may be found in the top-level folder of this codebase, named `train_naht.sh`.
To run NAHT experiments, a set of teammates to train with must be provided. The teammates should be policy checkpoints that were generated by this codebase. At training time, the information needed to reload the policies is automatically read out of the configs stored by Sacred. Note that the teammate policies are assumed to be stored relative to the `base_unctrl_agents` directory specified in `src/config/user_info.yaml`. See "Getting Started" for links to download the agents used in this paper.
To run a particular algorithm on a particular task, the following changes to the listed files are necessary (a sketch of the relevant config fields follows this list):

- `train_naht.sh`:
  - Change the `--env-config` to one of the supported environment configs under `src/config/envs/`.
  - Change the `--config` to the config corresponding to the selected environment and task. Supported env/task combinations and corresponding hyperparameters for the NAHT experiments may be found under `src/config/open`.
  - Change the `--alg-config` argument to the name of the appropriate algorithm config file, found under `config/algs/<env_name>`. The hyperparameters used for the paper are the values contained within these configs.
- `src/config/open/open_algs_<task>.yaml`:
  - Check that the checkpoint paths for each desired teammate type are correct (under `uncontrolled_agents`). Paths should be relative to the `base_unctrl_agents` directory specified in `src/config/user_info.yaml`.
  - To run an NAHT experiment where the number of teammates is sampled, set `n_uncontrolled: null`. To run an AHT experiment where only a single agent is sampled, set `n_uncontrolled: <max_agents> - 1`.
  - If running POAM, set `agent_loader` to `poam_train_agent_loader`; otherwise, set `agent_loader` to `rnn_train_agent_loader`.
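Below is a minimal sketch of the fields discussed above, assuming the key names given in this README; the checkpoint paths are placeholders, so verify everything against the shipped `open_algs_*.yaml` configs.

```yaml
# Hypothetical sketch of the relevant fields in open_algs_<task>.yaml.
# Key names are taken from the instructions above; paths/values are placeholders.
agent_loader: poam_train_agent_loader  # use rnn_train_agent_loader for non-POAM methods
n_uncontrolled: null                   # null => NAHT (number of teammates is sampled);
                                       # <max_agents> - 1 => AHT (single controlled agent)
uncontrolled_agents:                   # paths relative to base_unctrl_agents
  - qmix/seed_0
  - vdn/seed_0
```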
Within the `src/config/open` directory, there are configs with the formats `open_train_*.yaml` and `open_algs_*.yaml`. The first set of configs generated the results in the main paper, and corresponds to a training teammate set consisting of a single seed of each type of training teammate. The second set of configs corresponds to the results presented in Section A.5.3 of the Appendix (Generalization to Unseen Teammate Types), where POAM/IPPO are trained on MAPPO, QMIX, and IQL, and tested on IPPO and VDN.
The evaluation code may be run from `src/nk_evaluation.py`. The current evaluation is parallelized using a Condor cluster; if a Condor cluster is not available, the global variable `USE_CONDOR` may be set to `False`. The task that the evaluation should be run on must be specified under the `if __name__ == "__main__":` block. The OOD evaluations may be run using the `target_set_eval()` function.
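Assuming no Condor cluster is available, a local run might look like this; the edit locations are as described above.

```bash
# Hypothetical local run: first set USE_CONDOR = False and select the task
# inside the if __name__ == "__main__" block of src/nk_evaluation.py.
python src/nk_evaluation.py
```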
All code to generate plots in the paper may be found at `src/notebooks/paper_results.ipynb`.
Please contact the authors to get access to the data used to generate the results.