Official codebase for Pretrained Transformers as Universal Computation Engines. Contains a demo notebook and scripts to reproduce the experiments.
For a minimal demonstration of frozen pretrained transformers, see `demo.ipynb`. Running the notebook reproduces the Bit XOR experiment in a couple of minutes and visualizes the learned attention maps.
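The core idea behind the demo: the pretrained transformer's self-attention and feedforward weights stay frozen, and only a few small pieces (input/output projections and the layer norms) are trained on the new task. Below is a rough sketch of that setup using Hugging Face `transformers`; it is not the repo's exact code, and the choice of unfrozen parameters and the projection sizes here are simplifying assumptions.

```python
import torch.nn as nn
from transformers import GPT2Model

# Minimal sketch (not the repo's exact code): load a pretrained GPT-2,
# freeze everything except the layer norms, and attach small trainable
# input/output projections. Layer choices and dimensions are illustrative.
gpt2 = GPT2Model.from_pretrained('gpt2')
for name, param in gpt2.named_parameters():
    param.requires_grad = 'ln_' in name   # keep layer-norm params trainable

embed_dim = gpt2.config.n_embd            # 768 for base GPT-2
input_proj = nn.Linear(1, embed_dim)      # e.g. one bit per token for bit-xor
output_proj = nn.Linear(embed_dim, 1)     # per-token binary logit

def fpt_forward(bits):                    # bits: (batch, seq_len, 1) float tensor
    hidden = gpt2(inputs_embeds=input_proj(bits)).last_hidden_state
    return output_proj(hidden)
```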
No updates are currently planned, but new features may be added in the future.
Currently the repo supports the following tasks:
`['bit-memory', 'bit-xor', 'listops', 'mnist', 'cifar10', 'cifar10-gray', 'remote-homology']`
as well as the following models:
`['gpt2', 'gpt2-medium', 'gpt2-large', 'gpt2-xl', 'vit', 'lstm']`
Note that CIFAR-10 LRA is `cifar10-gray` with a patch size of 1.
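For intuition about that note: with a patch size of 1, every pixel of the 32x32 grayscale image becomes its own token, so the input sequence has 1024 tokens. Below is a rough sketch of this patch-to-token step; the `patchify` helper is hypothetical, not the repo's implementation.

```python
import torch

# Illustrative sketch (hypothetical helper, not the repo's implementation):
# split an image batch into flattened patches, one token per patch.
def patchify(images, patch_size=1):
    b, c, h, w = images.shape                      # (batch, channels, H, W)
    patches = images.unfold(2, patch_size, patch_size).unfold(3, patch_size, patch_size)
    # (b, c, H/p, W/p, p, p) -> (b, num_patches, c * p * p)
    return patches.permute(0, 2, 3, 1, 4, 5).reshape(b, -1, c * patch_size * patch_size)

gray_batch = torch.rand(8, 1, 32, 32)              # fake grayscale CIFAR-10 batch
print(patchify(gray_batch, patch_size=1).shape)    # torch.Size([8, 1024, 1])
```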
- Install Anaconda environment:
  ```
  $ conda env create -f environment.yml
  ```
- Add `universal-computation/` to your PYTHONPATH, i.e. add this line to your `~/.bashrc`:
  ```
  export PYTHONPATH=~/universal-computation:$PYTHONPATH
  ```
Datasets are stored in `data/`. MNIST and CIFAR-10 are automatically downloaded by PyTorch when an experiment starts.
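For reference, the torchvision call that triggers such a download looks roughly like the following; the exact root directory and transforms used by the repo's scripts may differ.

```python
from torchvision import datasets, transforms

# Roughly what happens under the hood: torchvision fetches MNIST/CIFAR-10
# into data/ on first use. The root and transforms here are illustrative,
# not necessarily what this repo's scripts pass.
mnist = datasets.MNIST('data/', train=True, download=True,
                       transform=transforms.ToTensor())
cifar = datasets.CIFAR10('data/', train=True, download=True,
                         transform=transforms.ToTensor())
print(len(mnist), len(cifar))   # 60000, 50000
```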
Download the files for Listops from Long Range Arena. Move the `.tsv` files into `data/listops`. There should be three files: `basic_test`, `basic_train`, and `basic_val`. The script evaluates on the validation set by default.
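If you want to sanity-check the download, a quick peek at one file could look like the sketch below; the `Source`/`Target` column names are my assumption about the LRA `.tsv` layout, so adjust if your files use different headers.

```python
import pandas as pd

# Quick sanity check of the Listops data location and format.
# Column names ('Source', 'Target') are an assumption about the LRA .tsv
# layout; adjust if your files differ.
df = pd.read_csv('data/listops/basic_val.tsv', sep='\t')
print(len(df), 'examples')
print(df.iloc[0]['Source'][:80], '->', df.iloc[0]['Target'])
```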
Install and download the files for Remote Homology from TAPE. Move the files into `data/tape`, i.e. after setup there should be a directory `data/tape/remote_homology/remote_homology_train.lmdb` (and a corresponding `valid` variant). Inside, there should be two files, `data.mdb` and `lock.mdb`. The script evaluates on the validation set by default.
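To verify the TAPE download, you can open the LMDB directly, as in the sketch below; the stringified-index key scheme and the `remote_homology_valid.lmdb` file name are assumptions about the TAPE format rather than something this repo documents.

```python
import pickle
import lmdb

# Sanity-check the TAPE data: open the LMDB and peek at one record.
# The b'0' key scheme and the 'valid' file name are assumptions about
# the TAPE format; adjust if your version differs.
env = lmdb.open('data/tape/remote_homology/remote_homology_valid.lmdb',
                readonly=True, lock=False)
with env.begin() as txn:
    print(env.stat()['entries'], 'entries')
    first = txn.get(b'0')
    if first is not None:
        print(pickle.loads(first).keys())
```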
You can run experiments with:
```
python scripts/run.py
```
Adding `-w True` will log results to Weights and Biases.
```
@article{lu2021fpt,
  title={Pretrained Transformers as Universal Computation Engines},
  author={Kevin Lu and Aditya Grover and Pieter Abbeel and Igor Mordatch},
  journal={arXiv preprint arXiv:2103.05247},
  year={2021}
}
```
License: MIT