Goldphish is an arbitrage analyzer for the ethereum blockchain.
The entrypoints into this project are largely broken into two parts:
- Historical scrape of performed arbitrages -- found in
backtest/gather_samples
- Arbitrage seeking through history -- found in
backtest/top_of_block
Models for the exchanges are in pricers/
.
The transaction relayer smart contract is in contracts/
.
The optimization procedure is in find_circuit/find.py
.
This system is dockerized.
To build, run docker build -t goldphish .
This system requires access to a postgresql database and a go-ethereum (geth) archive node with a websocket JSON-rpc server. At the time of writing, an archive node consumes around 12 terabytes of space -- be warned!
We run postgres dockerized. For convenience, we also run it on a separate docker network (so we can get dns-resolution).
To create the network:
docker network create ethereum-measurement-net
Pull the docker image,
docker pull postgres:14
Then spawn postgres. NOTE: we are running with a weak password and open port (BAD!!!)
Replace YOUR_DATA_DIR_HERE
with the path to a directory that has at least 2T storage space, this is where your postgresql database files will live.
docker run \
-d \
--name ethereum-measurement-pg \
-e POSTGRES_PASSWORD=password \
-e POSTGRES_USER=measure \
-e POSTGRES_DB=eth_measure_db \
-p 0.0.0.0:5410:5432 \
-v YOUR_DATA_DIR_HERE:/var/lib/postgresql/data \
--network ethereum-measurement-net \
postgres:14
Let's configure it to allow more connections. Edit YOUR_DATA_DIR_HERE/postgresql.conf
and change:
- max_connections = 100
+ max_connections = 1000
...
- shared_buffers = 128MB
+ shared_buffers = 512MB
And then restart the container:
docker container restart ethereum-measurement-pg
We need a directory to persist some information (and logs), which we will call STORAGE_DIR
. It should have this structure:
.
logs/
tmp/
We parallelize work by chunks of blocks about 1 day long. Generate this table:
docker run \
--rm \
--network ethereum-measurement-net \
-v $STORAGE_DIR:/mnt/goldphish \
-e WEB3_HOST=ws://$GETH_NODE \
goldphish \
python3 -m backtest.top_of_block \
generate-sample
You should see logged generated sample
.
You will need to scrape the list of Uniswap, Sushiswap, Shibaswap, and Balancer exchanges.
Get started by setting up the database:
docker run \
--rm \
--network ethereum-measurement-net \
-v $STORAGE_DIR:/mnt/goldphish \
goldphish \
python3 -m backtest.gather_samples \
--setup-db
And then also here:
docker run \
--rm \
--network ethereum-measurement-net \
-v $STORAGE_DIR:/mnt/goldphish \
goldphish \
python3 -m backtest.gather_samples.fill_known_exchanges \
--setup-db
And finally, run the scrape:
docker run \
--rm -t \
--network ethereum-measurement-net \
-v $STORAGE_DIR:/mnt/goldphish \
-e WEB3_HOST=ws://$GETH_NODE \
goldphish \
python3 -m backtest.gather_samples.fill_known_exchanges
You should see an ETA printed.
This scrapes the USD price of ETH using either the the Chainlink oracle, or if an early block, the MakerDAO price oracle.
First, setup the db:
docker run \
--rm -t \
--network ethereum-measurement-net \
-v $STORAGE_DIR:/mnt/goldphish \
-e WEB3_HOST=ws://$GETH_NODE \
goldphish \
python3 -m backtest.gather_samples.fill_eth_price \
--setup-db
Then, run the scrape. Here we use N_WORKERS to represent the number of worker-processes you would like to use. We set it to 50.
docker run \
--rm -t \
--network ethereum-measurement-net \
-v $STORAGE_DIR:/mnt/goldphish \
-e WEB3_HOST=ws://$GETH_NODE \
goldphish \
./spawn_many_fill_eth_price.sh \
$N_WORKERS
Finalize the work (abt 1min):
docker run \
--rm -t \
--network ethereum-measurement-net \
-v $STORAGE_DIR:/mnt/goldphish \
-e WEB3_HOST=ws://$GETH_NODE \
goldphish \
python3 -m backtest.gather_samples.fill_eth_price \
--finalize
This takes a while! We chose 50 workers. Be sure that you increased your postgresql connections.
docker run \
--rm -t \
--network ethereum-measurement-net \
-v $STORAGE_DIR:/mnt/goldphish \
-e WEB3_HOST=ws://$GETH_NODE \
goldphish \
./spawn_many_gather_samples.sh \
$N_WORKERS
You can watch the ETA here, in another bash session. This takes about 1-2 days.
If you would like to perform this on multiple machines, that is okay!
Get the docker container on the second (third, fourth...) machine and launch it similar to the command below.
Except, you'll want to set the environment variables PSQL_HOST and PSQL_PORT appropriately
(ie, your database machine's ip and port). Set these with -e PSQL_HOST=xxx.xxx.xxx.xxx
etc.
Scrape work is processed in random order, so the ETA should be somewhat reliable.
docker run \
--rm -t \
--network ethereum-measurement-net \
-v $STORAGE_DIR:/mnt/goldphish \
goldphish \
python3 tmp_eta_gather.py
This scrapes flashbots transaction from their server.
docker run \
--rm -t \
--network ethereum-measurement-net \
-v $STORAGE_DIR:/mnt/goldphish \
-e WEB3_HOST=ws://$GETH_NODE \
goldphish \
python3 -m backtest.gather_samples.load_flashbots
We scrape linearly in time. Since activity is greater as time goes on, expect the ETA to grow as blocks per second slows. This is a common issue with linear scans.
It should take about 5 hours. This could probably be made faster by varying the http request batch sizes a bit smarter.
We need to figure out which exchanges were, in fact, 0x exchanges. This needs to be run twice, first for v3, then for v4.
First, v3 (should take about 30min - 1 hour):
docker run \
--rm -t \
--network ethereum-measurement-net \
-v $STORAGE_DIR:/mnt/goldphish \
-e WEB3_HOST=ws://$GETH_NODE \
goldphish \
python3 -m backtest.gather_samples.fill_zerox --v3
Then, v4 (NOTE: This is parallelized, so a bit faster. we picked 50 workers.):
docker run \
--rm -t \
--network ethereum-measurement-net \
-v $STORAGE_DIR:/mnt/goldphish \
-e WEB3_HOST=ws://$GETH_NODE \
goldphish \
./spawn_many_fill_zerox.sh \
$N_WORKERS
NOTE: This will take a LONG time!
docker run \
--rm -t \
--network ethereum-measurement-net \
-v $STORAGE_DIR:/mnt/goldphish \
-e WEB3_HOST=ws://$GETH_NODE \
goldphish \
./spawn_fill_coinbase_xfer.sh \
$N_WORKERS
Do transaction re-ordering to determine who was backrunning. First, setup db and fill the work queue:
docker run \
--rm -t \
--network ethereum-measurement-net \
-v $STORAGE_DIR:/mnt/goldphish \
-e WEB3_HOST=ws://$GETH_NODE \
goldphish \
python3 -m backtest.gather_samples.fill_backrunners --setup-db
Then, do the reordering. This is parallelized -- will take a while! You might want to spawn more workers on more machines. If you do that, be sure to specify the postgresql host/port as an environment variable (see section "Scrape Arbitrages").
docker run \
--rm -t \
--network ethereum-measurement-net \
-v $STORAGE_DIR:/mnt/goldphish \
-e WEB3_HOST=ws://$GETH_NODE \
goldphish \
./spawn_many_backrunner_detect.sh \
$N_WORKERS
You can watch the ETA here:
docker run \
--rm -t \
--network ethereum-measurement-net \
-v $STORAGE_DIR:/mnt/goldphish \
goldphish \
python3 tmp_eta_fill_backrunner.py
Setup some tables:
docker run \
--rm -t \
--network ethereum-measurement-net \
-v $STORAGE_DIR:/mnt/goldphish \
-e WEB3_HOST=ws://$GETH_NODE \
goldphish \
python3 -m backtest.gather_samples.fill_odd_token_xfers --setup-db
docker run \
--rm -t \
--network ethereum-measurement-net \
-v $STORAGE_DIR:/mnt/goldphish \
-e WEB3_HOST=ws://$GETH_NODE \
goldphish \
python3 -m backtest.gather_samples.fill_odd_token_xfers --setup-db
Fill odd token transfer table (mistaken NFTs)
docker run \
--rm -t \
--network ethereum-measurement-net \
-v $STORAGE_DIR:/mnt/goldphish \
-e WEB3_HOST=ws://$GETH_NODE \
goldphish \
./spawn_many_fill_odd_tokens.sh \
$N_WORKERS
docker run \
--rm -t \
--network ethereum-measurement-net \
-v $STORAGE_DIR:/mnt/goldphish \
-e WEB3_HOST=ws://$GETH_NODE \
goldphish \
python3 -m backtest.gather_samples.fill_false_positive
First, set-up the database
docker run \
--rm -t \
--network ethereum-measurement-net \
-v $STORAGE_DIR:/mnt/goldphish \
-e WEB3_HOST=ws://$GETH_NODE \
goldphish \
python3 -m backtest.gather_samples.fill_naive_gas_price --setup-db
Then, run the job (NOTE: this is parallelized).
docker run \
--rm -t \
--network ethereum-measurement-net \
-v $STORAGE_DIR:/mnt/goldphish \
-e WEB3_HOST=ws://$GETH_NODE \
goldphish \
./spawn_many_gas_price_fill.sh \
$N_WORKERS
First, set-up the database
docker run \
--rm -t \
--network ethereum-measurement-net \
-v $STORAGE_DIR:/mnt/goldphish \
-e WEB3_HOST=ws://$GETH_NODE \
goldphish \
python3 -m backtest.gather_samples.fill_arb_sandwich --setup-db
Run the scrape:
docker run \
--rm -t \
--network ethereum-measurement-net \
-v $STORAGE_DIR:/mnt/goldphish \
-e WEB3_HOST=ws://$GETH_NODE \
goldphish \
./spawn_fill_arb_sandwich.sh \
$N_WORKERS
You can view the ETA here:
docker run \
--rm -t \
--network ethereum-measurement-net \
-v $STORAGE_DIR:/mnt/goldphish \
goldphish \
python3 tmp_eta_fill_arb_sandwich.py
THIS TAKES A LONG TIME!
Run the historical arbitrage opportunity search algorithm. You will likely need to have several hundred workers across several machines. We find that each worker will peg a CPU core -- ie, this is CPU-bound, so adding many more workers than CPU cores is not recommended.
About 500 workers should finish the job in about under 1 month.
Set up the job:
docker run \
--rm -t \
--network ethereum-measurement-net \
-v $STORAGE_DIR:/mnt/goldphish \
-e WEB3_HOST=ws://$GETH_NODE \
goldphish \
python3 -m backtest.top_of_block seek-candidates --setup-db
And run the job. Here we show explicitly how to set the postgresql host IP and port number. This is not necessary if you are on the same docker network as the database (then defaults work fine). One can request cancellation of work by sending SIGHUP to the worker python processes, each worker should cleanly stop its current work unit and break off any remaining work into a new unit.
docker run \
--rm -t \
--network ethereum-measurement-net \
-v $STORAGE_DIR:/mnt/goldphish \
-e WEB3_HOST=ws://$GETH_NODE \
-e PSQL_PORT=5041 \
-e PSQL_HOST=XXX.XXX.XXX.XXX \
goldphish \
./spawn_many_searchers.sh \
$N_WORKERS
You can view the ETA here. Note that some warm-up time is required before the ETA computation works. During the start, the ETA will slowly rise, as the easier (early-history) blocks are processed. ETA will also under-estimate time remaining toward the end of the run, as the parallelism decreases because no work-units remain.
docker run \
--rm -t \
--network ethereum-measurement-net \
-v $STORAGE_DIR:/mnt/goldphish \
goldphish \
python3 tmp_eta.py
Set up database:
docker run \
--rm -t \
--network ethereum-measurement-net \
-v $STORAGE_DIR:/mnt/goldphish \
-e WEB3_HOST=ws://$GETH_NODE \
goldphish \
python3 -m backtest.top_of_block fill-top-arbs --setup-db
Run the fill:
docker run \
--rm -t \
--network ethereum-measurement-net \
-v $STORAGE_DIR:/mnt/goldphish \
-e WEB3_HOST=ws://$GETH_NODE \
goldphish \
./spawn_many_fill_top_arbitrages.sh \
$N_WORKERS
This executes the candidate arbitrages, to check profitability. Takes quite a while!
This has two modes: 'all' and 'top arbitrages'. The 'all' mode runs arbitrages in order of decreasing priority. Priority was determined when generating block samples (a few steps back), and is a shuffle of the blockchain broken into day-long segments. This facilitates random sampling, presuming (and I do) that you do not have time to execute the entire thing. 'Top arbitrages' mode will relay all of the large arbitrages.
Setup DB
docker run \
--rm -t \
--network ethereum-measurement-net \
-v $STORAGE_DIR:/mnt/goldphish \
-e WEB3_HOST=ws://$GETH_NODE \
goldphish \
python3 -m backtest.top_of_block do-relay --setup-db
Run 'all'
docker run \
--rm -t \
--network ethereum-measurement-net \
-v $STORAGE_DIR:/mnt/goldphish \
-e WEB3_HOST=ws://$GETH_NODE \
goldphish \
./spawn_many_relayers.sh \
$N_WORKERS
Watch the ETA at:
docker run \
--rm -t \
--network ethereum-measurement-net \
-v $STORAGE_DIR:/mnt/goldphish \
goldphish \
python3 tmp_eta_relay.py
Run 'top arbitrages'. First, we need to fill this record of exchange modification history.
To do that, first set-up the db:
docker run \
--rm -t \
--network ethereum-measurement-net \
-v $STORAGE_DIR:/mnt/goldphish \
-e WEB3_HOST=ws://$GETH_NODE \
goldphish \
python3 -m backtest.top_of_block do-arb-duration --setup-db
Then do the run:
docker run \
--rm -t \
--network ethereum-measurement-net \
-v $STORAGE_DIR:/mnt/goldphish \
-e WEB3_HOST=ws://$GETH_NODE \
goldphish \
./spawn_many_fill_modifications.sh \
$N_WORKERS
docker run \
--rm -t \
--network ethereum-measurement-net \
-v $STORAGE_DIR:/mnt/goldphish \
-e WEB3_HOST=ws://$GETH_NODE \
goldphish \
./spawn_many_relay_top_arbs.sh \
$N_WORKERS
Watch the ETA at:
docker run \
--rm -t \
--network ethereum-measurement-net \
-v $STORAGE_DIR:/mnt/goldphish \
goldphish \
python3 tmp_eta_relay_top_arbs.py