In this pipeline we perform radio interferometric data processing, carrying out all the phases: rebinning, calibration and imaging. The computation runs on the serverless framework Lithops.
To execute this notebook you need:
- An AWS Account.
- Lithops set up to work with AWS Lambda.
- Lithops version: 2.9.0
- Python 3.8
- Clone this GitHub repository and install the requirements in requirements.txt:
  $ git clone https://github.com/iAmJK44/serverless_benchmarks.git
  $ pip install -r requirements.txt
- Download the data and extract it in a directory similar to /home/user/Downloads/entire_ms/SB205.MS/. Change the user name in the path on this line in the main section of partition.py:
  p = Partitioner("/home/user/Downloads/entire_ms/SB205.MS/")
  If you downloaded a different data set instead of SB205.MS, change the name too.
- Set up Lithops for the AWS backend.
- Build the runtime in the docker directory:
  $ lithops runtime build -f Dockerfile serverless-extract:1
- Configure Lithops to use the built runtime (e.g. serverless-extract:1).
- Create an S3 bucket named serverless-genomics to upload the data.
- Run partition.py, located in the partition directory. This creates the .ms files, divided into 70 partitions by default, and uploads them to the S3 bucket:
  $ cd ./partition/
  $ python3 partition.py
- Run the pipeline.py file. This file performs all the phases of the pipeline (rebinning, calibration, imaging):
  $ python3 pipeline.py
  More information on how it works in this link.
- The results obtained should look similar to the images in /stats/stats/.
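The Lithops-related steps above (AWS backend, built runtime, S3 bucket) can be collected in one configuration. The following is a minimal sketch, not the configuration shipped with this repository: the key names follow the layout of the Lithops 2.x configuration documentation as I understand it and should be checked against the version you install, and every value in angle brackets is a placeholder you must fill in yourself.

```python
import json

# Values taken from the steps above.
RUNTIME_NAME = "serverless-extract:1"  # runtime built in the docker directory
BUCKET_NAME = "serverless-genomics"    # S3 bucket that receives the data

# Assumed Lithops 2.x key layout; verify against the docs for your version.
config = {
    "lithops": {"backend": "aws_lambda", "storage": "aws_s3"},
    "aws": {
        "access_key_id": "<ACCESS_KEY_ID>",          # placeholder
        "secret_access_key": "<SECRET_ACCESS_KEY>",  # placeholder
    },
    "aws_lambda": {
        "execution_role": "<IAM_ROLE_ARN>",  # placeholder
        "region_name": "<REGION>",           # placeholder
        "runtime": RUNTIME_NAME,
    },
    "aws_s3": {
        "storage_bucket": BUCKET_NAME,
        "region_name": "<REGION>",  # placeholder
    },
}

# Print the structure so it can be compared with your Lithops config file.
print(json.dumps(config, indent=2))
```

The same structure can be written as YAML in the Lithops configuration file, or passed as a dictionary through the `config` argument of `lithops.FunctionExecutor`.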
NOTE: You can change the name of the S3 bucket and the number of partitions by editing the pipeline.py and partition.py files.
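To give an idea of what changing the number of partitions means, the sketch below splits a range of rows into near-equal chunks. It is only an illustration: `split_ranges` is a hypothetical helper, not the Partitioner class from partition.py, and the row count of 1000 is an arbitrary example value.

```python
def split_ranges(total_rows, num_partitions=70):
    """Return (start, stop) row ranges that cover total_rows in near-equal chunks."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    # Every chunk gets `base` rows; the first `extra` chunks get one more.
    base, extra = divmod(total_rows, num_partitions)
    ranges = []
    start = 0
    for i in range(num_partitions):
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


# Hypothetical example: 1000 rows split into the default 70 partitions.
ranges = split_ranges(1000)
print(len(ranges), ranges[0], ranges[-1])
```

With more partitions each serverless function receives a smaller chunk, at the cost of launching more functions.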