Badapple2

Bioassay data associative promiscuity pattern learning engine V2.

badapple_classic

If you want to use or recreate the classic version of badapple, follow the instructions here.

badapple2

NOTE: This section is in progress; the steps below are not final or complete.

The steps below outline how one can generate the Badapple2 DB on their own system.

Make sure to inspect all bash scripts and modify variable definitions (mostly file paths) as needed before running them. When running bash scripts, make sure your conda environment is active (conda activate badapple2).
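One way to speed up that inspection is to list the simple variable assignments in a script before running it. The sketch below is only an illustration: the helper `list_assignments` and the example script text (including the `WORKDIR` and `AID_FILE` names) are hypothetical and not taken from the repository, and the regex only catches plain `NAME=value` lines.

```python
import re

# Matches simple bash assignments such as `workdir="/data"` or `export N=4`.
ASSIGNMENT = re.compile(r"^\s*(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)=(.*)$")


def list_assignments(script_text: str) -> list[tuple[str, str]]:
    """Return (name, raw value) pairs for simple variable assignments."""
    found = []
    for line in script_text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip comment lines (and the shebang)
        match = ASSIGNMENT.match(line)
        if match:
            found.append((match.group(1), match.group(2).strip()))
    return found


# Hypothetical script contents, for illustration only.
example = '''#!/bin/bash
# paths to edit before running
WORKDIR="/path/to/workdir"
export AID_FILE="$WORKDIR/aids.txt"
echo "starting"
'''

for name, value in list_assignments(example):
    print(f"{name} = {value}")
```

To review a real script, pass its contents instead, for example `list_assignments(open("sh_scripts/mirror_pubchem.sh").read())`. This does not replace reading the script; it only points at the lines most likely to need editing.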

(1) Setup

System Requirements

Code is expected to work on Linux systems. Thus far all code has been tested on the following OS:

Distributor ID:	Linuxmint
Description:	Linux Mint 21.2
Release:	21.2
Codename:	victoria

Python Setup

1. Set up conda (see the Miniconda Site for more info)
   - (Optional) I'd recommend using the libmamba solver for faster install times, see here
2. Install the Badapple2 environment: conda env create -f environment.yml
   - This will create a new conda env named badapple2. To use a different name, change the first line of environment.yml before running the command above.

PostgreSQL Setup

The steps below are common to installation of the badapple, badapple_classic, and badapple2 databases (DBs).

Install PostgreSQL with the RDKit cartridge (requires sudo): sudo apt install postgresql-14-rdkit
(Option 1) Make your user a superuser prior to DB setup:
1. Switch to postgres user: sudo -i -u postgres
2. Make yourself a superuser: psql -c "CREATE ROLE <username> WITH LOGIN SUPERUSER PASSWORD '<password>'"
   - LOGIN is included because CREATE ROLE defaults to NOLOGIN, and superuser status does not bypass the right to log in.

(Option 2) If you don't want to make <username> a superuser, follow the steps below:

When running DB setup commands, prepend sudo -u postgres to them. For example, instead of createdb <DB_NAME>, use sudo -u postgres createdb <DB_NAME>.
After setting up the DB as postgres, you can grant <username> read access to the DB like so:

sudo -i -u postgres
psql -d <DB_NAME> -c "CREATE ROLE <username> WITH LOGIN PASSWORD '<password>'"
psql -d <DB_NAME> -c "GRANT SELECT ON ALL TABLES IN SCHEMA public TO <username>"
psql -d <DB_NAME> -c "GRANT SELECT ON ALL SEQUENCES IN SCHEMA public TO <username>"
psql -d <DB_NAME> -c "GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA public TO <username>"
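If you need to repeat the grants for several users, the three GRANT statements above can be generated rather than retyped. This is a minimal sketch, not part of the repository: `grant_commands` is a hypothetical helper, `alice` is a placeholder role name, and the code only builds the commands without running them. The CREATE ROLE step is left out on purpose so that no password ends up in a script. Note that `GRANT ... ON ALL TABLES` only covers objects that exist when it runs, so it should be issued after the DB has been loaded.

```python
import re
import shlex

# Conservative check so the name can be placed in SQL without quoting.
SAFE_NAME = re.compile(r"[a-z_][a-z0-9_]*")

READ_ONLY_GRANTS = (
    "GRANT SELECT ON ALL TABLES IN SCHEMA public TO {user}",
    "GRANT SELECT ON ALL SEQUENCES IN SCHEMA public TO {user}",
    "GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA public TO {user}",
)


def grant_commands(db_name: str, username: str) -> list[list[str]]:
    """Build (but do not run) the psql calls that grant read-only access."""
    if not SAFE_NAME.fullmatch(username):
        raise ValueError(f"unexpected role name: {username!r}")
    return [
        ["sudo", "-u", "postgres", "psql", "-d", db_name, "-c",
         sql.format(user=username)]
        for sql in READ_ONLY_GRANTS
    ]


# Print the commands so they can be reviewed before being run by hand.
for argv in grant_commands("badapple2", "alice"):
    print(shlex.join(argv))
```

Printing the commands first, instead of executing them directly, keeps the sudo step under manual control.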

(2) Preliminary

Additionally, before getting started, make sure you have the following files:

AID file: Text file listing all PubChem AIDs to be included in the DB.
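The layout of the AID file is not spelled out here. The sketch below assumes the simplest one, a single integer AID per line, and checks a file against that assumption before it is fed to the later steps. The helper name `parse_aids` and the example values are hypothetical.

```python
def parse_aids(lines):
    """Parse AIDs from an iterable of text lines (assumed: one integer per line)."""
    aids = []
    seen = set()
    for line_number, raw in enumerate(lines, start=1):
        text = raw.strip()
        if not text:
            continue  # ignore blank lines
        if not (text.isascii() and text.isdigit()):
            raise ValueError(
                f"line {line_number}: expected an integer AID, got {text!r}"
            )
        aid = int(text)
        if aid not in seen:  # drop duplicates, keep first-seen order
            seen.add(aid)
            aids.append(aid)
    return aids


# In-memory example with arbitrary values; for a real file use
# parse_aids(open(path, encoding="utf-8")).
print(parse_aids(["1030\n", "\n", "2551\n", "1030\n"]))
```

If your AID file uses a different layout (for example a header row or extra columns), the check would need to be adapted accordingly.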

(3) Input Data

The steps below outline how to mirror PubChem data to your system (much faster and more reliable than using the PUG-REST API) and how to generate the 5 input TSVs we'll use in part (4). I recommend saving all 5 of these TSVs to the same directory.

1. Run bash sh_scripts/mirror_pubchem.sh
   - This will mirror PubChem Bioassay data on your system (~11 GB of space required).
   - Files will be saved to {workdir}/bioassay.
2. Run bash sh_scripts/python/run_pubchem_assays_local.sh. This will generate 3 files:
   - o_compound: TSV file with compound CIDs and isomeric SMILES.
   - o_sid2cid: TSV file mapping compound ID (CID) <=> substance ID (SID).
   - o_assaystats: TSV file with assay ID (AID), substance ID (SID), and activity outcome.
3. Run bash sh_scripts/python/run_generate_scaffolds.sh. This will generate 3 output files:
   - o_mol: TSV file with compound canonical SMILES and their CIDs.
   - o_scaf: TSV file with all scaffolds and their IDs.
   - o_mol2scaf: TSV file mapping compound CID to scaffold ID(s).
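Before loading the generated TSVs into the DB, it can be worth confirming that each one is non-empty and has a consistent number of columns. The following is a rough sketch under stated assumptions: `summarize_tsv` is a hypothetical helper, the sample rows are illustrative only, and no assumption is made about header rows or column names, since those are not described here.

```python
import csv
import io


def summarize_tsv(handle):
    """Return (row count, sorted list of distinct column counts) for a TSV stream."""
    rows = 0
    widths = set()
    # QUOTE_NONE: treat quote characters as ordinary text.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for record in reader:
        if not record:
            continue  # skip completely empty lines
        rows += 1
        widths.add(len(record))
    return rows, sorted(widths)


# Illustrative two-column content, not real pipeline output.
sample = io.StringIO("c1ccccc1\t241\nCCO\t702\n")
print(summarize_tsv(sample))
```

For a file on disk, open it with `open(path, newline="", encoding="utf-8")` and pass the handle in. More than one value in the returned list of column counts suggests a ragged or truncated file.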

(4) Initializing the DB

(Step 6 currently out of date, will update)

1. Install PostgreSQL with the RDKit cartridge (requires sudo): sudo apt install postgresql-14-rdkit
2. Run bash sh_scripts/db/create_db.sh
3. Connect to the DB with psql -d badapple2
4. Run CREATE EXTENSION rdkit;. This should return CREATE EXTENSION.
5. (Optional) You can test that the RDKit cartridge is working with the is_valid_smiles command:

badapple2=# select is_valid_smiles('O1OCCCC1');
 is_valid_smiles 
-----------------
 t
(1 row)

6. Run bash sh_scripts/db/load_db.sh
7. Done!
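The optional cartridge check above can also be run non-interactively, which is handy in a setup script. This is a sketch rather than part of the repository: the helper names are hypothetical, and `cartridge_works` assumes `psql` is on your PATH and that your user can connect to the badapple2 DB. It relies on the standard psql options `-A` (unaligned output), `-t` (rows only), and `-c` (run a single command).

```python
import subprocess


def cartridge_check_argv(db_name: str, smiles: str) -> list[str]:
    """Build a psql call that prints only `t` or `f` for is_valid_smiles."""
    escaped = smiles.replace("'", "''")  # standard SQL escaping for single quotes
    sql = f"select is_valid_smiles('{escaped}');"
    return ["psql", "-d", db_name, "-A", "-t", "-c", sql]


def is_true(psql_output: str) -> bool:
    """psql prints booleans as `t` / `f`."""
    return psql_output.strip() == "t"


def cartridge_works(db_name: str = "badapple2", smiles: str = "O1OCCCC1") -> bool:
    """Run the check against a live DB (requires psql and a running server)."""
    result = subprocess.run(
        cartridge_check_argv(db_name, smiles),
        capture_output=True,
        text=True,
        check=True,  # raise if psql exits with an error
    )
    return is_true(result.stdout)


# Show the command without touching the database.
print(cartridge_check_argv("badapple2", "O1OCCCC1"))
```

Calling `cartridge_works()` after step 4 should return True if the extension was created successfully. If the extension is missing, psql exits with an error and `subprocess.CalledProcessError` is raised instead.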
