This folder contains instructions and scripts for processing an Nsp3 Mac1 fragment screening dataset. To complete the analysis, rs-booster
(https://github.com/rs-station/rs-booster) must be installed. Here, we
- run careless with a multivariate prior, for many values of the double-Wilson r parameter, on 16 holo drug fragment-screening datasets.
- inspect CCpred and difference map peak heights.

We start with MTZ files found in ./20221007_unscaled_unmerged. These were converted from unmerged .hkl files provided by Galen Correy for Schuller et al. (Sci. Adv. 2021, DOI: 10.1126/sciadv.abf8711).
To run careless, we use the script careless_runs/slurm-dw-array-grid.sh, which starts a slurm batch array job. This job requires careless_runs/slurm_params.txt, in which we vary the double-Wilson r value across the individual careless runs. To call using slurm:
cd careless_runs
sbatch slurm-dw-array-grid.sh
Many of the bash scripts require an active conda environment with careless installed. Make sure you activate the right conda environment before running them!
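
As a rough illustration of how a parameter file can drive an array job, the Python sketch below writes one double-Wilson r value per line and looks up the value for a given task index (for example, the value of SLURM_ARRAY_TASK_ID). The one-value-per-line layout, the helper names write_param_grid and param_for_task, and the example r values are assumptions made for illustration; the actual format of slurm_params.txt may differ.

```python
from pathlib import Path


def write_param_grid(path, r_values):
    """Write one double-Wilson r value per line; return the number of lines.

    Hypothetical layout: the real slurm_params.txt may hold more columns.
    """
    lines = [f"{r:.6g}" for r in r_values]
    Path(path).write_text("\n".join(lines) + "\n")
    return len(lines)


def param_for_task(path, task_id):
    """Return the r value for a 1-based array task index.

    Assumes the array job was submitted with indices starting at 1.
    """
    lines = Path(path).read_text().splitlines()
    if not 1 <= task_id <= len(lines):
        raise IndexError(f"task_id {task_id} outside 1..{len(lines)}")
    return float(lines[task_id - 1])


if __name__ == "__main__":
    import os
    import tempfile

    # Illustrative r values only; not the grid used in this repository.
    example_r = [0.0, 0.5, 0.9, 0.99]
    with tempfile.TemporaryDirectory() as tmp:
        params = os.path.join(tmp, "example_params.txt")
        write_param_grid(params, example_r)
        print(param_for_task(params, 2))
```

Keeping the grid in a plain text file means the same batch script can be resubmitted with a different grid without editing the script itself.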
Then, we evaluate the quality of the careless results in the jupyter notebook Inspect_Careless_param_grid.ipynb. Inspect_Careless_param_grid.ipynb calls the run_ccs.sh script that uses careless to compute correlation coefficients of careless results, as well as the make_diffmap.sh script that uses reciprocalspaceship to compute difference maps and peak heights. This notebook also plots figures. To minimize bias, we refine phases against the reference dataset merged with a univariate prior.
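
For readers unfamiliar with the correlation coefficients mentioned above, the sketch below computes a plain Pearson correlation between two equal-length lists, such as observed and predicted intensities. It is a minimal stand-in written with the standard library only, not the code that careless runs; the function name pearson_cc and the toy numbers in the usage lines are assumptions for illustration.

```python
import math


def pearson_cc(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squared deviations and of cross deviations from the means.
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Toy values only; real inputs would be intensities from a careless run.
    observed = [10.0, 20.0, 30.0, 40.0]
    predicted = [12.0, 19.0, 33.0, 38.0]
    print(pearson_cc(observed, predicted))
```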
- 20221007_unscaled_unmerged: a folder with sixty-five holo drug fragment screening datasets and one reference drug fragment screening dataset. Each subfolder contains a model as well as several .mtz files that have been preprocessed using scripts/hkl2mtz.sh, as well as clean_datasets_compute_diffmaps.ipynb, a Jupyter notebook that corrects for indexing ambiguities and adds metadata columns to the .mtz files in preparation for careless.
- careless_runs: a folder containing a script for running careless as a batch array, as well as the resultant subfolders containing outputs from individual runs of careless.
- pymol_dfs: a folder with pymol inputs and outputs.
- scripts: a folder with scripts and two MTZ files.
  - DFS_refine_mv.mtz: an MTZ file with phases refined from 7kqo.pdb and a dataset merged with a multivariate prior (careless_runs/merge_20379915_20830_mono_mc1_10k_grid_6/dfs_1.mtz).
  - DFS_refine_uv.mtz: an MTZ file with phases refined from 7kqo.pdb and a dataset merged with a univariate prior (careless_runs/merge_20379918_5300_mono_mc1_10k_grid_2/dfs_1.mtz).
  - hkl2mtz.sh: a script that uses careless to convert .hkl files to .mtz files.
  - make_diffmap_all.sh: a script that uses rs-booster to compute difference maps given a folder with careless runs in it.
  - make_diffmap_indiv.sh: a script that uses rs-booster to compute difference maps of a single careless run.
  - run_ccs_all.sh: a script that computes CCpred given a folder with careless runs in it.
- Inspect_Careless_param_grid-DFS.ipynb: a notebook for evaluating the quality of the careless results.
- compute_pandda_peak_height.ipynb: a notebook for peak height calculation of the Z-maps.
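
The difference map peak heights mentioned above are often reported in units of the map's standard deviation (sigma). The sketch below shows that normalization on a flat list of map values. It assumes this convention rather than taking it from the scripts or notebooks in this folder, and the function name peak_height_sigma and the toy values are hypothetical.

```python
import math


def peak_height_sigma(values):
    """Return the highest map value expressed in sigma units above the mean.

    `values` is a flat sequence of map densities (e.g. a flattened grid).
    """
    n = len(values)
    if n == 0:
        raise ValueError("map is empty")
    mean = sum(values) / n
    # Population variance over all grid points.
    variance = sum((v - mean) ** 2 for v in values) / n
    if variance == 0:
        raise ValueError("map is constant; sigma is zero")
    return (max(values) - mean) / math.sqrt(variance)


if __name__ == "__main__":
    # Toy map: mostly flat with a single strong positive peak.
    toy_map = [0.0] * 99 + [5.0]
    print(peak_height_sigma(toy_map))
```

This takes the global maximum; restricting the search to grid points near a ligand site would need atomic coordinates and is left out of this sketch.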