Skip to content

Commit

Permalink
adapt to Snakemake 8; move env, config, annot export into the result …
Browse files Browse the repository at this point in the history
…folder to be self-contained
  • Loading branch information
sreichl committed Sep 13, 2024
1 parent a9d7d18 commit dadc9d3
Show file tree
Hide file tree
Showing 9 changed files with 29 additions and 28 deletions.
2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
[![DOI](https://zenodo.org/badge/483638364.svg)](https://zenodo.org/doi/10.5281/zenodo.10689139)

# Single-cell RNA sequencing (scRNA-seq) Differential Expression Analysis & Visualization Snakemake Workflow
A [Snakemake](https://snakemake.readthedocs.io/en/stable/) workflow for performing differential expression analyses (DEA) of processed (multimodal) scRNA-seq data powered by the R package [Seurat's](https://satijalab.org/seurat/index.html) functions [FindMarkers](https://satijalab.org/seurat/reference/findmarkers) and [FindAllMarkers](https://satijalab.org/seurat/reference/findallmarkers).
A [Snakemake 8](https://snakemake.readthedocs.io/en/stable/) workflow for performing differential expression analyses (DEA) of processed (multimodal) scRNA-seq data powered by the R package [Seurat's](https://satijalab.org/seurat/index.html) functions [FindMarkers](https://satijalab.org/seurat/reference/findmarkers) and [FindAllMarkers](https://satijalab.org/seurat/reference/findallmarkers).

This workflow adheres to the module specifications of [MR.PARETO](https://github.com/epigen/mr.pareto), an effort to augment research by modularizing (biomedical) data science. For more details, instructions and modules check out the project's repository. Please consider starring and sharing modules that are useful to you, this helps me in prioritizing my efforts!

Expand Down
4 changes: 3 additions & 1 deletion config/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,10 +2,12 @@

You need one configuration file and one annotation file to run the complete workflow. You can use the provided example as starting point. If in doubt read the comments in the config and/or try the default values.

- project configuration (config/config.yaml): different for every project/dataset and configures the analyses to be performed.
- project configuration (`config/config.yaml`): different for every project/dataset and configures the analyses to be performed.
- sample annotation (sample_annotation): CSV file consisting of five columns
- name: name of the dataset/analysis (tip: keep it short, but descriptive and distinctive).
- data: path to the input Seurat object as .rds.
- assay: the Seurat assay to be used (e.g., SCT or RNA).
- metadata: column name of the metadata that should be used to group cells for comparison (e.g., condition or cell_type).
- control: name of the class/level that should be used as control in the comparison (e.g., untreated) or "ALL" to compare every class against the rest (e.g., useful to find cluster markers; one vs all)

Set workflow-specific `resources` or command line arguments (CLI) in the workflow profile `workflow/profiles/default.config.yaml`, which supersedes global Snakemake profiles.
1 change: 0 additions & 1 deletion config/config.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,6 @@
##### RESOURCES #####
mem: '32000'
threads: 1 # only DEA rule is multicore and gets 8*threads
partition: 'shortq'

##### GENERAL #####
annotation: /path/to/MyData_dea_seurat_annotation.csv
Expand Down
16 changes: 9 additions & 7 deletions workflow/Snakefile
Original file line number Diff line number Diff line change
@@ -1,12 +1,16 @@

##### global workflow dependencies #####
conda: "envs/global.yaml"

##### libraries #####
import os
import sys
import pandas as pd
import yaml
from snakemake.utils import min_version

min_version("7.15.2")
##### set minimum snakemake version #####
min_version("8.20.1")

##### module name #####
module_name = "dea_seurat"
Expand Down Expand Up @@ -48,17 +52,15 @@ rule all:
analysis = analyses,
feature_list = feature_lists + ['FILTERED'],
),
envs = expand(os.path.join(config["result_path"],'envs',module_name,'{env}.yaml'),env=['seurat','volcanos','ggplot','heatmap']),
configs = os.path.join(config["result_path"],'configs',module_name,'{}_config.yaml'.format(config["project_name"])),
annotations = os.path.join(config["result_path"],'configs',module_name,'{}_annot.csv'.format(config["project_name"])),
feature_lists = expand(os.path.join(config["result_path"],'configs',module_name,'{feature_list}.txt'), feature_list = feature_lists),
envs = expand(os.path.join(result_path,'envs','{env}.yaml'),env=['seurat','volcanos','ggplot','heatmap']),
configs = os.path.join(result_path,'configs','{}_config.yaml'.format(config["project_name"])),
annotations = os.path.join(result_path,'configs','{}_annot.csv'.format(config["project_name"])),
feature_lists = expand(os.path.join(result_path,'configs','{feature_list}.txt'), feature_list = feature_lists),
resources:
mem_mb=config.get("mem", "16000"),
threads: config.get("threads", 1)
log:
os.path.join("logs","rules","all.log"),
params:
partition=config.get("partition"),


##### load rules #####
Expand Down
7 changes: 7 additions & 0 deletions workflow/envs/global.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
channels:
- conda-forge
- bioconda
- nodefaults
dependencies:
- numpy=2.0.1
- pandas=2.2.2
3 changes: 3 additions & 0 deletions workflow/profiles/default/config.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
default-resources:
slurm_partition: shortq
slurm_extra: "'--qos=shortq'"
2 changes: 0 additions & 2 deletions workflow/rules/dea.smk
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,6 @@ rule dea:
log:
os.path.join("logs","rules","dea_{analysis}.log"),
params:
partition=config.get("partition"),
assay = lambda w: annot_dict["{}".format(w.analysis)]["assay"],
metadata = lambda w: annot_dict["{}".format(w.analysis)]["metadata"],
control = lambda w: annot_dict["{}".format(w.analysis)]["control"],
Expand Down Expand Up @@ -52,7 +51,6 @@ rule aggregate:
log:
os.path.join("logs","rules","aggregate_{analysis}.log"),
params:
partition=config.get("partition"),
assay = lambda w: annot_dict["{}".format(w.analysis)]["assay"],
metadata = lambda w: annot_dict["{}".format(w.analysis)]["metadata"],
control = lambda w: annot_dict["{}".format(w.analysis)]["control"],
Expand Down
20 changes: 6 additions & 14 deletions workflow/rules/envs_export.smk
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
# one rule per used conda environment to document the exact versions and builds of the used software
rule env_export:
output:
report(os.path.join(config["result_path"],'envs','dea_seurat','{env}.yaml'),
report(os.path.join(result_path,'envs','{env}.yaml'),
caption="../report/software.rst",
category="Software",
subcategory="{}_{}".format(config["project_name"], module_name)
Expand All @@ -13,8 +13,6 @@ rule env_export:
threads: config.get("threads", 1)
log:
os.path.join("logs","rules","env_{env}.log"),
params:
partition=config.get("partition"),
shell:
"""
conda env export > {output}
Expand All @@ -23,7 +21,7 @@ rule env_export:
# add configuration files to report
rule config_export:
output:
configs = report(os.path.join(config["result_path"],'configs','dea_seurat','{}_config.yaml'.format(config["project_name"])),
configs = report(os.path.join(result_path,'configs','{}_config.yaml'.format(config["project_name"])),
caption="../report/configs.rst",
category="Configuration",
subcategory="{}_{}".format(config["project_name"], module_name)
Expand All @@ -33,8 +31,6 @@ rule config_export:
threads: config.get("threads", 1)
log:
os.path.join("logs","rules","config_export.log"),
params:
partition=config.get("partition"),
run:
with open(output["configs"], 'w') as outfile:
yaml.dump(config, outfile, sort_keys=False, width=1000, indent=2)
Expand All @@ -44,18 +40,16 @@ rule annot_export:
input:
config["annotation"],
output:
annot = report(os.path.join(config["result_path"],'configs','dea_seurat','{}_annot.csv'.format(config["project_name"])),
annot = report(os.path.join(result_path,'configs','{}_annot.csv'.format(config["project_name"])),
caption="../report/configs.rst",
category="Configuration",
subcategory="{}_{}".format(config["project_name"], module_name)
)
resources:
mem_mb=1000, #config.get("mem_small", "16000"),
mem_mb=1000,
threads: config.get("threads", 1)
log:
os.path.join("logs","rules","annot_export.log"),
params:
partition=config.get("partition"),
shell:
"""
cp {input} {output}
Expand All @@ -66,18 +60,16 @@ rule feature_list_export:
input:
get_feature_list_path,
output:
feature_lists = report(os.path.join(config["result_path"],'configs',module_name,'{feature_list}.txt'),
feature_lists = report(os.path.join(result_path,'configs','{feature_list}.txt'),
caption="../report/feature_lists.rst",
category="Configuration",
subcategory="{}_{}".format(config["project_name"], module_name)
),
resources:
mem_mb=1000, #config.get("mem_small", "16000"),config.get("mem", "16000"),
mem_mb=1000,
threads: config.get("threads", 1)
log:
os.path.join("logs","rules","feature_list_export_{feature_list}.log"),
params:
partition=config.get("partition"),
shell:
"""
cp {input} {output}
Expand Down
2 changes: 0 additions & 2 deletions workflow/rules/visualize.smk
Original file line number Diff line number Diff line change
Expand Up @@ -22,7 +22,6 @@ rule volcanos:
log:
os.path.join("logs","rules","volcanos_{analysis}_{feature_list}.log"),
params:
partition=config.get("partition"),
assay = lambda w: annot_dict["{}".format(w.analysis)]["assay"],
metadata = lambda w: annot_dict["{}".format(w.analysis)]["metadata"],
control = lambda w: annot_dict["{}".format(w.analysis)]["control"],
Expand Down Expand Up @@ -51,7 +50,6 @@ rule heatmap:
log:
os.path.join("logs","rules","lfc_heatmap_{analysis}_{feature_list}.log"),
params:
partition=config.get("partition"),
assay = lambda w: annot_dict["{}".format(w.analysis)]["assay"],
metadata = lambda w: annot_dict["{}".format(w.analysis)]["metadata"],
control = lambda w: annot_dict["{}".format(w.analysis)]["control"],
Expand Down

0 comments on commit dadc9d3

Please sign in to comment.