add readme for configuration
Richard Stöckl committed Aug 30, 2024
1 parent d33cdd5 commit 3be59ab
Showing 2 changed files with 35 additions and 1 deletion.
34 changes: 34 additions & 0 deletions config/README.md
@@ -0,0 +1,34 @@
# Usage and configuration

Here is a rough overview:
1. Install [conda](https://docs.conda.io/en/latest/miniconda.html) (mamba or miniconda is fine).
2. Install snakemake with:
```bash
conda install -c conda-forge -c bioconda snakemake
```
3. Download the checkm2 database (via `wget https://zenodo.org/api/files/fd3bc532-cd84-4907-b078-2e05a1e46803/checkm2_database.tar.gz`).
4. Download the GTDB-Tk database (via `wget https://data.gtdb.ecogenomic.org/releases/release220/220.0/auxillary_files/gtdbtk_package/full_package/gtdbtk_r220_data.tar.gz`).
5. [Download the latest release from this repo](https://github.com/richardstoeckl/basecallNanopore/releases/latest) and `cd` into it.
6. Edit `config/config.yaml` to provide the paths to your results/logs directories and to the databases you downloaded, and to change any parameters you want to adjust.
7. Edit `config/sampleData.csv` with the specific details for each assembly you want to check. The pipeline adjusts which steps it runs based on what you enter here.
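
The two database downloads in steps 3 and 4 can be wrapped in a small helper. This is only a sketch: the function name `fetch_db` and the `databases/...` target directories are hypothetical, and it assumes `wget` and `tar` are available on your system.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not part of this repo): download an archive into DEST
# unless it is already there, then unpack it in place.
fetch_db() {
    local url="$1" dest="$2"
    local archive
    archive="${dest}/$(basename "$url")"
    mkdir -p "$dest"
    if [ ! -f "$archive" ]; then
        # Download to a temporary name so an interrupted transfer is not
        # mistaken for a complete archive on the next run.
        wget -O "${archive}.part" "$url"
        mv "${archive}.part" "$archive"
    fi
    tar -xzf "$archive" -C "$dest"
}

# Example calls (left commented out because the archives are large;
# the target directories are placeholders):
# fetch_db "https://zenodo.org/api/files/fd3bc532-cd84-4907-b078-2e05a1e46803/checkm2_database.tar.gz" databases/checkm2
# fetch_db "https://data.gtdb.ecogenomic.org/releases/release220/220.0/auxillary_files/gtdbtk_package/full_package/gtdbtk_r220_data.tar.gz" databases/gtdbtk
```

Whatever directories you choose, note them down, since step 6 asks for these paths in `config/config.yaml`.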

---

# General configuration

To configure this workflow, modify `config/config.yaml` according to your needs, following the explanations provided in the file.

## "Main" section

Here you should provide the paths to your interim, results, and log directories. The `interim` directory will contain larger intermediate files. The `results` directory will contain the final output of the pipeline. The `log` directory will store the log files for each step.
Here you should also give the name of your sample data file (see [relevant section below](#sampleData-file-setup)).

## "Tools" section

Here you should give the paths to the databases needed for some of the tools.
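
Before starting the workflow, it can help to confirm that the database paths you entered actually exist. The following is a minimal sketch; `check_paths` is a hypothetical helper, and the example paths are placeholders for whatever you wrote into `config/config.yaml`.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not part of this repo): report every path that does
# not exist and return 1 if at least one is missing.
check_paths() {
    local missing=0 p
    for p in "$@"; do
        if [ ! -e "$p" ]; then
            echo "missing: $p" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example call with placeholder locations:
# check_paths databases/checkm2 databases/gtdbtk
```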


# sampleData file setup

The samples are specified in a comma-separated values (`.csv`) file.
You can use the `config/sampleData.csv` file as a template.
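
After editing the file, a quick consistency check can catch rows with a missing or extra column. This sketch makes no assumption about the column names; it only assumes that the first non-empty line is the header and that no field contains a quoted comma. `check_csv` is a hypothetical helper.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not part of this repo): succeed only if every
# non-empty row has as many comma-separated fields as the header row.
check_csv() {
    awk -F',' '
        NF == 0 { next }            # ignore blank lines
        !n      { n = NF; next }    # first non-empty line is the header
        NF != n { printf "line %d: expected %d fields, found %d\n", NR, n, NF; bad = 1 }
        END     { exit (bad ? 1 : 0) }
    ' "$1"
}

# Example call:
# check_csv config/sampleData.csv
```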
2 changes: 1 addition & 1 deletion config/config.yaml
@@ -1,5 +1,5 @@
 main:
-  sampleData: "sampleData.csv" # "config/sampleData_tests.csv is the sample file that can be used for testing the pipeline setup"
+  sampleData: "sampleData.csv" # "config/sampleData.csv is the sample file that can be used for testing the pipeline setup"
   logPath: "logs/"
   interimPath: "interim/"
   resultPath: "results/"