NF_RCP-F_1.0.4-RC1 #29

Draft
wants to merge 60 commits into
base: master
60 commits
3e69f06
fix: ensure workflow resources are always in root data directory
J-81 May 3, 2023
9662c14
feat: update version from 1.0.3 to 1.0.4
J-81 May 8, 2023
3b7e0ba
feat: allow trim-galore! to autodetect adapter type
J-81 May 8, 2023
8a158b1
refactor: remove deprecated tests and test settings
J-81 May 8, 2023
03618d9
feat: add NF_RCP plugin for dp_tools update
J-81 May 9, 2023
f15c099
feat: convert from dp_tools 1.1.8 style usage to 1.3.2 (plugin)
J-81 May 9, 2023
36e0e69
feat: 48 fast test validates runsheet migration
J-81 May 9, 2023
b3684a4
feat: finish migration to updated dp_tools
J-81 May 10, 2023
dca4fda
fix: bind sample at definition
J-81 May 11, 2023
81c7cbe
Merge remote-tracking branch 'upstream/master' into DEV_NF_RCP-F
J-81 May 25, 2023
2a56552
docs[dppd]: update DPPD with workflow update
J-81 May 25, 2023
3b5ceca
ci: add github action
J-81 Jul 10, 2023
da02ccc
ci: add nf-test install
J-81 Jul 10, 2023
56c8eca
ci: fix install of nf-test
J-81 Jul 10, 2023
486e602
ci: debugging
J-81 Jul 10, 2023
e43dd3f
ci: debugging, changing test launch location
J-81 Jul 10, 2023
ce2fbf3
ci: debugging, changing test launch location
J-81 Jul 10, 2023
66e1da5
ci[debug]: nf-test path availability
J-81 Jul 10, 2023
b27388f
ci[debug]: get test data from fork
J-81 Jul 11, 2023
1092e80
ci[debug]: Add tests for modules actions
J-81 Jul 11, 2023
2dd8fce
ci[debug]: Add tag
J-81 Jul 11, 2023
11a22f0
ci[debug]: Adjust pathing in test
J-81 Jul 11, 2023
d15b39b
ci[debug]: Assess pathing
J-81 Jul 11, 2023
7ba75d0
ci[debug]: Assess pathing
J-81 Jul 11, 2023
4e317f2
ci[debug]: Update test asset pathing
J-81 Jul 11, 2023
33c63ce
ci[debug]: Update test asset pathing
J-81 Jul 11, 2023
5d35b61
ci[debug]: Update test asset pathing
J-81 Jul 11, 2023
47bebcd
ci[debug]: Update test asset pathing
J-81 Jul 11, 2023
7d2cc24
ci[debug]: Update test asset pathing
J-81 Jul 11, 2023
1e6a9cb
ci[debug]: Assess pathing
J-81 Jul 11, 2023
53363e4
ci[debug]: Assess pathing
J-81 Jul 11, 2023
1b6e325
ci[debug]: Update pathing
J-81 Jul 11, 2023
b5013e9
ci[debug]: Update pathing
J-81 Jul 11, 2023
574eb79
ci[debug]: Bump and attempt to resolve eof error on command.sh
J-81 Jul 11, 2023
fc89c5e
ci[debug]: Bump and attempt to resolve eof error on command.sh
J-81 Jul 11, 2023
3362429
ci[clean]: Remove extra prints
J-81 Jul 11, 2023
09726a5
ci: only run based on filter compared to last commit
J-81 Jul 11, 2023
1500019
feat: remove extra print
J-81 Jul 11, 2023
6719218
feat: add summary.txt output and rework tests
J-81 Jul 11, 2023
ac5dbce
ci[fix]: update tap location
J-81 Jul 11, 2023
16ecc58
ci: bump to trigger tests
J-81 Jul 11, 2023
5e8416e
ci: update outputs format; update tag
J-81 Jul 11, 2023
35c9823
test: update dge tests
J-81 Jul 11, 2023
802f9b1
feat: git ignore *.pyc and .nextflow folders
J-81 Jul 11, 2023
abc9f7e
feat: add gitpod yaml
J-81 Jul 11, 2023
af0b716
feat: remove deprecated conda support files
J-81 Jul 11, 2023
cbf6055
feat: rename and reformat dge to DGE_BY_DESEQ2
J-81 Jul 11, 2023
9d710ef
Merge remote-tracking branch 'origin' into DEV_NF_RCP-F
J-81 Aug 8, 2023
6a0d105
test
J-81 Aug 8, 2023
70488e6
fix: bump to unreleased new version for act support
J-81 Aug 8, 2023
62fe768
feat: update spell ignore
J-81 Aug 8, 2023
a83006b
feat: rework to only check links on create due to slower speed
J-81 Aug 8, 2023
ee188eb
docs: update CHANGELOG
J-81 Aug 8, 2023
16cbbeb
feat: add minimal size full pipeline
J-81 Aug 8, 2023
ed053bd
fix[ci]: update nextflow setup version
J-81 Aug 8, 2023
4dd300c
fix[ci]: Test location
J-81 Aug 9, 2023
85754aa
ci: trigger on any new branch or tag
J-81 Aug 9, 2023
43f5b5f
ci: trigger on all pushes
J-81 Aug 9, 2023
43f1147
ci: trigger on all pushes
J-81 Aug 9, 2023
a8c65d3
fix: importing DGE_BY_DESEQ2
J-81 Aug 9, 2023
5 changes: 5 additions & 0 deletions .codespellignore
@@ -0,0 +1,5 @@
RNAseq
OTU
otu
groupD
groupd
7 changes: 4 additions & 3 deletions .github/workflows/check_typos.yml
@@ -15,8 +15,9 @@ jobs:
        fail-fast: false
      steps:
        - uses: actions/checkout@v3
-       - uses: codespell-project/actions-codespell@master
+       - uses: codespell-project/codespell-problem-matcher@v1
+       - uses: codespell-project/[email protected]
          with:
            check_filenames: true
-           skip: "*.yml,*.cff,*.js,*.lock"
-           ignore_words_list: RNAseq
+           skip: "*.yml,*.cff,*.js,*.lock,*.pdf,*.ipynb"
+           ignore_words_file: ".codespellignore"
97 changes: 97 additions & 0 deletions .github/workflows/ci.yml
@@ -0,0 +1,97 @@
name: CI Updated Modules Testing
# This workflow runs the pipeline with the minimal test dataset to check that it completes without any syntax errors
on:
  push:
    branches:
      - DEV_NF_RCP-F

env:
  NXF_ANSI_LOG: false

jobs:
  changes:
    name: Check for changes
    runs-on: ubuntu-latest
    outputs:
      # Expose matched filters as job 'modules' output variable
      modules: ${{ steps.filter.outputs.changes }}
    steps:
      - uses: actions/checkout@v3

      - uses: dorny/paths-filter@v2
        id: filter
        with:
          base: 'DEV_NF_RCP-F'
          filters: "RNAseq/Workflow_Documentation/NF_RCP-F/workflow_code/tests/config/nftest_modules.yml"

  test:
    name: ${{ matrix.tags }} ${{ matrix.profile }} ${{ matrix.NXF_VER }}
    runs-on: ubuntu-latest
    needs: changes
    if: needs.changes.outputs.modules != '[]'
    strategy:
      matrix:
        NXF_VER:
          - "22.10.1"
          - "latest-everything"
        profile:
          - "docker"
          - "singularity"
        tags: ["${{ fromJson(needs.changes.outputs.modules) }}"]

    steps:
      - name: Check out pipeline code
        uses: actions/checkout@v3

      - name: Install Nextflow
        uses: nf-core/setup-nextflow@be72b1dc0f932cea69aef64479ac863a86516c0c
        with:
          version: "${{ matrix.NXF_VER }}"

      - name: Set up Singularity
        if: matrix.profile == 'singularity'
        uses: eWaterCycle/setup-singularity@v5
        with:
          singularity-version: 3.7.1

      - name: Install nf-test
        id: nf-test
        run: |
          curl -fsSL https://code.askimed.com/install/nf-test | bash
          chmod u+x nf-test
          echo "bin_path=$(pwd)/nf-test" >> $GITHUB_OUTPUT


      - name: Hash Github Workspace
        id: hash_workspace
        run: |
          echo "digest=$(echo RNA_3.10.1_${{ github.workspace }} | md5sum | cut -c 1-25)" >> $GITHUB_OUTPUT

      - name: Cache test data
        id: cache-testdata
        uses: actions/cache@v3
        with:
          path: RNAseq/Workflow_Documentation/NF_RCP-F/workflow_code/test-datasets
          key: ${{ steps.hash_workspace.outputs.digest }}

      - name: Check out test data
        if: steps.cache-testdata.outputs.cache-hit != 'true'
        uses: actions/checkout@v3
        with:
          repository: J-81/test-datasets-extended
          ref: NF_RCP-F
          path: RNAseq/Workflow_Documentation/NF_RCP-F/workflow_code/test-datasets

      # Test the module
      - name: Run nf-test
        run: |
          cd ${GITHUB_WORKSPACE}/RNAseq/Workflow_Documentation/NF_RCP-F/workflow_code/
          ${{ steps.nf-test.outputs.bin_path}} test \
            --profile=${{ matrix.profile }} \
            --tag ${{ matrix.tags }} \
            --tap=test.tap

      - uses: pcolby/tap-summary@v1
        with:
          path: >-
            ${GITHUB_WORKSPACE}/RNAseq/Workflow_Documentation/NF_RCP-F/workflow_code/test.tap
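Note on the "Hash Github Workspace" step in the workflow above: the cache key is the first 25 hex characters of the MD5 of the fixed prefix `RNA_3.10.1_` joined to the workspace path. The sketch below reproduces that computation locally so the key can be inspected. The function name `workspace_digest` and the example path are hypothetical and used only for illustration; `md5sum` and `cut` are assumed to be available, as on a typical Linux runner.

```shell
#!/usr/bin/env bash
# Minimal sketch (not part of the PR): mirror the digest computed by the
# "Hash Github Workspace" step so the resulting cache key can be checked locally.

# workspace_digest is a hypothetical helper name.
workspace_digest() {
  local workspace="$1"
  # echo appends a trailing newline, which is part of the hashed input,
  # exactly as in the workflow step.
  echo "RNA_3.10.1_${workspace}" | md5sum | cut -c 1-25
}

# Hypothetical workspace path; on a runner this would be ${{ github.workspace }}.
workspace_digest "/home/runner/work/example/example"
```

Because the key depends only on the fixed prefix and the workspace path, it does not change when the contents of the test-data repository change; refreshing the cached test data would presumably require changing the prefix.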
77 changes: 77 additions & 0 deletions .github/workflows/ci_minimal_full_pipeline.yml
@@ -0,0 +1,77 @@
name: CI Minimal Dataset Full Pipeline
# This workflow runs the pipeline with the minimal test dataset to check that it completes without any syntax errors
on: push

env:
  NXF_ANSI_LOG: false

jobs:
  Minimal_Dataset_Full_Pipeline:
    name: ${{ matrix.tags }} ${{ matrix.profile }} ${{ matrix.NXF_VER }}
    runs-on: ubuntu-latest
    strategy:
      matrix:
        NXF_VER:
          - "22.10.1"
          - "latest-everything"
        profile:
          - "docker"
          - "singularity"

    steps:
      - name: Check out pipeline code
        uses: actions/checkout@v3

      - name: Install Nextflow
        uses: nf-core/[email protected]
        with:
          version: "${{ matrix.NXF_VER }}"

      - name: Set up Singularity
        if: matrix.profile == 'singularity'
        uses: eWaterCycle/setup-singularity@v5
        with:
          singularity-version: 3.7.1

      - name: Install nf-test
        id: nf-test
        run: |
          curl -fsSL https://code.askimed.com/install/nf-test | bash
          chmod u+x nf-test
          echo "bin_path=$(pwd)/nf-test" >> $GITHUB_OUTPUT


      - name: Hash Github Workspace
        id: hash_workspace
        run: |
          echo "digest=$(echo RNA_3.10.1_${{ github.workspace }} | md5sum | cut -c 1-25)" >> $GITHUB_OUTPUT

      - name: Cache test data
        id: cache-testdata
        uses: actions/cache@v3
        with:
          path: RNAseq/Workflow_Documentation/NF_RCP-F/workflow_code/test-datasets
          key: ${{ steps.hash_workspace.outputs.digest }}

      - name: Check out test data
        if: steps.cache-testdata.outputs.cache-hit != 'true'
        uses: actions/checkout@v3
        with:
          repository: J-81/test-datasets-extended
          ref: NF_RCP-F
          path: RNAseq/Workflow_Documentation/NF_RCP-F/workflow_code/test-datasets

      # Test the module
      - name: Run nf-test on minimal core test datasets
        run: |
          cd ${GITHUB_WORKSPACE}/RNAseq/Workflow_Documentation/NF_RCP-F/workflow_code/
          ${{ steps.nf-test.outputs.bin_path}} test \
            --profile=${{ matrix.profile }} \
            --tag core \
            --tap=test.tap \
            tests/*.test

      - uses: pcolby/tap-summary@v1
        with:
          path: >-
            ${GITHUB_WORKSPACE}/RNAseq/Workflow_Documentation/NF_RCP-F/workflow_code/test.tap
5 changes: 3 additions & 2 deletions .github/workflows/markdown-link-check.yml
@@ -1,7 +1,8 @@
  name: Check Markdown links

- on: push
-
+ on:
+   create: # runs when a reference (branch or tag) is created
+
  jobs:
    markdown-link-check:
      runs-on: ubuntu-latest
2 changes: 2 additions & 0 deletions .gitignore
@@ -0,0 +1,2 @@
.nextflow
*.pyc
13 changes: 13 additions & 0 deletions .gitpod.yml
@@ -0,0 +1,13 @@
image: nfcore/gitpod:latest

vscode:
  extensions:
    - ms-python.python
    - eamodio.gitlens
    - GitHub.copilot
    - REditorSupport.r
    - esbenp.prettier-vscode # Markdown/CommonMark linting and style checking for Visual Studio Code
    - mechatroner.rainbow-csv # Highlight columns in csv files in different colors
    - nextflow.nextflow # Nextflow syntax highlighting
    - oderwat.indent-rainbow # Highlight indentation level
    - streetsidesoftware.code-spell-checker # Spelling checker for source code
5 changes: 2 additions & 3 deletions RNAseq/Pipeline_GL-DPPD-7101_Versions/GL-DPPD-7101-F.md
@@ -43,6 +43,7 @@ The DESeq2 Normalization and DGE step, [step 9](#9-normalize-read-counts-perform

- Fixed rare edge case where groupwise mean and standard deviations could become misassociated to incorrect groups. This had affected [step 9f](#9f-prepare-genelab-dge-tables-with-annotations-on-datasets-with-ercc-spike-in) and [step 9i](#9i-prepare-genelab-dge-tables-with-annotations-on-datasets-without-ercc-spike-in).

+ - [Step 2a](#2a-trimfilter-raw-data) adapter type argument removed in favor of using the built in TrimGalore! adapter [autodetection](https://github.com/FelixKrueger/TrimGalore/blob/0.6.7/Docs/Trim_Galore_User_Guide.md#adapter-auto-detection).
---

# Table of contents
@@ -122,7 +123,7 @@ The DESeq2 Normalization and DGE step, [step 9](#9-normalize-read-counts-perform
|tximport|1.27.1|[https://github.com/mikelove/tximport](https://github.com/mikelove/tximport)|
|tidyverse|1.3.1|[https://www.tidyverse.org](https://www.tidyverse.org)|
|stringr|1.4.1|[https://github.com/tidyverse/stringr](https://github.com/tidyverse/stringr)|
- |dp_tools|1.1.8|[https://github.com/J-81/dp_tools](https://github.com/J-81/dp_tools)|
+ |dp_tools|1.3.3|[https://github.com/J-81/dp_tools](https://github.com/J-81/dp_tools)|
|pandas|1.5.0|[https://github.com/pandas-dev/pandas](https://github.com/pandas-dev/pandas)|
|seaborn|0.12.0|[https://seaborn.pydata.org/](https://seaborn.pydata.org/)|
|matplotlib|3.6.0|[https://matplotlib.org/stable](https://matplotlib.org/stable)|
@@ -204,7 +205,6 @@ trim_galore --gzip \
--path_to_cutadapt /path/to/cutadapt \
--cores NumberOfThreads \
--phred33 \
- --illumina \ # if adapters are not illumina, replace with adapters used
--output_dir /path/to/TrimGalore/output/directory \
--paired \ # only for PE studies, remove this parameter if raw data are SE
sample1_R1_raw.fastq.gz sample1_R2_raw.fastq.gz sample2_R1_raw.fastq.gz sample2_R2_raw.fastq.gz
@@ -218,7 +218,6 @@ trim_galore --gzip \
- `--path_to_cutadapt` - specify path to cutadapt software if it is not in your `$PATH`
- `--cores` - specify the number of threads available on the server node to perform trimming
- `--phred33` - instructs cutadapt to use ASCII+33 quality scores as Phred scores for quality trimming
- - `--illumina` - defines the adapter sequence to be trimmed as the first 13bp of the Illumina universal adapter `AGATCGGAAGAGC`
- `--output_dir` - the output directory to store results
- `--paired` - indicates paired-end reads - both reads, forward (R1) and reverse (R2) must pass length threshold or else both reads are removed
- `sample1_R1_raw.fastq.gz sample1_R2_raw.fastq.gz sample2_R1_raw.fastq.gz sample2_R2_raw.fastq.gz` – the input reads are specified as a positional argument, paired-end read files are listed pairwise such that the forward reads (*R1_raw.fastq.gz) are immediately followed by the respective reverse reads (*R2_raw.fastq.gz) for each sample
13 changes: 13 additions & 0 deletions RNAseq/Workflow_Documentation/NF_RCP-F/CHANGELOG.md
@@ -5,6 +5,19 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

+ ## [Unreleased]
+
+ ### Added
+ - Github action support for CI testing (a83006ba91b1209e1857fefd96e9ff950ebb0cdc)
+
+ ### Fixed
+ - Workflow usage files will all follow output directory set by workflow user (3e69f06432f62b7924d2e043ef4768c5d09bf614)
+ ### Changed
+ - TrimGalore! will now use autodetect for adaptor type (3b7e0bab4017e90481359c48f9cf7c8837ed54d2)
+ - V&V migrated from dp_tools version 1.1.8 to 1.3.2 including:
+   - Migration of V&V protocol code to this codebase instead of dp_tools (b3684a4c1db5df06eab20916ef7e130c410c147c)
+   - Fix for sample wise checks reusing same samples (dca4fdad7518ac9ead3ee2e4c5f57ac0fe25c715)
+
## [1.0.3](https://github.com/nasa/GeneLab_Data_Processing/tree/NF_RCP-F_1.0.3/RNAseq/Workflow_Documentation/NF_RCP-F) - 2023-01-25

### Added
20 changes: 10 additions & 10 deletions RNAseq/Workflow_Documentation/NF_RCP-F/README.md
@@ -101,9 +101,9 @@ All files required for utilizing the NF_RCP-F GeneLab workflow for processing RN
copy of latest NF_RCP-F version on to your system, the code can be downloaded as a zip file from the release page then unzipped after downloading by running the following commands:

```bash
- wget https://github.com/nasa/GeneLab_Data_Processing/releases/download/NF_RCP-F_1.0.3/NF_RCP-F_1.0.3.zip
+ wget https://github.com/nasa/GeneLab_Data_Processing/releases/download/NF_RCP-F_1.0.4/NF_RCP-F_1.0.4.zip

- unzip NF_RCP-F_1.0.3.zip
+ unzip NF_RCP-F_1.0.4.zip
```
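
Note on the version bump: every README change in this diff swaps `1.0.3` for `1.0.4` in a path or URL. A sketch of deriving the download commands from a single version variable is shown here. `release_zip_url` and `NF_RCP_VERSION` are hypothetical names, the URL pattern is copied from the `wget` line in this README, and the sketch assumes a release with the matching tag has been published.

```shell
#!/usr/bin/env bash
# Minimal sketch (not part of the PR): derive the release zip URL and local
# file name from one version string, following the README's wget command.

# release_zip_url is a hypothetical helper name.
release_zip_url() {
  local version="$1"
  echo "https://github.com/nasa/GeneLab_Data_Processing/releases/download/NF_RCP-F_${version}/NF_RCP-F_${version}.zip"
}

# NF_RCP_VERSION is a hypothetical variable name.
NF_RCP_VERSION="1.0.4"

# Print the commands rather than running them.
echo "wget $(release_zip_url "$NF_RCP_VERSION")"
echo "unzip NF_RCP-F_${NF_RCP_VERSION}.zip"
```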

<br>
@@ -115,10 +115,10 @@ unzip NF_RCP-F_1.0.3.zip
Although Nextflow can fetch Singularity images from a url, doing so may cause issues as detailed [here](https://github.com/nextflow-io/nextflow/issues/1210).

To avoid this issue, run the following command to fetch the Singularity images prior to running the NF_RCP-F workflow:
- > Note: This command should be run in the location containing the `NF_RCP-F_1.0.3` directory that was downloaded in [step 2](#2-download-the-workflow-files) above. Depending on your network speed, fetching the images will take ~20 minutes.
+ > Note: This command should be run in the location containing the `NF_RCP-F_1.0.4` directory that was downloaded in [step 2](#2-download-the-workflow-files) above. Depending on your network speed, fetching the images will take ~20 minutes.

```bash
- bash NF_RCP-F_1.0.3/bin/prepull_singularity.sh NF_RCP-F_1.0.3/config/software/by_docker_image.config
+ bash NF_RCP-F_1.0.4/bin/prepull_singularity.sh NF_RCP-F_1.0.4/config/software/by_docker_image.config
```


@@ -134,15 +134,15 @@ export NXF_SINGULARITY_CACHEDIR=$(pwd)/singularity

### 4. Run the Workflow

- While in the location containing the `NF_RCP-F_1.0.3` directory that was downloaded in [step 2](#2-download-the-workflow-files), you are now able to run the workflow. Below are three examples of how to run the NF_RCP-F workflow:
+ While in the location containing the `NF_RCP-F_1.0.4` directory that was downloaded in [step 2](#2-download-the-workflow-files), you are now able to run the workflow. Below are three examples of how to run the NF_RCP-F workflow:
> Note: Nextflow commands use both single hyphen arguments (e.g. -help) that denote general nextflow arguments and double hyphen arguments (e.g. --ensemblVersion) that denote workflow specific parameters. Take care to use the proper number of hyphens for each argument.

<br>

#### 4a. Approach 1: Run the workflow on a GeneLab RNAseq dataset with automatic retrieval of Ensembl reference fasta and gtf files

```bash
- nextflow run NF_RCP-F_1.0.3/main.nf \
+ nextflow run NF_RCP-F_1.0.4/main.nf \
-profile singularity \
--gldsAccession GLDS-194
```
@@ -154,7 +154,7 @@ nextflow run NF_RCP-F_1.0.3/main.nf \
> Note: The `--ref_source` and `--ensemblVersion` parameters should match the reference source and version number of the local reference fasta and gtf files used

```bash
- nextflow run NF_RCP-F_1.0.3/main.nf \
+ nextflow run NF_RCP-F_1.0.4/main.nf \
-profile singularity \
--gldsAccession GLDS-194 \
--ensemblVersion 107 \
@@ -170,7 +170,7 @@ nextflow run NF_RCP-F_1.0.3/main.nf \
> Note: Specifications for creating a runsheet manually are described [here](examples/runsheet/README.md).

```bash
- nextflow run NF_RCP-F_1.0.3/main.nf \
+ nextflow run NF_RCP-F_1.0.4/main.nf \
-profile singularity \
--runsheetPath </path/to/runsheet>
```
@@ -179,7 +179,7 @@ nextflow run NF_RCP-F_1.0.3/main.nf \

**Required Parameters For All Approaches:**

- * `NF_RCP-F_1.0.3/main.nf` - Instructs Nextflow to run the NF_RCP-F workflow
+ * `NF_RCP-F_1.0.4/main.nf` - Instructs Nextflow to run the NF_RCP-F workflow

* `-profile` - Specifies the configuration profile(s) to load, `singularity` instructs Nextflow to setup and use singularity for all software called in the workflow

@@ -225,7 +225,7 @@ nextflow run NF_RCP-F_1.0.3/main.nf \
All parameters listed above and additional optional arguments for the RCP workflow, including debug related options that may not be immediately useful for most users, can be viewed by running the following command:

```bash
- nextflow run NF_RCP-F_1.0.3/main.nf --help
+ nextflow run NF_RCP-F_1.0.4/main.nf --help
```

See `nextflow run -h` and [Nextflow's CLI run command documentation](https://nextflow.io/docs/latest/cli.html#run) for more options and details common to all nextflow workflows.
Binary file not shown.