Merge branch 'dev' into merging-template-3.11
JoseEspinosa authored Jan 8, 2025
2 parents 5b4bca5 + 1f8f208 commit 7fa318e
Showing 31 changed files with 217 additions and 81 deletions.
1 change: 1 addition & 0 deletions .github/workflows/ci.yml
@@ -43,6 +43,7 @@ jobs:
- "test_colabfold_webserver"
- "test_colabfold_download"
- "test_esmfold"
- "test_split_fasta"
isMaster:
- ${{ github.base_ref == 'master' }}
# Exclude conda and singularity on dev
13 changes: 7 additions & 6 deletions CHANGELOG.md
@@ -7,14 +7,15 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Enhancements & fixes

- [[#177](https://github.com/nf-core/proteinfold/issues/177)]- Fix typo in some instances of model preset `alphafold2_ptm`.
- [[#177](https://github.com/nf-core/proteinfold/issues/177)] - Fix typo in some instances of model preset `alphafold2_ptm`.
- [[PR #178](https://github.com/nf-core/proteinfold/pull/178)] - Enable running multiple modes in parallel.
- [[#179](https://github.com/nf-core/proteinfold/issues/179)]- Produce an interactive html report for the predicted structures.
- [[#180](https://github.com/nf-core/proteinfold/issues/180)]- Implement Foldseek.
- [[#188](https://github.com/nf-core/proteinfold/issues/188)]- Fix colabfold image to run in gpus.
- [[#179](https://github.com/nf-core/proteinfold/issues/179)] - Produce an interactive html report for the predicted structures.
- [[#180](https://github.com/nf-core/proteinfold/issues/180)] - Implement Foldseek.
- [[#188](https://github.com/nf-core/proteinfold/issues/188)] - Fix colabfold image to run in gpus.
- [[PR #205](https://github.com/nf-core/proteinfold/pull/205)] - Change input schema from `sequence,fasta` to `id,fasta`.
- [[PR #210](https://github.com/nf-core/proteinfold/pull/210)]- Moving post-processing logic to a subworkflow, change wave images pointing to oras to point to https and refactor module to match nf-core folder structure.
- [[#214](https://github.com/nf-core/proteinfold/issues/214)]- Fix colabfold image to run in cpus after [#188](https://github.com/nf-core/proteinfold/issues/188) fix.
- [[PR #210](https://github.com/nf-core/proteinfold/pull/210)] - Moving post-processing logic to a subworkflow, change wave images pointing to oras to point to https and refactor module to match nf-core folder structure.
- [[#214](https://github.com/nf-core/proteinfold/issues/214)] - Fix colabfold image to run in cpus after [#188](https://github.com/nf-core/proteinfold/issues/188) fix.
- [[#235](https://github.com/nf-core/proteinfold/issues/235)] - Update samplesheet to new version (switch from `sequence` column to `id`).

## [[1.1.1](https://github.com/nf-core/proteinfold/releases/tag/1.1.1)] - 2025-07-30

54 changes: 31 additions & 23 deletions assets/comparison_template.html

Large diffs are not rendered by default.

20 changes: 10 additions & 10 deletions assets/report_template.html

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion assets/samplesheet.csv
@@ -1,3 +1,3 @@
sequence,fasta
id,fasta
T1024,https://raw.githubusercontent.com/nf-core/test-datasets/proteinfold/testdata/sequences/T1024.fasta
T1026,https://raw.githubusercontent.com/nf-core/test-datasets/proteinfold/testdata/sequences/T1026.fasta
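Note on the header change above: samplesheets written for earlier releases use `sequence` as the first column and would need that column renamed to `id`. A minimal sketch of such a migration, assuming a plain comma-separated file; the helper name `migrate_samplesheet_header` is hypothetical and not part of the pipeline:

```python
import csv
import io


def migrate_samplesheet_header(text: str) -> str:
    """Rename the legacy `sequence` column to `id`; data rows are left untouched."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return text
    # Only the header row changes.
    rows[0] = ["id" if col.strip() == "sequence" else col for col in rows[0]]
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


if __name__ == "__main__":
    old = "sequence,fasta\nT1024,T1024.fasta\n"
    print(migrate_samplesheet_header(old), end="")
```

A samplesheet that already uses `id` passes through unchanged, so the helper can be applied to a mixed set of files.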
2 changes: 1 addition & 1 deletion bin/generate_comparison_report.py
@@ -50,7 +50,7 @@ def generate_output(plddt_data, name, out_dir, generate_tsv, pdb):
linecolor="black",
gridcolor="WhiteSmoke",
),
legend=dict(y=0, x=1),
legend=dict(yanchor="bottom", y=0.02, xanchor="right", x=1, bordercolor="Black", borderwidth=1),
plot_bgcolor="white",
width=600,
height=600,
2 changes: 1 addition & 1 deletion bin/generate_report.py
@@ -120,7 +120,7 @@ def generate_output_images(msa_path, plddt_data, name, out_dir, in_type, generat
linecolor="black",
gridcolor="WhiteSmoke",
),
legend=dict(yanchor="bottom", y=0, xanchor="right", x=1.3),
legend=dict(yanchor="bottom", y=0.02, xanchor="right", x=1, bordercolor="Black", borderwidth=1),
plot_bgcolor="white",
width=600,
height=600,
32 changes: 25 additions & 7 deletions conf/modules_alphafold2.config
@@ -40,9 +40,18 @@ if (params.alphafold2_mode == 'standard') {
params.max_template_date ? "--max_template_date ${params.max_template_date}" : ''
].join(' ').trim()
publishDir = [
path: { "${params.outdir}/alphafold2/${params.alphafold2_mode}" },
mode: 'copy',
saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
[
path: { "${params.outdir}/alphafold2/${params.alphafold2_mode}" },
mode: 'copy',
saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
pattern: '*.*'
],
[
path: { "${params.outdir}/alphafold2/${params.alphafold2_mode}/top_ranked_structures" },
mode: 'copy',
saveAs: { "${meta.id}.pdb" },
pattern: '*_alphafold2.pdb'
]
]
}
}
@@ -54,7 +63,7 @@ if (params.alphafold2_mode == 'split_msa_prediction') {
withName: 'RUN_ALPHAFOLD2_MSA' {
ext.args = params.max_template_date ? "--max_template_date ${params.max_template_date}" : ''
publishDir = [
path: { "${params.outdir}/alphafold2/${params.alphafold2_mode}" },
path: { "${params.outdir}/alphafold2_${params.alphafold2_mode}" },
mode: 'copy',
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
@@ -64,9 +73,18 @@ if (params.alphafold2_mode == 'split_msa_prediction') {
if(params.use_gpu) { accelerator = 1 }
ext.args = params.use_gpu ? '--use_gpu_relax=true' : '--use_gpu_relax=false'
publishDir = [
path: { "${params.outdir}/alphafold2/${params.alphafold2_mode}" },
mode: 'copy',
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
[
path: { "${params.outdir}/alphafold2/${params.alphafold2_mode}" },
mode: 'copy',
saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
pattern: '*.*'
],
[
path: { "${params.outdir}/alphafold2/${params.alphafold2_mode}/top_ranked_structures" },
mode: 'copy',
saveAs: { "${meta.id}.pdb" },
pattern: '*_alphafold2.pdb'
]
]
}
}
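Note on the two-entry `publishDir` lists introduced above: the first entry publishes every file except `versions.yml` under its original name, and the second publishes the top-ranked PDB again under `top_ranked_structures/` renamed to the sample id. The Python sketch below only models that routing for illustration. It assumes each entry is evaluated independently, uses `fnmatch` as a stand-in for Nextflow's glob matching, and the function name `publish_targets` is hypothetical:

```python
from fnmatch import fnmatch


def publish_targets(filename, sample_id, base):
    """Return the paths a file would be published to under the two rules."""
    # Each rule mirrors one publishDir entry: (directory, pattern, saveAs).
    rules = [
        (base, "*.*", lambda f: None if f == "versions.yml" else f),
        (base + "/top_ranked_structures", "*_alphafold2.pdb",
         lambda f: sample_id + ".pdb"),
    ]
    targets = []
    for directory, pattern, save_as in rules:
        if not fnmatch(filename, pattern):
            continue
        name = save_as(filename)
        if name is not None:  # a null saveAs result means "do not publish"
            targets.append(directory + "/" + name)
    return targets


if __name__ == "__main__":
    base = "results/alphafold2/standard"
    for f in ("T1024_alphafold2.pdb", "T1024_plddt_mqc.tsv", "versions.yml"):
        print(f, "->", publish_targets(f, "T1024", base))
```

Under these assumptions a file such as `T1024_alphafold2.pdb` matches both patterns, so it appears once under its original name and once as `top_ranked_structures/T1024.pdb`.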
32 changes: 24 additions & 8 deletions conf/modules_colabfold.config
@@ -30,10 +30,18 @@ if (params.colabfold_server == 'webserver') {
params.host_url ? "--host-url ${params.host_url}" : ''
].join(' ').trim()
publishDir = [
path: { "${params.outdir}/colabfold/${params.colabfold_server}" },
mode: 'copy',
saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
pattern: '*.*'
[
path: { "${params.outdir}/colabfold/${params.colabfold_server}" },
mode: 'copy',
saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
pattern: '*.*'
],
[
path: { "${params.outdir}/colabfold/${params.colabfold_server}/top_ranked_structures" },
mode: 'copy',
saveAs: { "${meta.id}.pdb" },
pattern: '*_relaxed_rank_001*.pdb'
]
]
}
}
@@ -67,10 +75,18 @@ if (params.colabfold_server == 'local') {
params.use_templates ? '--templates' : ''
].join(' ').trim()
publishDir = [
path: { "${params.outdir}/colabfold/${params.colabfold_server}" },
mode: 'copy',
saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
pattern: '*.*'
[
path: { "${params.outdir}/colabfold/${params.colabfold_server}" },
mode: 'copy',
saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
pattern: '*.*'
],
[
path: { "${params.outdir}/colabfold/${params.colabfold_server}/top_ranked_structures" },
mode: 'copy',
saveAs: { "${meta.id}.pdb" },
pattern: '*_relaxed_rank_001*.pdb'
],
]
}
}
Expand Down
10 changes: 9 additions & 1 deletion conf/modules_esmfold.config
@@ -14,11 +14,19 @@ process {
withName: 'RUN_ESMFOLD' {
ext.args = {params.use_gpu ? '' : '--cpu-only'}
publishDir = [
path: { "${params.outdir}/esmfold" },
[
path: { "${params.outdir}/esmfold/default" },
mode: 'copy',
saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
pattern: '*.*'
],
[
path: { "${params.outdir}/esmfold/default/top_ranked_structures" },
mode: 'copy',
saveAs: { "${meta.id}.pdb" },
pattern: '*.pdb'
]
]
}

withName: 'NFCORE_PROTEINFOLD:ESMFOLD:MULTIQC' {
2 changes: 1 addition & 1 deletion conf/test.config
@@ -28,7 +28,7 @@ params {
// Input data to test alphafold2 analysis
mode = 'alphafold2'
alphafold2_mode = 'standard'
input = params.pipelines_testdata_base_path + 'proteinfold/testdata/samplesheet/v1.0/samplesheet.csv'
input = params.pipelines_testdata_base_path + 'proteinfold/testdata/samplesheet/v1.2/samplesheet.csv'
alphafold2_db = "${projectDir}/assets/dummy_db_dir"
}

2 changes: 1 addition & 1 deletion conf/test_alphafold_download.config
@@ -28,7 +28,7 @@ params {
// Input data to test alphafold2 analysis
mode = 'alphafold2'
alphafold2_mode = 'standard'
input = params.pipelines_testdata_base_path + 'proteinfold/testdata/samplesheet/v1.0/samplesheet.csv'
input = params.pipelines_testdata_base_path + 'proteinfold/testdata/samplesheet/v1.2/samplesheet.csv'
}

process {
2 changes: 1 addition & 1 deletion conf/test_alphafold_split.config
@@ -28,7 +28,7 @@ params {
// Input data to test alphafold2 splitting MSA from prediction analysis
mode = 'alphafold2'
alphafold2_mode = 'split_msa_prediction'
input = params.pipelines_testdata_base_path + 'proteinfold/testdata/samplesheet/v1.0/samplesheet.csv'
input = params.pipelines_testdata_base_path + 'proteinfold/testdata/samplesheet/v1.2/samplesheet.csv'
alphafold2_db = "${projectDir}/assets/dummy_db_dir"
}

2 changes: 1 addition & 1 deletion conf/test_colabfold_download.config
@@ -28,7 +28,7 @@ params {
// Input data to test colabfold analysis
mode = 'colabfold'
colabfold_server = 'webserver'
input = params.pipelines_testdata_base_path + 'proteinfold/testdata/samplesheet/v1.0/samplesheet.csv'
input = params.pipelines_testdata_base_path + 'proteinfold/testdata/samplesheet/v1.2/samplesheet.csv'
}

process {
2 changes: 1 addition & 1 deletion conf/test_colabfold_local.config
@@ -27,7 +27,7 @@ params {
mode = 'colabfold'
colabfold_server = 'local'
colabfold_db = "${projectDir}/assets/dummy_db_dir"
input = params.pipelines_testdata_base_path + 'proteinfold/testdata/samplesheet/v1.0/samplesheet.csv'
input = params.pipelines_testdata_base_path + 'proteinfold/testdata/samplesheet/v1.2/samplesheet.csv'
}

process {
2 changes: 1 addition & 1 deletion conf/test_colabfold_webserver.config
@@ -27,7 +27,7 @@ params {
mode = 'colabfold'
colabfold_server = 'webserver'
colabfold_db = "${projectDir}/assets/dummy_db_dir"
input = params.pipelines_testdata_base_path + 'proteinfold/testdata/samplesheet/v1.0/samplesheet.csv'
input = params.pipelines_testdata_base_path + 'proteinfold/testdata/samplesheet/v1.2/samplesheet.csv'
}

process {
2 changes: 1 addition & 1 deletion conf/test_esmfold.config
@@ -26,7 +26,7 @@ params {
// Input data to test esmfold
mode = 'esmfold'
esmfold_db = "${projectDir}/assets/dummy_db_dir"
input = params.pipelines_testdata_base_path + 'proteinfold/testdata/samplesheet/v1.0/samplesheet.csv'
input = params.pipelines_testdata_base_path + 'proteinfold/testdata/samplesheet/v1.2/samplesheet.csv'
}

process {
2 changes: 1 addition & 1 deletion conf/test_full.config
@@ -17,6 +17,6 @@ params {
// Input data for full test of alphafold standard mode
mode = 'alphafold2'
alphafold2_mode = 'standard'
input = params.pipelines_testdata_base_path + 'proteinfold/testdata/samplesheet/v1.0/samplesheet.csv'
input = params.pipelines_testdata_base_path + 'proteinfold/testdata/samplesheet/v1.2/samplesheet.csv'
alphafold2_db = 's3://proteinfold-dataset/test-data/db/alphafold_mini'
}
2 changes: 1 addition & 1 deletion conf/test_full_alphafold_multimer.config
@@ -18,6 +18,6 @@ params {
mode = 'alphafold2'
alphafold2_mode = 'standard'
alphafold2_model_preset = 'multimer'
input = params.pipelines_testdata_base_path + 'proteinfold/testdata/samplesheet/v1.0/samplesheet_multimer.csv'
input = params.pipelines_testdata_base_path + 'proteinfold/testdata/samplesheet/v1.2/samplesheet_multimer.csv'
alphafold2_db = 's3://proteinfold-dataset/test-data/db/alphafold_mini'
}
2 changes: 1 addition & 1 deletion conf/test_full_alphafold_split.config
@@ -17,6 +17,6 @@ params {
// Input data for full test of alphafold2 splitting MSA from prediction
mode = 'alphafold2'
alphafold2_mode = 'split_msa_prediction'
input = params.pipelines_testdata_base_path + 'proteinfold/testdata/samplesheet/v1.0/samplesheet.csv'
input = params.pipelines_testdata_base_path + 'proteinfold/testdata/samplesheet/v1.2/samplesheet.csv'
alphafold2_db = 's3://proteinfold-dataset/test-data/db/alphafold_mini'
}
2 changes: 1 addition & 1 deletion conf/test_full_colabfold_local.config
@@ -19,7 +19,7 @@ params {
mode = 'colabfold'
colabfold_server = 'local'
colabfold_model_preset = 'alphafold2_ptm'
input = params.pipelines_testdata_base_path + 'proteinfold/testdata/samplesheet/v1.0/samplesheet.csv'
input = params.pipelines_testdata_base_path + 'proteinfold/testdata/samplesheet/v1.2/samplesheet.csv'
colabfold_db = 's3://proteinfold-dataset/test-data/db/colabfold_mini'
}
process {
2 changes: 1 addition & 1 deletion conf/test_full_colabfold_webserver.config
@@ -18,6 +18,6 @@ params {
mode = 'colabfold'
colabfold_server = 'webserver'
colabfold_model_preset = 'alphafold2_ptm'
input = params.pipelines_testdata_base_path + 'proteinfold/testdata/samplesheet/v1.0/samplesheet.csv'
input = params.pipelines_testdata_base_path + 'proteinfold/testdata/samplesheet/v1.2/samplesheet.csv'
colabfold_db = 's3://proteinfold-dataset/test-data/db/colabfold_mini'
}
2 changes: 1 addition & 1 deletion conf/test_full_colabfold_webserver_multimer.config
@@ -18,6 +18,6 @@ params {
mode = 'colabfold'
colabfold_server = 'webserver'
colabfold_model_preset = 'alphafold2_multimer_v3'
input = params.pipelines_testdata_base_path + 'proteinfold/testdata/samplesheet/v1.0/samplesheet_multimer.csv'
input = params.pipelines_testdata_base_path + 'proteinfold/testdata/samplesheet/v1.2/samplesheet_multimer.csv'
colabfold_db = 's3://proteinfold-dataset/test-data/db/colabfold_mini'
}
2 changes: 1 addition & 1 deletion conf/test_full_esmfold.config
@@ -17,6 +17,6 @@ params {
// Input data for full test of esmfold monomer
mode = 'esmfold'
esmfold_model_preset = 'monomer'
input = params.pipelines_testdata_base_path + 'proteinfold/testdata/samplesheet/v1.0/samplesheet.csv'
input = params.pipelines_testdata_base_path + 'proteinfold/testdata/samplesheet/v1.2/samplesheet.csv'
esmfold_db = 's3://proteinfold-dataset/db/esmfold'
}
2 changes: 1 addition & 1 deletion conf/test_full_esmfold_multimer.config
@@ -17,6 +17,6 @@ params {
// Input data for full test of esmfold multimer
mode = 'esmfold'
esmfold_model_preset = 'multimer'
input = params.pipelines_testdata_base_path + 'proteinfold/testdata/samplesheet/v1.0/samplesheet_multimer.csv'
input = params.pipelines_testdata_base_path + 'proteinfold/testdata/samplesheet/v1.2/samplesheet_multimer.csv'
esmfold_db = 's3://proteinfold-dataset/test-data/db/esmfold'
}
38 changes: 38 additions & 0 deletions conf/test_split_fasta.config
@@ -0,0 +1,38 @@
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Nextflow config file for running minimal tests
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Defines input files and everything required to run a fast and simple pipeline test.
Use as follows:
nextflow run nf-core/proteinfold -profile test_split_fasta,<docker/singularity> --outdir <OUTDIR>
----------------------------------------------------------------------------------------
*/

stubRun = true

// Limit resources so that this can run on GitHub Actions
process {
resourceLimits = [
cpus: 4,
memory: '15.GB',
time: '1.h'
]
}

params {
config_profile_name = 'Test profile'
config_profile_description = 'Minimal test dataset to check pipeline function'

// Input data to test colabfold with a local server and split_fasta enabled
mode = 'colabfold'
colabfold_server = 'local'
split_fasta = true
colabfold_db = "${projectDir}/assets/dummy_db_dir"
input = params.pipelines_testdata_base_path + 'proteinfold/testdata/samplesheet/v1.2/samplesheet_multimer.csv'
}

process {
withName: 'MMSEQS_COLABFOLDSEARCH|COLABFOLD_BATCH' {
container = 'biocontainers/gawk:5.1.0'
}
}
15 changes: 7 additions & 8 deletions docs/output.md
@@ -23,10 +23,8 @@ The directories listed below will be created in the output directory after the p
<details markdown="1">
<summary>Output files</summary>

- `AlphaFold2/`
- `<SEQUENCE NAME>/` that contains the computed MSAs, unrelaxed structures, relaxed structures, ranked structures, raw model outputs, prediction metadata, and section timings
- `<SEQUENCE NAME>.alphafold.pdb` that is the structure with the highest pLDDT score (ranked first)
- `<SEQUENCE NAME>_plddt_mqc.tsv` that presents the pLDDT scores per residue for each of the 5 predicted models
- `alphafold2/standard/` or `alphafold2/split_msa_prediction/` based on the selected mode. It contains the computed MSAs, unrelaxed structures, relaxed structures, ranked structures, raw model outputs, prediction metadata, and section timings. Specifically, `<SEQUENCE NAME>_plddt_mqc.tsv` presents the pLDDT scores per residue for each of the 5 predicted models.
- `top_ranked_structures/<SEQUENCE NAME>.pdb` that is the structure with the highest pLDDT score per input (ranked first)
- `DBs/` that contains symbolic links to the downloaded database and parameter files

</details>
@@ -91,7 +89,8 @@ Below you can find an indicative example of the TSV file with the pLDDT scores p
<details markdown="1">
<summary>Output files</summary>

- `colabfold/webserver/` or `colabfold/local/` based on the selected mode that contains the computed MSAs, unrelaxed structures, relaxed structures, ranked structures, raw model outputs and scores, prediction metadata, logs and section timings
- `colabfold/webserver/` or `colabfold/local/` based on the selected mode. It contains the computed MSAs, unrelaxed structures, relaxed structures, ranked structures, raw model outputs, prediction metadata, and section timings. Specifically, `<SEQUENCE NAME>_plddt_mqc.tsv` presents the pLDDT scores per residue for each of the 5 predicted models.
- `top_ranked_structures/<SEQUENCE NAME>.pdb` that is the structure with the highest pLDDT score per input (ranked first)
- `DBs/` that contains symbolic links to the downloaded database and parameter files

</details>
@@ -115,9 +114,9 @@ Below you can find some indicative examples of the output images produced by Col
<details markdown="1">
<summary>Output files</summary>

- `esmfold/`
- `<SEQUENCE NAME>.pdb` that is the structure with the highest pLDDT score (ranked first)
- `<SEQUENCE NAME>_plddt_mqc.tsv` that presents the pLDDT scores per residue for each of the 5 predicted models
- `esmfold/default/` contains the predicted structures. Specifically, `<SEQUENCE NAME>_plddt_mqc.tsv` presents the pLDDT scores per residue for each of the predicted models.
- `top_ranked_structures/<SEQUENCE NAME>.pdb` that is the structure with the highest pLDDT score per input (ranked first)
- `DBs/` that contains symbolic links to the downloaded database and parameter files

</details>
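Note on the documented layout: after this change every mode places its best structure at `<outdir>/<tool>/<mode>/top_ranked_structures/<SEQUENCE NAME>.pdb` (for example `alphafold2/standard`, `colabfold/local`, `esmfold/default`). A minimal sketch of collecting those files from a results directory, assuming exactly that layout; the function name `top_ranked_structures` is hypothetical:

```python
from pathlib import Path


def top_ranked_structures(outdir):
    """Map '<tool>/<mode>' to {sequence name: PDB path} for one results directory."""
    found = {}
    for pdb in sorted(Path(outdir).glob("*/*/top_ranked_structures/*.pdb")):
        mode_dir = pdb.parent.parent  # e.g. <outdir>/esmfold/default
        key = mode_dir.parent.name + "/" + mode_dir.name
        found.setdefault(key, {})[pdb.stem] = pdb
    return found


if __name__ == "__main__":
    for mode, structures in top_ranked_structures("results").items():
        for name, path in structures.items():
            print(mode, name, path)
```

Because the per-sample file name is the same in every mode, the resulting mapping can be used to pair up predictions of one sequence across tools.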