Skip to content

Commit

Permalink
Merge branch 'main' into yashpatel-allow-empty-csv-fields
Browse files Browse the repository at this point in the history
  • Loading branch information
nwiltsie authored Jun 3, 2024
2 parents 9254f7a + bc3f67e commit 3b5f70f
Show file tree
Hide file tree
Showing 6 changed files with 162 additions and 26 deletions.
24 changes: 0 additions & 24 deletions .github/workflows/cicd-base.yaml

This file was deleted.

17 changes: 17 additions & 0 deletions .github/workflows/static-analysis.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
---
name: CI

on:
push:
branches:
- main
pull_request:
branches:
- main

jobs:
CICD-base:
runs-on: ubuntu-latest

steps:
- uses: uclahs-cds/tool-static-analysis@v1
3 changes: 3 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -14,14 +14,17 @@ This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.htm
- Functional test for `bam_parser.parse_bam_header()`.
- Dump parameters with `json_extractor.store_params_json()`
- Option to allow empty fields in a CSV to be parsed without error
- Function to save process logs, even after failure

### Fixed
- Fixed retry for potentially undefined variable `proc_name_keys`. #57
- Fix `schema.check_write_permission` when `null` is returned by `getParentFile` #59

### Changed
- Allow `set_resources_allocation` to use maximum allocation values defined in params
- Add `def`s to multiple variables that should not be globally defined
- Allow specific schema validation function to accept pre-loaded schema as input
- Add try-catch to schema validation to log the parameter being failed to validate.

---

Expand Down
69 changes: 69 additions & 0 deletions config/methods/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -151,6 +151,75 @@ methods {
}
```

### Capture process outputs, even after failures
`methods.setup_process_afterscript` is a drop-in replacement for this method of capturing process log files:

```Nextflow
// Old technique
process xxx {
publishDir path: "${params.log_output_dir}/process-log",
pattern: ".command.*",
mode: "copy",
saveAs: { "${task.process.replace(':', '/')}-${sample_id}/log${file(it).getName()}" }
output:
path(".command.*")
```

The above method does not produce any output files if the process fails. In order to always capture the log files, use the following:

```Nextflow
includeConfig "/path/to/common_methods.config"
...
methods {
...
methods.setup_process_afterscript()
}
```

> [!NOTE]
> `methods.setup_process_afterscript()` duplicates the effect of defining `publishDir` and `output: path(".command.*")` for every process. Although there is no harm in using both techniques simultaneously, for clarity it is recommended to only use one.
The output path is controlled by the following two [custom process directives](https://www.nextflow.io/docs/latest/process.html#ext):

| Name | Default Value |
| --- | --- |
| `process.ext.log_dir` | `task.process.replace(':', '/')` |
| `process.ext.log_dir_suffix` | `''` |

Each of the above can be overridden by individual processes.

The final directory path is `${params.log_output_dir}/process-log/${task.ext.log_dir}${task.ext.log_dir_suffix}`.

#### Disabling for individual processes

```Nextflow
process xxx {
ext capture_logs: false
```

#### Combining with `afterScript`

> [!IMPORTANT]
> This method sets `process.afterScript`. If you define another `afterScript` (either globally or for a specific process), you must use this technique to combine both scripts.
The `afterScript` closure established by `methods.setup_process_afterscript()` is also stored as `process.ext.commonAfterScript`. If a process requires another `afterScript`, it should be combined with the one defined here like so:

```Nextflow
process xxx {
afterScript {
[
"echo 'Before the common script'",
task.ext.commonAfterScript ?: "",
"echo 'After the common script'"
].join("\n")
}
```

Due to the [Elvis operator](https://groovy-lang.org/operators.html#_elvis_operator), the above snippet is safe to use even if `methods.setup_process_afterscript()` is not used.


## References
1. `nf-core` - https://nf-co.re/
2. `nf-code modules` - https://github.com/nf-core/sarek/blob/ad2b34f39fead34d7a09051e67506229e827e892/conf/modules.config
65 changes: 65 additions & 0 deletions config/methods/common_methods.config
Original file line number Diff line number Diff line change
Expand Up @@ -391,6 +391,71 @@ methods {
process.containerOptions = {-> (task.containsKey('cpus')) ? "--cpu-shares ${default_cpu_shares} --cpus ${task.cpus}" : "--cpu-shares ${default_cpu_shares}"}
}

/**
* Configure all processes to save their command files in the output
* directory.
*
* This add a custom process directive that, when used as the afterScript,
* will copy all of the process's .command.* files into the output
* directory.
* Processes can customize the output directory by setting
* `process.ext.log_dir` and `process.ext.log_dir_suffix`. Both may be
* closures.
*
* Inspired by https://github.com/nextflow-io/nextflow/issues/1166#issuecomment-502467562
*/
setup_process_afterscript = {
process.ext.log_dir = {
"${task.process.replace(':', '/')}"
}

process.ext.capture_logs = true

process.ext.commonAfterScript = {
if (!task.ext.capture_logs) {
return ""
}

process_log_dir = [
"${params.log_output_dir}",
"process-log",
"${task.ext.log_dir}${task.ext.log_dir_suffix ?: ''}"
].join("/")

// Handle relative paths
if (process_log_dir.substring(0, 1) != "/") {
process_log_dir = "${launchDir}/${process_log_dir}"
}

return """\
readonly LOG_DIR="${process_log_dir}"
mkdir -p "\${LOG_DIR}"
for filename in .command.*; do
[ -e "\${filename}" ] || continue
cp "\${filename}" "\${LOG_DIR}/log\${filename}"
done
""".stripIndent()
}

/*
Set the default afterScript. If individual processes override
afterScript, they can restore this functionality like so (this is safe
to include even if setup_process_afterscript is not called):
afterScript {
[
"echo 'Before the common'",
task.ext.commonAfterScript ?: "",
"echo 'After the common'"
].join("\n")
}
*/

process.afterScript = process.ext.commonAfterScript
}


/**
* Resolve the absolute path of a file relative to the current config file.
*/
Expand Down
10 changes: 8 additions & 2 deletions config/schema/schema.config
Original file line number Diff line number Diff line change
Expand Up @@ -56,6 +56,7 @@ schema {
*/
check_path = { String p, String mode ->
def File file = new File(p)
file = file.getAbsoluteFile()
if (mode == 'w') {
schema.check_write_permission(file)
return
Expand Down Expand Up @@ -92,7 +93,7 @@ schema {
/**
* Check whether the required properties are defined for the corresponding parameter from the
* schema yaml file.
*
*
* @throws IllegalArgumentException when config file is missing required properties.
*/
check_required_properties = { String name, Map properties ->
Expand Down Expand Up @@ -247,7 +248,12 @@ schema {
validate = { String file_path="${projectDir}/config/schema.yaml" ->
def params_schema = schema.load_schema(file_path)
params_schema.each { key, val ->
schema.validate_parameter(params, key, val)
try {
schema.validate_parameter(params, key, val)
} catch (Exception ex) {
System.out.println "Failed to validate parameter key: ${val}"
throw ex
}
}
}

Expand Down

0 comments on commit 3b5f70f

Please sign in to comment.