
Releases: openpipelines-bio/openpipeline

OpenPipelines.bio v1.0.4

17 Dec 13:24
18cc6c8

BUG FIXES

  • scvi_leiden workflow: fix the input layer argument of the workflow not being passed to the scVI component (PR #939, backported from PR #936 and PR #938).

OpenPipelines.bio v2.0.0

17 Dec 13:28
60028cb

BREAKING CHANGES

  • velocity/scvelo: update scvelo to 0.3.3, which also removes support for using loom input files. The component now uses a MuData object as input. Several arguments were added to support selecting different inputs from the MuData file: counts_layer, modality, layer_spliced, layer_unspliced, layer_ambiguous. An output_h5mu argument has been added (PR #932).

  • src/annotate/onclass and src/annotate/celltypist: the input parameters for the gene name layers of the input datasets have been renamed to --input_var_gene_names and --reference_var_gene_names (PR #919).

  • Several components under src/scgpt (cross_check_genes, tokenize_pad, binning) now process the input (query) datasets differently. Instead of subsetting datasets based on genes in the model vocabulary and/or highly variable genes, these components require an input .var column with a boolean mask specifying this information. The results are written back to the original input data, preserving the dataset structure (PR #832).

  • query/cellxgene_census: The default output layer has been changed from .layers["counts"] to .X to be more aligned with the standard OpenPipelines format (PR #933).
    Use argument --output_layer_counts counts to revert the behaviour to the previous default.

  • Added cell multiplexing support to the from_cellranger_multi_to_h5mu component and the cellranger_multi workflow. For the from_cellranger_multi_to_h5mu component, the output argument now requires a value containing a wildcard character *, which will be replaced by the sample ID to form the final output file names. Additionally, a sample_csv argument is added to the from_cellranger_multi_to_h5mu component, which describes the sample name per output file. No change is required for the output_h5mu argument of the cellranger_multi workflow; in case of a multiplexed run, the workflow emits multiple events, one for each sample. The id of the events (and default output file names) are set by --sample_ids (in case of cell multiplexing), or (as before) by the user-provided id for the input (PR #803 and PR #902).

  • demux/bcl_convert: update BCL convert from 3.10 to 4.2 (PR #774).

  • demux/cellranger_mkfastq, mapping/cellranger_count, mapping/cellranger_multi and reference/build_cellranger_reference: update cellranger to 8.0.1 (PR #774 and PR #811).

  • Removed --disable_library_compatibility_check in favour of --check_library_compatibility to the mapping/cellranger_multi component and the ingestion/cellranger_multi workflow (PR #818).

  • lianapy: bumped version to 1.3.0 (PR #827 and PR #862). Additionally, groupby is now a required argument.

  • concat: this component was deprecated and has now been removed, use concatenate_h5mu instead (PR #796).

  • The workflows folder in the root of the project no longer contains symbolic links to the built workflows in target.
    Using any workflow that was previously linked in this directory will now result in an error which will indicate
    the location of the workflow to be used instead (PR #796).

  • XGBoost: bump version to 2.0.3 (PR #646).

  • Several components: update anndata to 0.11.1 and mudata to 0.3.1 (PR #645 and PR #901), and scanpy to 1.10.4 (PR #901).

  • filter/filter_with_hvg: this component was deprecated and has now been removed. Use feature_annotation/highly_variable_features_scanpy instead (PR #843).

  • dataflow/concat: this component was deprecated and has now been removed. Use dataflow/concatenate_h5mu instead (PR #857).

  • convert/from_h5mu_to_seurat: bump seurat to latest version (PR #850).

  • workflows/ingestion/bd_rhapsody: Upgrade BD Rhapsody 1.x to 2.x, thereby changing the interface of the workflow (PR #846).

  • mapping/bd_rhapsody: Upgrade BD Rhapsody 1.x to 2.x, thereby changing the interface of the workflow (PR #846).

  • reference/make_bdrhap_reference: Upgrade BD Rhapsody 1.x to 2.x, thereby changing the interface of the workflow (PR #846).

  • reference/build_star_reference: Rename mapping/star_build_reference to reference/build_star_reference (PR #846).

  • reference/cellranger_mkgtf: Rename reference/mkgtf to reference/cellranger_mkgtf (PR #846).

  • labels_transfer/xgboost: Align interface with new annotation workflow

    • Store label probabilities instead of uncertainties
    • Take .h5mu format as an input instead of .h5ad
  • reference/build_cellranger_arc_reference: a default value of "output" is now specified for the argument --genome, in line with the reference/build_cellranger_reference component. Additionally, providing a value for --organism is no longer required and its default value of Homo Sapiens has been removed (PR #864).
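
As a rough illustration of the wildcard behaviour described above for the output argument of from_cellranger_multi_to_h5mu, the sketch below expands a template containing * into one file name per sample ID. The function name expand_output_template, the example paths and the sample IDs are hypothetical, and the check for exactly one wildcard is an assumption; none of this is taken from the component's source.

```python
def expand_output_template(template: str, sample_ids: list[str]) -> dict[str, str]:
    """Map each sample ID to an output path by substituting the '*' wildcard.

    Hypothetical helper: assumes the template holds exactly one wildcard.
    """
    if template.count("*") != 1:
        raise ValueError("expected exactly one '*' wildcard in the output template")
    return {sample_id: template.replace("*", sample_id) for sample_id in sample_ids}


# Illustrative values only; real sample IDs come from the multiplexed run.
outputs = expand_output_template("results/*.h5mu", ["sample_a", "sample_b"])
for sample_id, path in outputs.items():
    print(sample_id, "->", path)
```

A template without a wildcard would map every sample to the same file, which is presumably why the component now requires one.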

NEW FUNCTIONALITY

Important

Workflows from the workflows/annotation and workflows/integration/scgpt_leiden namespaces, plus their newly implemented dependencies, are not yet considered to be part of the stable public API. Their functionality and interface may be subject to change.

  • velocyto_to_h5mu: now writes counts to .X (PR #932)

  • qc/calculate_atac_qc_metrics: new component for calculating ATAC QC metrics (PR #868).

  • workflows/annotation/scgpt_integration_knn workflow: Cell-type annotation based on scGPT integration with KNN label transfer (PR #875).

  • CI: Use params.resources_test in test workflows in order to point to an alternative location (e.g. a cache) (PR #889).

  • Added demux/cellranger_atac_mkfastq component: demultiplex raw sequencing data for ATAC experiments (PR #726).

  • process_samples, process_batches and rna_multisample workflows: added functionality to scale the log-normalized
    gene expression data to unit variance and zero mean. The scaled data will be output to a different layer and the
    representation with reduced dimensions will be created and stored in addition to the non-scaled data (PR #733).

  • transform/scaling: add --input_layer and --output_layer arguments (PR #733).

  • CI: added checking of mudata contents for multiple workflows (PR #783).

  • Added multiple arguments to the cellranger_multi workflow in order to maintain feature parity with the mapping/cellranger_multi component (PR #803).

  • convert/from_cellranger_to_h5mu: add support for antigen analysis.

  • Added reference/build_cellranger_reference component: build reference file compatible with ATAC and ATAC+GEX experiments (PR #726).

  • demux/bcl_convert: add support for no lane splitting (PR #804).

  • reference/cellranger_mkgtf component: Added cellranger mkgtf as a standalone component (PR #771).

  • scgpt/cross_check_genes component: Added a gene-model cross check component for scGPT (PR #758).

  • scgpt/embedding component: Added scGPT embedding component (PR #761).

  • scgpt/tokenize_pad component: Added scGPT padding and tokenization component (PR #754).

  • scgpt/binning component: Added a scGPT pre-processing binning component (PR #765).

  • workflows/integration/scgpt_leiden workflow with scGPT integration followed by Leiden clustering (PR #794).

  • scgpt/cell_type_annotation component: Added scGPT cell type annotation component (PR #798).

  • resources_test_scripts/scGPT.sh: Added script to include scGPT test resources (PR #800).

  • transform/clr component: Added the option to set the axis along which to apply CLR. Possible to override
    on workflow level as well (PR #767).

  • annotate/celltypist component: Added a CellTypist annotation component (PR #825).

  • dataflow/split_h5mu component: Added a component to split a single h5mu file into multiple h5mu files based on the values of an .obs column (PR #824).

  • workflows/test_workflows/ingestion components & workflows/ingestion: Added standalone components for integration testing of ingestion workflows (PR #801).

  • workflows/ingestion/make_reference: Add additional arguments passed through to the STAR and BD Rhapsody reference components (PR #846).

  • annotate/random_forest_annotation component: Added a random forest cell type annotation component (PR #848).

  • dataflow/concatenate_h5mu: data from .uns, both originating from the global and per-modality slots, is now retained in the final concatenated output object. Additionally, added the uns_merge_mode argument in order to tune the behavior when conflicting keys are detected across samples (PR #859).

  • dimred/densmap component: Added a densMAP dimensionality reduction component (PR #748).

  • annotate/scanvi component: Added a component to annotate cells using scANVI (PR #833).

  • transform/bpcells_regress_out component: Added a component to regress out effects of confounding variables in the count matrix using BPCells (PR #863).

  • transform/regress_out: Allow providing 'input' and 'output' layers for scanpy regress_out functionality (PR #863).

  • workflows/ingestion/make_reference: add possibility to build CellRanger ARC references. Added --motifs_file, --non_nuclear_contigs and --output_cellranger_arc arguments (PR #864).

  • Test resources (reference_gencodev41_chr1): switch reference genome for CellRanger to ARC variant (PR #864).

  • Added transform/tfidf component: normalize ATAC data with TF-IDF (PR #870).

  • Added dimred/lsi component (PR #552).

  • metadata/duplicate_obs component: Added a component to make a copy from one .obs field or index to another .obs field within...


OpenPipelines.bio v2.0.0-rc.2

17 Dec 13:00
eed65a5
Pre-release

BUG FIXES

  • annotate/popv: fix popv raising ValueError when an accelerator (e.g. GPU) is unavailable (PR #915).

MINOR CHANGES

  • dataflow/split_h5mu: Optimize resource usage of the component (PR #913).

OpenPipelines.bio v2.0.0-rc.1

17 Dec 13:00
Pre-release

BREAKING CHANGES

  • Added cell multiplexing support to the from_cellranger_multi_to_h5mu component and the cellranger_multi workflow. For the from_cellranger_multi_to_h5mu component, the output argument now requires a value containing a wildcard character *, which will be replaced by the sample ID to form the final output file names. Additionally, a sample_csv argument is added to the from_cellranger_multi_to_h5mu component, which describes the sample name per output file. No change is required for the output_h5mu argument of the cellranger_multi workflow; in case of a multiplexed run, the workflow emits multiple events, one for each sample. The id of the events (and default output file names) are set by --sample_ids (in case of cell multiplexing), or (as before) by the user-provided id for the input (PR #803 and PR #902).

  • demux/bcl_convert: update BCL convert from 3.10 to 4.2 (PR #774).

  • demux/cellranger_mkfastq, mapping/cellranger_count, mapping/cellranger_multi and reference/build_cellranger_reference: update cellranger to 8.0.1 (PR #774 and PR #811).

  • Removed --disable_library_compatibility_check in favour of --check_library_compatibility to the mapping/cellranger_multi component and the ingestion/cellranger_multi workflow (PR #818).

  • lianapy: bumped version to 1.3.0 (PR #827 and PR #862). Additionally, groupby is now a required argument.

  • concat: this component was deprecated and has now been removed, use concatenate_h5mu instead (PR #796).

  • The workflows folder in the root of the project no longer contains symbolic links to the built workflows in target.
    Using any workflow that was previously linked in this directory will now result in an error which will indicate
    the location of the workflow to be used instead (PR #796).

  • XGBoost: bump version to 2.0.3 (PR #646).

  • Several components: update anndata to 0.11.1 and mudata to 0.3.1 (PR #645 and PR #901), and scanpy to 1.10.4 (PR #901).

  • filter/filter_with_hvg: this component was deprecated and has now been removed. Use feature_annotation/highly_variable_features_scanpy instead (PR #843).

  • dataflow/concat: this component was deprecated and has now been removed. Use dataflow/concatenate_h5mu instead (PR #857).

  • convert/from_h5mu_to_seurat: bump seurat to latest version (PR #850).

  • workflows/ingestion/bd_rhapsody: Upgrade BD Rhapsody 1.x to 2.x, thereby changing the interface of the workflow (PR #846).

  • mapping/bd_rhapsody: Upgrade BD Rhapsody 1.x to 2.x, thereby changing the interface of the workflow (PR #846).

  • reference/make_bdrhap_reference: Upgrade BD Rhapsody 1.x to 2.x, thereby changing the interface of the workflow (PR #846).

  • reference/build_star_reference: Rename mapping/star_build_reference to reference/build_star_reference (PR #846).

  • reference/cellranger_mkgtf: Rename reference/mkgtf to reference/cellranger_mkgtf (PR #846).

  • labels_transfer/xgboost: Align interface with new annotation workflow

    • Store label probabilities instead of uncertainties
    • Take .h5mu format as an input instead of .h5ad
  • reference/build_cellranger_arc_reference: a default value of "output" is now specified for the argument --genome, in line with the reference/build_cellranger_reference component. Additionally, providing a value for --organism is no longer required and its default value of Homo Sapiens has been removed (PR #864).

MAJOR CHANGES

  • Bump popv to 0.4.2 (PR #901)

NEW FUNCTIONALITY

  • Added demux/cellranger_atac_mkfastq component: demultiplex raw sequencing data for ATAC experiments (PR #726).

  • process_samples, process_batches and rna_multisample workflows: added functionality to scale the log-normalized
    gene expression data to unit variance and zero mean. The scaled data will be output to a different layer and the
    representation with reduced dimensions will be created and stored in addition to the non-scaled data (PR #733).

  • transform/scaling: add --input_layer and --output_layer arguments (PR #733).

  • CI: added checking of mudata contents for multiple workflows (PR #783).

  • Added multiple arguments to the cellranger_multi workflow in order to maintain feature parity with the mapping/cellranger_multi component (PR #803).

  • convert/from_cellranger_to_h5mu: add support for antigen analysis.

  • Added reference/build_cellranger_reference component: build reference file compatible with ATAC and ATAC+GEX experiments (PR #726).

  • demux/bcl_convert: add support for no lane splitting (PR #804).

  • reference/cellranger_mkgtf component: Added cellranger mkgtf as a standalone component (PR #771).

  • scgpt/cross_check_genes component: Added a gene-model cross check component for scGPT (PR #758).

  • scgpt/embedding component: Added scGPT embedding component (PR #761).

  • scgpt/tokenize_pad component: Added scGPT padding and tokenization component (PR #754).

  • scgpt/binning component: Added a scGPT pre-processing binning component (PR #765).

  • workflows/integration/scgpt_leiden workflow with scGPT integration followed by Leiden clustering (PR #794).

  • scgpt/cell_type_annotation component: Added scGPT cell type annotation component (PR #798).

  • resources_test_scripts/scGPT.sh: Added script to include scGPT test resources (PR #800).

  • transform/clr component: Added the option to set the axis along which to apply CLR. Possible to override
    on workflow level as well (PR #767).

  • annotate/celltypist component: Added a CellTypist annotation component (PR #825).

  • dataflow/split_h5mu component: Added a component to split a single h5mu file into multiple h5mu files based on the values of an .obs column (PR #824).

  • workflows/test_workflows/ingestion components & workflows/ingestion: Added standalone components for integration testing of ingestion workflows (PR #801).

  • workflows/ingestion/make_reference: Add additional arguments passed through to the STAR and BD Rhapsody reference components (PR #846).

  • annotate/random_forest_annotation component: Added a random forest cell type annotation component (PR #848).

  • dataflow/concatenate_h5mu: data from .uns, both originating from the global and per-modality slots, is now retained in the final concatenated output object. Additionally, added the uns_merge_mode argument in order to tune the behavior when conflicting keys are detected across samples (PR #859).

  • dimred/densmap component: Added a densMAP dimensionality reduction component (PR #748).

  • annotate/scanvi component: Added a component to annotate cells using scANVI (PR #833).

  • transform/bpcells_regress_out component: Added a component to regress out effects of confounding variables in the count matrix using BPCells (PR #863).

  • transform/regress_out: Allow providing 'input' and 'output' layers for scanpy regress_out functionality (PR #863).

  • workflows/ingestion/make_reference: add possibility to build CellRanger ARC references. Added --motifs_file, --non_nuclear_contigs and --output_cellranger_arc arguments (PR #864).

  • Test resources (reference_gencodev41_chr1): switch reference genome for CellRanger to ARC variant (PR #864).

  • Added transform/tfidf component: normalize ATAC data with TF-IDF (PR #870).

  • Added dimred/lsi component (PR #552).

  • metadata/duplicate_obs component: Added a component to make a copy from one .obs field or index to another .obs field within the same MuData object (PR #874, PR #899).

  • annotate/onclass component: Added a component to annotate cell types using OnClass (PR #844).

  • annotate/svm component: Added a component to annotate cell types using support vector machine (SVM) (PR #845).

  • metadata/duplicate_var component: Added a component to make a copy from one .var field or index to another .var field within the same MuData object (PR #877, PR #899).

  • filter/subset_obsp component: Added a component to subset an .obsp matrix by column based on the value of an .obs field. The resulting subset is moved to an .obsm field (PR #888).

  • labels_transfer/knn component: Enable using additional distance functions for KNN classification (PR #830) and allow to perform KNN classification based on a pre-calculated neighborhood graph (PR #890).
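
To make the idea of KNN classification on a pre-calculated neighborhood graph more concrete, here is a minimal sketch that turns the reference labels of each query cell's neighbours into vote fractions and picks the most frequent label. It assumes unweighted (uniform) voting and a plain list of neighbour indices per query cell; the function name knn_label_probabilities and the toy data are hypothetical, and the actual component may weight votes or store results differently.

```python
from collections import Counter


def knn_label_probabilities(neighbor_indices, reference_labels):
    """For each query cell, convert its neighbours' labels into vote fractions.

    Assumes neighbor_indices[i] lists positions in reference_labels of the
    precomputed nearest neighbours of query cell i (uniform voting).
    """
    results = []
    for neighbors in neighbor_indices:
        votes = Counter(reference_labels[i] for i in neighbors)
        total = sum(votes.values())
        results.append({label: count / total for label, count in votes.items()})
    return results


# Toy reference annotation and a precomputed 3-nearest-neighbour graph.
reference_labels = ["T cell", "T cell", "B cell", "B cell", "NK cell"]
neighbor_indices = [[0, 1, 2], [2, 3, 4]]

probabilities = knn_label_probabilities(neighbor_indices, reference_labels)
predicted = [max(p, key=p.get) for p in probabilities]
print(predicted)
```

Because the neighbours are supplied as input, no distance computation happens here; only the voting step is shown.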

MINOR CHANGES

  • Several components: bump python version (PR #901).

  • resources_test_scripts/cellranger_atac_tiny_bcl.sh script: generate counts from fastq files using CellRanger atac count (PR #726).

  • cellbender_remove_background_v0_2: update base image to nvcr.io/nvidia/pytorch:23.12-py3 (PR #646).

  • Bump scvelo to 0.3.2 (PR #828).

  • Pin numpy<2 for several components (PR #815).

  • Added resources_test_scripts/cellranger_atac_tiny_bcl.sh script: download tiny bcl file with an ATAC experiment, download a motifs file, demultiplex bcl files to reads in fastq format (PR #726).

  • mapping/cellranger_multi component now outputs logs on failure of the cellranger multi process (PR #766).

  • Bump viash-actions to v6 (PR #821).

  • reference/make_reference: Do not try to extract genome fasta and transcriptome gtf if they are not gzipped (PR #856).

  • Changes related to syncing the test resources (PR #867):

    • Add .info.test_resources to _viash.yaml to specify where test resources need to be synced from.
    • download/sync_test_resources: Use `.inf...

OpenPipelines.bio v1.0.3

09 Aug 11:28
12b273e

BUG FIXES

  • qc/calculate_qc_metrics: increase total counts accuracy with low precision floating dtypes as input layer (PR #853, backported from PR #852).
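
The sketch below shows one way in which summing counts in a low-precision floating point type can lose accuracy; it is an illustration of the general problem, not a description of how the component was fixed. Half precision is emulated with the standard library's struct module (format character "e"), and the helper name to_float16 and the toy counts are hypothetical.

```python
import struct


def to_float16(value: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", value))[0]


counts = [1.0] * 4096  # toy cell: 4096 features with a count of 1 each

# Accumulate in half precision: every partial sum is rounded back to float16.
low_precision_total = 0.0
for count in counts:
    low_precision_total = to_float16(low_precision_total + count)

# Accumulate in double precision (Python's native float).
high_precision_total = sum(counts)

print(low_precision_total, high_precision_total)
```

In this toy case the half-precision running total stalls at 2048, because 2049 is not representable in float16, while the double-precision total reaches the expected 4096.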

OpenPipelines.bio v0.12.7

09 Aug 14:03
f0e836d

BUG FIXES

  • qc/calculate_qc_metrics: increase total counts accuracy with low precision floating dtypes as input layer (PR #854, backported from PR #852).

OpenPipelines.bio v1.0.2

22 Jul 08:17
2963adb

BUG FIXES

  • dataflow/concatenate_h5mu: fix writing out multidimensional annotation dataframes (e.g. .varm) that had their
    data dtype (dtype) changed as a result of adding more observations after concatenation, causing TypeError.
    One notable example of this happening is when one of the samples does not have a multimodal annotation dataframe
    which is present in another sample, causing the values to be filled with NA (PR #842, backported from PR #837).

OpenPipelines.bio v1.0.1

19 Jun 12:15
0ead6c6

BUG FIXES

  • Bump viash to 0.8.6 (PR #816, backported from PR #815). This changes the at-runtime generated nextflow process from an in-memory to an on-disk temporary file, which should cause fewer issues with Nextflow Fusion.

OpenPipelines.bio v1.0.0

12 Jun 12:14

BREAKING CHANGES

  • query/cellxgene_census: Refactored the interface, documentation and internal workings of this component (PR #621).

    • Renamed arguments to align with standard OpenPipelines notations and cellxgene census API:
      • --input_database became --input_uri
      • --cellxgene_release became --census_version
      • --cell_query became --obs_value_filter
      • --cells_filter_columns became --cell_filter_grouping
      • --min_cells_filter_columns became --cell_filter_minimum_count
      • --modality became --output_modality
      • Removed --dataset_id since it was no longer being used.
      • Added --add_dataset_meta to add metadata to the output MuData object.
    • Documentation of the component and its arguments was improved.
  • Docker image names now use / instead of _ between the name of the component and the namespace (PR #712).

  • Change separator for arguments with multiple inputs from : to ; (PR #700 and #707). Now, all arguments with multiple: true will use ; as the separator.
    This change was made to be able to deal with file paths that contain :, e.g. s3://my-bucket/my:file.txt. Furthermore, the ; separator will become
    the default separator for all arguments with multiple: true in Viash >= 0.9.0.

  • This project now uses viash version 0.8.4 to build components and workflows. Changes related to this version update should
    be mostly backwards compatible with respect to the results and execution of the pipelines. From a development perspective,
    drastic updates have been made to the development workflow.

    Development related changes:

    • Bump viash version to 0.8.4 (PR #598, PR #638, #697 and #706) in the project configuration.
    • All pipelines no longer use the anonymous workflow. Instead, these workflows were given
      a name which was added to the viash config as the entrypoint to the pipeline (PR #598).
    • Removed the workflows folder and moved its contents to new locations:
      1. The resources_test_scripts folder now resides in the root of the project (PR #605).

      2. All workflows have been moved to the src/workflows folder (PR #605).
        This implies that workflows must now be built using viash (ns) build, just like with components.

      3. Adjust GitHub Actions to account for new workflow paths (PR #605).

      4. In order to be backwards compatible, the workflows folder now contains symbolic
        links to the built workflows in target. This is not a problem when using the repository for pipeline
        execution. However, if a developer wishes to contribute to the project, symlink support should be enabled
        in git using git config core.symlinks true. Alternatively, use
        git clone -c core.symlinks=true git@github.com:openpipelines-bio/openpipeline.git when cloning the
        repository. This avoids the symlinks being resolved (PR #628).
        4bis. With PR #668, the workflows have been renamed. This does not hamper the backwards compatibility
        of the symlinks that have been described in 4, because they still use the original location
        which includes the original name.
        * multiomics/rna_singlesample has been renamed to rna/process_single_sample,
        * multiomics/rna_multisample has been renamed to rna/rna_multisample,
        * multiomics/prot_multisample became prot/prot_multisample,
        * multiomics/prot_singlesample became prot/prot_singlesample,
        * multiomics/full_pipeline was moved to multiomics/process_samples,
        * multiomics/multisample has been renamed to multiomics/process_batches,
        * multiomics/integration/initialize_integration changed to multiomics/dimensionality_reduction,
        * finally, all workflows at multiomics/integration/* were moved to integration/*

      5. Removed the workflows/utils folder. Functionality that was provided by the DataflowHelper
        and WorkflowHelper is now being provided by viash when the workflow is being built (PR #605).

    End-user facing changes:

    • The concat component has been deprecated and will be removed in a future release.
      Its functionality has been copied to the concatenate_h5mu component because the name is in
      conflict with the concat operator from nextflow (PR #598).
    • prot_singlesample, rna_singlesample, prot_multisample and rna_multisample: QC statistics
      are now only calculated once where needed. This means that the mitochondrial gene detection is
      performed in the rna_singlesample pipeline and the other count based statistics are calculated
      during the prot_multisample and rna_multisample pipelines. In both cases, the qc pipeline
      is being used, but only parts of that workflow are activated by parametrization. Previously
      the count based statistics were calculated in both the singlesample and multisample pipelines,
      with the results from the multisample pipelines overwriting the previous results. What is breaking here
      is that the qc statistics are not being added to the results of the singlesample workflows.
      This is not an issue when using the full_pipeline because in this case the singlesample and
      multisample workflows are executed in tandem. If you wish to execute the singlesample workflows
      separately and still include count based statistics, please run the qc pipeline
      on the result of the singlesample workflow (PR #604).
    • filter/filter_with_hvg has been renamed to feature_annotation/highly_variable_features_scanpy, along with the following changes (PR #667).
      • --do_filter was removed
      • --n_top_genes has been renamed to --n_top_features
    • full_pipeline, multisample and rna_multisample: Renamed arguments (PR #667).
      • --filter_with_hvg_var_output became --highly_variable_features_var_output
      • --filter_with_hvg_obs_batch_key became --highly_variable_features_obs_batch_key
    • rna_multisample: Renamed arguments (PR #667).
      • --filter_with_hvg_n_top_genes became --highly_variable_features_n_top_features
      • --filter_with_hvg_flavor became --highly_variable_features_flavor
  • Renamed obsm_metrics to uns_metrics for the cellranger_mapping workflow because the cellranger metrics are stored in .uns and not .obsm (PR #610).

  • mapping/cellranger_mkfastq: update from cellranger 6.0.2 to 7.0.1 (PR #675)
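
The motivation given above for switching the separator of multiple: true arguments from : to ; can be seen in a small sketch. The helper split_multiple and the second file path are hypothetical; the S3 path is the example from the notes above. This only mimics the splitting step and is not Viash's own parsing code.

```python
def split_multiple(value: str, separator: str = ";") -> list[str]:
    """Split a multi-value argument string on the given separator, dropping empty items."""
    return [item for item in value.split(separator) if item]


# Two inputs, one of which contains ':' in its path.
raw = "s3://my-bucket/my:file.txt;data/sample_2.h5mu"

new_style = split_multiple(raw, ";")  # two intact paths
old_style = split_multiple(raw, ":")  # the S3 URI is torn apart

print(new_style)
print(old_style)
```

With ; as the separator, both paths survive intact, whereas splitting on : breaks the URI at both the scheme and the file name.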

BUG FIXES

  • mapping/cellranger_multi: Fix the regex for the fastq input files to allow dropping the lane from the input file names (e.g. _L001) (PR #778).

  • workflows/rna/rna_singlesample: Fix argument passing top_n_vars and obs_name_mitochondrial_fraction to the qc subworkflow (PR #779).

  • rna_singlesample: fixed a bug where selecting the column for the filtering with mitochondrial fractions
    using obs_name_mitochondrial_fraction was done with the wrong column name, causing ValueError (PR #743).

  • Fix publishing in process_samples and process_batches (PR #759).

  • Cellranger multi: Fix using a relative input path for --vdj_inner_enrichment_primers (PR #717)

  • dataflow/split_modalities: remove unused compression argument. Use output_compression instead (PR #714).

  • metadata/grep_annotation_column: fix calculating fraction when an input observation has no counts, which caused
    the result to be out of bounds.

  • Fix --output argument not working for several workflows (PR #740).

  • transform/log1p: fix --input_layer argument not functioning (PR #678).

  • dataflow/concat and dataflow/concatenate_h5mu: Fix an issue where using --mode move on samples with non-overlapping features would cause var_names to become unaligned to the data (PR #653).

  • filter/filter_with_scrublet: (Testing) Fix duplicate test function names (PR #641).

  • dataflow/concatenate_h5mu and dataflow/concat: Fix TypeError when using mode 'move' and a column with conflicting metadata does not exist across all samples (PR #631).

  • dataflow/concatenate_h5mu and dataflow/concat: Fix an issue where joining columns with different datatypes caused TypeError (PR #619).

  • qc/calculate_qc_metrics: Resolved an issue where statistics based on the input columns selected with --var_qc_metrics were incorrect when these input columns were encoded in pd.BooleanDtype() (PR #685).

  • move_obsm_to_obs: fix setting output columns when they already exist (PR #690).

  • workflows/dimensionality_reduction workflow: nearest neighbour calculations no longer recalculate the PCA when obm_input --obsm_pca is not set to X_pca.

  • feature_annotation/highly_variable_scanpy: fix .X being used to remove observations with 0 counts when --layer has been specified.

  • filter/filter_with_counts: fix --layer argument not being used.

  • transform/normalize_total: fix incorrect layer being written to the output when the input layer is not .X.

  • src/workflows/qc: fix input layer not being taken into account when calculating the fraction of mitochondrial genes (always used .X).

  • convert/from_cellranger_multi_to_h5mu: fix metric values not represented as percentages being divided by 100 (PR #704).

NEW FUNCTIONALITY

  • dimred/tsne component: Added a tSNE dimensionality reduction component (PR #742).

  • multisample pipeline: This workflow now works when provided multiple unimodal files or multiple multimodal files, in addition to the previously supported single multimodal file (PR #606). The modalities are processed independently from each other:

    • As before, a single multimodal file is split into several unimodal MuData objects, e...

OpenPipelines.bio v1.0.0-rc6

10 Jun 11:03
Pre-release

BUG FIXES

  • dataflow/concatenate_h5mu: fix regression bug where observations are no longer linked to the correct metadata
    after concatenation (PR #807)