Standardize conversion workflow #1270

GitHub Actions / JUnit Test Report failed Oct 4, 2024 in 0s

1 tests run, 0 passed, 0 skipped, 1 failed.

Annotations

Check failure on line 1 in nf

@github-actions github-actions / JUnit Test Report

nf.test-dataset_cellrangermulti_aligner

Assertion failed: 

23 of 27 assertions failed
Raw output
Nextflow stdout:

ERROR ~ Error executing process > 'NFCORE_SCRNASEQ:SCRNASEQ:MTX_CONVERSION:MTX_TO_H5AD (PBMC_10K)'

Caused by:
  Process `NFCORE_SCRNASEQ:SCRNASEQ:MTX_CONVERSION:MTX_TO_H5AD (PBMC_10K)` terminated with an error exit status (1)


Command executed [/home/runner/work/scrnaseq/scrnaseq/./workflows/../subworkflows/local/../../modules/local/templates/mtx_to_h5ad_cellranger.py]:

  #!/usr/bin/env python
  
  # Set numba cache dir to current working directory (which is also a writable mount in containers)
  import os
  
  os.environ["NUMBA_CACHE_DIR"] = "."
  
  import scanpy as sc
  import pandas as pd
  import argparse
  from anndata import AnnData
  import platform
  
  def _mtx_to_adata(
      input: str,
      sample: str,
  ):
  
      adata = sc.read_10x_h5(input)
      adata.var["gene_symbols"] = adata.var_names
      adata.var.set_index("gene_ids", inplace=True)
      adata.obs["sample"] = sample
  
      # reorder columns for 10x mtx files
      adata.var = adata.var[["gene_symbols", "feature_types", "genome"]]
  
      return adata
  
  
  def format_yaml_like(data: dict, indent: int = 0) -> str:
      """Formats a dictionary to a YAML-like string.
      Args:
          data (dict): The dictionary to format.
          indent (int): The current indentation level.
      Returns:
          str: A string formatted as YAML.
      """
      yaml_str = ""
      for key, value in data.items():
          spaces = "  " * indent
          if isinstance(value, dict):
              yaml_str += f"{spaces}{key}:\n{format_yaml_like(value, indent + 1)}"
          else:
              yaml_str += f"{spaces}{key}: {value}\n"
      return yaml_str
  
  def dump_versions():
      versions = {
          "NFCORE_SCRNASEQ:SCRNASEQ:MTX_CONVERSION:MTX_TO_H5AD": {
              "python": platform.python_version(),
              "scanpy": sc.__version__,
              "pandas": pd.__version__
          }
      }
  
      with open("versions.yml", "w") as f:
          f.write(format_yaml_like(versions))
  
  def input_to_adata(
      input_data: str,
      output: str,
      sample: str,
  ):
      print(f"Reading in {input_data}")
  
      # open main data
      adata = _mtx_to_adata(input_data, sample)
  
      # standard format
      # index are gene IDs and symbols are a column
      adata.var['gene_versions'] = adata.var.index
      adata.var['gene_ids'] = adata.var['gene_versions'].str.split('.').str[0]
      adata.var.index = adata.var["gene_ids"].values
      adata.var = adata.var.drop("gene_ids", axis=1)
      adata.var_names_make_unique()
  
      # write results
      adata.write_h5ad(f"{output}", compression="gzip")
      print(f"Wrote h5ad file to {output}")
  
      # dump versions
      dump_versions()
  
      return adata
  
  #
  # Run main script
  #
  
  # create the directory with the sample name
  os.makedirs("PBMC_10K", exist_ok=True)
  
  # input_type comes from NF module
  adata = input_to_adata(
      input_data="null_feature_bc_matrix.h5",
      output="PBMC_10K/PBMC_10K_null_matrix.h5ad",
      sample="PBMC_10K"
  )

Command exit status:
  1

Command output:
  Reading in null_feature_bc_matrix.h5

Command error:
  ba94160d36b7: Verifying Checksum
  ba94160d36b7: Download complete
  f628b9cff8d1: Verifying Checksum
  f628b9cff8d1: Download complete
  7716ca300600: Pull complete
  4f4fb700ef54: Pull complete
  7834e8feb904: Pull complete
  5ac55ff04773: Pull complete
  77c7a930b7cc: Pull complete
  c864db06f68b: Pull complete
  f628b9cff8d1: Pull complete
  ba94160d36b7: Pull complete
  6f63df1cb8dd: Verifying Checksum
  6f63df1cb8dd: Download complete
  6f63df1cb8dd: Pull complete
  53269d96152a: Verifying Checksum
  53269d96152a: Download complete
  53269d96152a: Pull complete
  54ba407d13f5: Verifying Checksum
  54ba407d13f5: Download complete
  54ba407d13f5: Pull complete
  Digest: sha256:fbd40d3d00751ac0df11564b3697006ecf8604af48960833910d32755033575f
  Status: Downloaded newer image for community.wave.seqera.io/library/scanpy:1.10.2--e83da2205b92a538
  Reading in null_feature_bc_matrix.h5
  Traceback (most recent call last):
    File ".command.sh", line 94, in <module>
      adata = input_to_adata(
              ^^^^^^^^^^^^^^^
    File ".command.sh", line 67, in input_to_adata
      adata = _mtx_to_adata(input_data, sample)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File ".command.sh", line 19, in _mtx_to_adata
      adata = sc.read_10x_h5(input)
              ^^^^^^^^^^^^^^^^^^^^^
    File "/opt/conda/lib/python3.12/site-packages/legacy_api_wrap/__init__.py", line 80, in fn_compatible
      return fn(*args_all, **kw)
             ^^^^^^^^^^^^^^^^^^^
    File "/opt/conda/lib/python3.12/site-packages/scanpy/readwrite.py", line 203, in read_10x_h5
      with h5py.File(str(filename), "r") as f:
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/opt/conda/lib/python3.12/site-packages/h5py/_hl/files.py", line 562, in __init__
      fid = make_fid(name, mode, userblock_size, fapl, fcpl, swmr=swmr)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/opt/conda/lib/python3.12/site-packages/h5py/_hl/files.py", line 235, in make_fid
      fid = h5f.open(name, flags, fapl=fapl)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
    File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
    File "h5py/h5f.pyx", line 102, in h5py.h5f.open
  FileNotFoundError: [Errno 2] Unable to synchronously open file (unable to open file: name = 'null_feature_bc_matrix.h5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)

Work dir:
  /home/runner/work/scrnaseq/scrnaseq/.nf-test/tests/c7d364337517d05e62c09dbb6cb11b88/work/a1/8cc860db1a65e4a1479dcee1592aa8

Container:
  community.wave.seqera.io/library/scanpy:1.10.2--e83da2205b92a538

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`

 -- Check '/home/runner/work/scrnaseq/scrnaseq/.nf-test/tests/c7d364337517d05e62c09dbb6cb11b88/meta/nextflow.log' file for details
ERROR ~ Pipeline failed. Please refer to troubleshooting docs: https://nf-co.re/docs/usage/troubleshooting

 -- Check '/home/runner/work/scrnaseq/scrnaseq/.nf-test/tests/c7d364337517d05e62c09dbb6cb11b88/meta/nextflow.log' file for details
Nextflow stderr:
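
Reading the failure above: the script was rendered with `input_data="null_feature_bc_matrix.h5"` and `output="PBMC_10K/PBMC_10K_null_matrix.h5ad"`, and `sc.read_10x_h5` then raised `FileNotFoundError`. The literal `null` in both names is consistent with a template variable being unset when the script was rendered, but that is an inference from the log, not something the log confirms. A minimal sketch of a pre-flight check that would surface this earlier is shown below. `check_input_path` is a hypothetical helper, not part of the pipeline's template, and the "starts with null" hint is an assumption about the cause.

```python
from pathlib import Path


def check_input_path(input_data: str) -> Path:
    """Return input_data as a Path, or raise a descriptive FileNotFoundError.

    Hypothetical helper for illustration; not part of the nf-core/scrnaseq template.
    """
    path = Path(input_data)
    if path.is_file():
        return path

    hint = ""
    # Assumption: a file name starting with "null" suggests that a template
    # variable was unset when the script was rendered.
    if path.name.startswith("null"):
        hint = " (name starts with 'null'; a template variable may have been unset)"

    # List what is actually staged next to the expected file to aid debugging.
    parent = path.parent
    staged = sorted(p.name for p in parent.iterdir()) if parent.is_dir() else []

    raise FileNotFoundError(
        f"Input file not found: {input_data}{hint}. Files in '{parent}': {staged}"
    )
```

In the logged script, such a check could run at the top of `input_to_adata`, before the call to `_mtx_to_adata(input_data, sample)`, so that the error message lists the files actually present in the work directory instead of ending in the `h5py` traceback.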