Commit

Bump version: 0.28.2 → 0.29.0
Yenaled authored Oct 9, 2024
2 parents fd3a198 + a03279a commit 43eddc8
Showing 40 changed files with 974 additions and 280 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/ci.yml
@@ -22,7 +22,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
python: [3.7, 3.8, 3.9 ]
python: [3.8, 3.9 ]
os: [ubuntu-20.04]
name: Test on Python ${{ matrix.python }}
steps:
2 changes: 1 addition & 1 deletion .github/workflows/release.yml
@@ -14,7 +14,7 @@ jobs:
- name: Setup python
uses: actions/setup-python@v1
with:
python-version: '3.7'
python-version: '3.8'
architecture: x64
- name: Install dependencies
run: pip install -r dev-requirements.txt
6 changes: 6 additions & 0 deletions .gitignore
@@ -109,3 +109,9 @@ venv.bak/
.DS_Store
.vscode/
.Rhistory

# PyCharm
/.idea/

# Temp files
/scratch/
3 changes: 2 additions & 1 deletion Makefile
@@ -2,7 +2,8 @@

test:
rm -f .coverage
nosetests --verbose --with-coverage --cover-package kb_python tests/* tests/dry/*
pytest --verbose --cov=kb_python tests/* tests/dry/* && coverage report && coverage xml
# nosetests --verbose --with-coverage --cover-package kb_python tests/* tests/dry/*

check:
flake8 kb_python && echo OK
7 changes: 4 additions & 3 deletions README.md
@@ -1,5 +1,5 @@
# kb-python
![github version](https://img.shields.io/badge/Version-0.28.0-informational)
![github version](https://img.shields.io/badge/Version-0.29.0-informational)
[![pypi version](https://img.shields.io/pypi/v/kb-python)](https://pypi.org/project/kb-python/0.28.0/)
![python versions](https://img.shields.io/pypi/pyversions/kb_python)
![status](https://github.com/pachterlab/kb_python/workflows/CI/badge.svg)
@@ -10,7 +10,7 @@

`kb-python` is a python package for processing single-cell RNA-sequencing. It wraps the [`kallisto` | `bustools`](https://www.kallistobus.tools) single-cell RNA-seq command line tools in order to unify multiple processing workflows.

`kb-python` was developed by [Kyung Hoi (Joseph) Min](https://twitter.com/lioscro) and [A. Sina Booeshaghi](https://twitter.com/sinabooeshaghi) while in [Lior Pachter](https://twitter.com/lpachter)'s lab at Caltech. If you use `kb-python` in a publication please [cite*](#cite):
`kb-python` was first developed by [Kyung Hoi (Joseph) Min](https://twitter.com/lioscro) and [A. Sina Booeshaghi](https://twitter.com/sinabooeshaghi) while in [Lior Pachter](https://twitter.com/lpachter)'s lab at Caltech. If you use `kb-python` in a publication please [cite*](#cite):
```
Melsted, P., Booeshaghi, A.S., et al.
Modular, efficient and constant-memory single-cell RNA-seq preprocessing.
@@ -34,7 +34,7 @@ There are no prerequisite packages to install. The `kallisto` and `bustools` bin

## Usage

`kb` consists of four subcommands
`kb` consists of five subcommands
```bash
$ kb
usage: kb [-h] [--list] <CMD> ...
@@ -44,6 +44,7 @@ positional arguments:
compile Compile `kallisto` and `bustools` binaries from source
ref Build a kallisto index and transcript-to-gene mapping
count Generate count matrices from a set of single-cell FASTQ files
extract Extract reads that were pseudoaligned to specific genes/transcripts (or extract all reads that were / were not pseudoaligned)
```

### `kb ref`: generate a pseudoalignment index
5 changes: 3 additions & 2 deletions dev-requirements.txt
@@ -1,7 +1,8 @@
bumpversion==0.6.0
coverage==5.1
coverage==5.2.1
flake8==3.8.2
nose==1.3.7
pytest==8.2.2
pytest-cov==5.0.0
pre-commit==2.4.0
sphinx>=3.3.1
sphinx-autoapi>=1.5.1
2 changes: 1 addition & 1 deletion docs/conf.py
@@ -24,7 +24,7 @@
author = 'Kyung Hoi (Joseph) Min'

# The full version, including alpha/beta/rc tags
release = '0.28.2'
release = '0.29.0'
master_doc = 'index'

# -- General configuration ---------------------------------------------------
6 changes: 3 additions & 3 deletions docs/index.rst
@@ -6,7 +6,7 @@
Welcome to kb-python's documentation!
=====================================

This page contains **DEVELOPER** documentation for ``kb-python`` version ``0.28.2``.
This page contains **DEVELOPER** documentation for ``kb-python`` version ``0.29.0``.
For user documentation and tutorials, please go to `kallisto | bustools <https://www.kallistobus.tools/>`_.

Development Prerequisites
@@ -18,7 +18,7 @@ necessary packages by running::
pip install -r requirements.txt
pip install -r dev-requirements.txt

Code qualty and unit tests are strictly enforced for every pull request via
Code quality and unit tests are strictly enforced for every pull request via
Github actions.

Code Quality
@@ -33,7 +33,7 @@ at the root of the repository.

Unit-testing
""""""""""""
``kb-python`` uses ``nose`` to run unit tests. There is a convenient Makefile
``kb-python`` uses ``pytest`` to run unit tests. There is a convenient Makefile
rule in place to run all tests.::

make test
2 changes: 1 addition & 1 deletion kb_python/__init__.py
@@ -1 +1 @@
__version__ = '0.28.2'
__version__ = '0.29.0'
Binary file modified kb_python/bins/darwin/bustools/bustools
Binary file not shown.
Binary file modified kb_python/bins/darwin/kallisto/kallisto
Binary file not shown.
Binary file added kb_python/bins/darwin/kallisto/kallisto_k64
Binary file not shown.
Binary file added kb_python/bins/darwin/kallisto/kallisto_optoff
Binary file not shown.
Binary file not shown.
Binary file modified kb_python/bins/darwin/m1/bustools/bustools
Binary file not shown.
Binary file modified kb_python/bins/darwin/m1/kallisto/kallisto
Binary file not shown.
Binary file added kb_python/bins/darwin/m1/kallisto/kallisto_k64
Binary file not shown.
Binary file added kb_python/bins/darwin/m1/kallisto/kallisto_optoff
Binary file not shown.
Binary file not shown.
Binary file modified kb_python/bins/linux/bustools/bustools
Binary file not shown.
Binary file modified kb_python/bins/linux/kallisto/kallisto
Binary file not shown.
Binary file added kb_python/bins/linux/kallisto/kallisto_k64
Binary file not shown.
Binary file added kb_python/bins/linux/kallisto/kallisto_optoff
Binary file not shown.
Binary file added kb_python/bins/linux/kallisto/kallisto_optoff_k64
Binary file not shown.
Binary file modified kb_python/bins/windows/bustools/bustools.exe
Binary file not shown.
Binary file modified kb_python/bins/windows/kallisto/kallisto.exe
Binary file not shown.
Binary file added kb_python/bins/windows/kallisto/kallisto_k64.exe
Binary file not shown.
Binary file not shown.
Binary file not shown.
20 changes: 17 additions & 3 deletions kb_python/config.py
@@ -34,7 +34,14 @@ def get_provided_kallisto_path() -> Optional[str]:
Returns:
Path to the binary, `None` if not found
"""
bin_filename = 'kallisto.exe' if PLATFORM == 'windows' else 'kallisto'
bin_name = 'kallisto'
if '_KALLISTO_OPTOFF' in globals():
if _KALLISTO_OPTOFF:
bin_name = f'{bin_name}_optoff'
if '_KALLISTO_KMER_64' in globals():
if _KALLISTO_KMER_64:
bin_name = f'{bin_name}_k64'
bin_filename = f'{bin_name}.exe' if PLATFORM == 'windows' else bin_name
path = os.path.join(BINS_DIR, PLATFORM, CPU, 'kallisto', bin_filename)
if not os.path.isfile(path):
return None
@@ -54,11 +61,18 @@ def get_provided_bustools_path() -> Optional[str]:
return path


def set_special_kallisto_binary(k64: bool, optoff: bool):
global _KALLISTO_KMER_64
global _KALLISTO_OPTOFF
_KALLISTO_KMER_64 = k64
_KALLISTO_OPTOFF = optoff


def get_compiled_kallisto_path(alias: str = COMPILED_DIR) -> Optional[str]:
"""Finds platform-dependent kallisto binary compiled with `compile`.
Args:
Alias: Alias of compiled binary.
alias: Alias of compiled binary.
Returns:
Path to the binary, `None` if not found
@@ -74,7 +88,7 @@ def get_compiled_bustools_path(alias: str = COMPILED_DIR) -> Optional[str]:
"""Finds platform-dependent bustools binary compiled with `compile`.
Args:
Alias: Alias of compiled binary.
alias: Alias of compiled binary.
Returns:
Path to the binary, `None` if not found
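
The binary selection added to `get_provided_kallisto_path` in this file can be read as a pure function of two flags. The following is a minimal sketch of that naming rule only, not the package's code: the helper name `kallisto_bin_filename` and its `platform` argument are hypothetical, and the real function reads module-level globals set by `set_special_kallisto_binary` instead of taking arguments.

```python
def kallisto_bin_filename(
    platform: str, k64: bool = False, optoff: bool = False
) -> str:
    """Build the bundled kallisto filename (hypothetical helper).

    Mirrors the suffix order in the diff: `_optoff` is appended
    before `_k64`, and `.exe` is added only on Windows.
    """
    name = 'kallisto'
    if optoff:
        name = f'{name}_optoff'
    if k64:
        name = f'{name}_k64'
    return f'{name}.exe' if platform == 'windows' else name


print(kallisto_bin_filename('linux', k64=True, optoff=True))
print(kallisto_bin_filename('windows', k64=True))
```

Passing the flags explicitly avoids the `'_KALLISTO_OPTOFF' in globals()` checks, at the cost of threading the two options through every caller. Note that a name produced by this rule only resolves if the matching binary is bundled for the platform; otherwise the path lookup in the diff returns `None`.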
91 changes: 83 additions & 8 deletions kb_python/count.py
@@ -105,6 +105,11 @@ def kallisto_bus(
demultiplexed: bool = False,
batch_barcodes: bool = False,
numreads: int = None,
lr: bool = False,
lr_thresh: float = 0.8,
lr_error_rate: float = None,
union: bool = False,
no_jump: bool = False,
) -> Dict[str, str]:
"""Runs `kallisto bus`.
@@ -133,6 +138,11 @@ def kallisto_bus(
demultiplexed: Whether FASTQs are demultiplexed, defaults to `False`
batch_barcodes: Whether sample ID should be in barcode, defaults to `False`
numreads: Maximum number of reads to process from supplied input
lr: Whether to use lr-kallisto in read mapping, defaults to `False`
lr_thresh: Sets the --threshold for lr-kallisto, defaults to `0.8`
lr_error_rate: Sets the --error-rate for lr-kallisto, defaults to `None`
union: Use set union for pseudoalignment, defaults to `False`
no_jump: Disable pseudoalignment "jumping", defaults to `False`
Returns:
Dictionary containing paths to generated files
@@ -194,6 +204,16 @@ def kallisto_bus(
command += ['--rf-stranded']
if inleaved:
command += ['--inleaved']
if lr:
command += ['--long']
if lr and lr_thresh:
command += ['-r', str(lr_thresh)]
if lr and lr_error_rate:
command += ['-e', str(lr_error_rate)]
if union:
command += ['--union']
if no_jump:
command += ['--no-jump']
if batch_barcodes:
command += ['--batch-barcodes']
if is_batch:
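
The flag handling added to `kallisto_bus` in the hunk above can be sketched in isolation. This is an illustration under assumptions, not the package's code: the helper name `long_read_bus_flags` is hypothetical, and only the flags visible in the diff (`--long`, `-r`, `-e`, `--union`, `--no-jump`) are used.

```python
from typing import List, Optional


def long_read_bus_flags(
    lr: bool = False,
    lr_thresh: Optional[float] = 0.8,
    lr_error_rate: Optional[float] = None,
    union: bool = False,
    no_jump: bool = False,
) -> List[str]:
    """Return the extra `kallisto bus` arguments (hypothetical helper)."""
    flags: List[str] = []
    if lr:
        flags += ['--long']
    # The threshold and error rate are only passed in long-read mode,
    # and only when they are truthy (so 0 or None is skipped).
    if lr and lr_thresh:
        flags += ['-r', str(lr_thresh)]
    if lr and lr_error_rate:
        flags += ['-e', str(lr_error_rate)]
    if union:
        flags += ['--union']
    if no_jump:
        flags += ['--no-jump']
    return flags


print(long_read_bus_flags(lr=True))
print(long_read_bus_flags(lr_thresh=0.9, union=True))
```

One consequence of the truthiness checks is that `lr_thresh` and `lr_error_rate` are silently ignored unless `lr` is set, so the default `lr_thresh=0.8` has no effect on short-read runs.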
@@ -224,12 +244,14 @@ def kallisto_quant_tcc(
matrix_to_files: bool = False,
matrix_to_directories: bool = False,
no_fragment: bool = False,
lr: bool = False,
lr_platform: str = 'ONT',
) -> Dict[str, str]:
"""Runs `kallisto quant-tcc`.
Args:
mtx_path: Path to counts matrix
saved_index_path: Path to index.saved
saved_index_path: Path to index
ecmap_path: Path to ecmap
t2g_path: Path to T2G
out_dir: Output directory path
@@ -241,6 +263,8 @@ def kallisto_quant_tcc(
matrix_to_files: Whether to write quant-tcc output to files, defaults to `False`
matrix_to_directories: Whether to write quant-tcc output to directories, defaults to `False`
no_fragment: Whether to disable quant-tcc effective length normalization, defaults to `False`
lr: Whether to use lr-kallisto in quantification, defaults to `False`
lr_platform: Sets the --platform for lr-kallisto, defaults to `ONT`
Returns:
Dictionary containing path to output files
@@ -255,6 +279,10 @@ def kallisto_quant_tcc(
command += ['-e', ecmap_path]
command += ['-g', t2g_path]
command += ['-t', threads]
if lr:
command += ['--long']
if lr and lr_platform:
command += ['-P', lr_platform]
if flens_path and not no_fragment:
command += ['-f', flens_path]
if l and not no_fragment:
@@ -1178,6 +1206,14 @@ def count(
no_fragment: bool = False,
numreads: int = None,
store_num: bool = False,
lr: bool = False,
lr_thresh: float = 0.8,
lr_error_rate: float = None,
lr_platform: str = 'ONT',
union: bool = False,
no_jump: bool = False,
quant_umis: bool = False,
keep_flags: bool = False,
) -> Dict[str, Union[str, Dict[str, str]]]:
"""Generates count matrices for single-cell RNA seq.
@@ -1242,6 +1278,14 @@ def count(
no_fragment: Whether to disable quant-tcc effective length normalization, defaults to `False`
numreads: Maximum number of reads to process from supplied input
store_num: Whether to store read numbers in BUS file, defaults to `False`
lr: Whether to use lr-kallisto in read mapping, defaults to `False`
lr_thresh: Sets the --threshold for lr-kallisto, defaults to `0.8`
lr_error_rate: Sets the --error-rate for lr-kallisto, defaults to `None`
lr_platform: Sets the --platform for lr-kallisto, defaults to `ONT`
union: Use set union for pseudoalignment, defaults to `False`
no_jump: Disable pseudoalignment "jumping", defaults to `False`
quant_umis: Whether to run quant-tcc when there are UMIs, defaults to `False`
keep_flags: Preserve flag column when sorting BUS file, defaults to `False`
Returns:
Dictionary containing paths to generated files
@@ -1292,7 +1336,12 @@ def count(
demultiplexed=demultiplexed,
batch_barcodes=batch_barcodes,
numreads=numreads,
n=store_num
n=store_num,
lr=lr,
lr_thresh=lr_thresh,
lr_error_rate=lr_error_rate,
union=union,
no_jump=no_jump
)
else:
logger.info(
@@ -1309,7 +1358,7 @@ def count(
temp_dir=temp_dir,
threads=threads,
memory=memory,
store_num=store_num
store_num=store_num and not keep_flags
)
correct = True
if whitelist_path and whitelist_path.upper() == "NONE":
@@ -1404,6 +1453,9 @@ def update_results_with_suffix(current_results, new_results, suffix):
technology.upper() in ('BULK', 'SMARTSEQ2', 'SMARTSEQ3')
) or ignore_umis
quant = cm and tcc
if quant_umis:
quant = True
no_fragment = True
suffix_to_inspect_filename = {'': ''}
if (technology.upper() == 'SMARTSEQ3'):
suffix_to_inspect_filename = {
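
The `quant_umis` override in the hunk above, together with the `store_num and not keep_flags` change made earlier in `count`, amounts to a small piece of option resolution. The sketch below restates that logic as a standalone function for review purposes; the name `resolve_count_options` and its return shape are assumptions, not part of `kb_python`.

```python
from typing import Tuple


def resolve_count_options(
    cm: bool,
    tcc: bool,
    quant_umis: bool = False,
    no_fragment: bool = False,
    store_num: bool = False,
    keep_flags: bool = False,
) -> Tuple[bool, bool, bool]:
    """Return (quant, no_fragment, sort_store_num) (hypothetical helper)."""
    # quant-tcc normally runs only for TCC matrices of non-UMI data.
    quant = bool(cm and tcc)
    # quant_umis forces quantification and disables effective-length
    # normalization, as in the diff.
    if quant_umis:
        quant = True
        no_fragment = True
    # Read numbers are not passed to the sort step when flags are kept.
    sort_store_num = store_num and not keep_flags
    return quant, no_fragment, sort_store_num


print(resolve_count_options(cm=False, tcc=True, quant_umis=True))
```

Written this way, it is easier to see that `quant_umis` overrides a caller-supplied `no_fragment=False`, and that `keep_flags` takes precedence over `store_num` at the sort step.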
@@ -1518,6 +1570,8 @@ def update_results_with_suffix(current_results, new_results, suffix):
matrix_to_files=matrix_to_files,
matrix_to_directories=matrix_to_directories,
no_fragment=no_fragment,
lr=lr,
lr_platform=lr_platform,
)
update_results_with_suffix(
unfiltered_results, quant_result, suffix
@@ -1695,6 +1749,14 @@ def count_nac(
batch_barcodes: bool = False,
numreads: int = None,
store_num: bool = False,
lr: bool = False,
lr_thresh: float = 0.8,
lr_error_rate: float = None,
lr_platform: str = 'ONT',
union: bool = False,
no_jump: bool = False,
quant_umis: bool = False,
keep_flags: bool = False,
) -> Dict[str, Union[Dict[str, str], str]]:
"""Generates RNA velocity matrices for single-cell RNA seq.
@@ -1756,6 +1818,14 @@ def count_nac(
batch_barcodes: Whether sample ID should be in barcode, defaults to `False`
numreads: Maximum number of reads to process from supplied input
store_num: Whether to store read numbers in BUS file, defaults to `False`
lr: Whether to use lr-kallisto in read mapping, defaults to `False`
lr_thresh: Sets the --threshold for lr-kallisto, defaults to `0.8`
lr_error_rate: Sets the --error-rate for lr-kallisto, defaults to `None`
lr_platform: Sets the --platform for lr-kallisto, defaults to `ONT`
union: Use set union for pseudoalignment, defaults to `False`
no_jump: Disable pseudoalignment "jumping", defaults to `False`
quant_umis: Whether to run quant-tcc when there are UMIs, defaults to `False`
keep_flags: Preserve flag column when sorting BUS file, defaults to `False`
Returns:
Dictionary containing path to generated index
@@ -1803,7 +1873,12 @@ def count_nac(
demultiplexed=demultiplexed,
batch_barcodes=batch_barcodes,
numreads=numreads,
n=store_num
n=store_num,
lr=lr,
lr_thresh=lr_thresh,
lr_error_rate=lr_error_rate,
union=union,
no_jump=no_jump
)
else:
logger.info(
@@ -1820,7 +1895,7 @@ def count_nac(
temp_dir=temp_dir,
threads=threads,
memory=memory,
store_num=store_num
store_num=store_num and not keep_flags
)
correct = True
if whitelist_path and whitelist_path.upper() == "NONE":
@@ -2073,8 +2148,8 @@ def update_results_with_suffix(current_results, new_results, suffix):
if batch_barcodes else None for prefix in prefixes
],
genes_paths=[
unfiltered_results[prefix][f'txnames{suffix}'] if tcc
else unfiltered_results[prefix].get(f'genes{suffix}')
unfiltered_results[prefix][f'ec{suffix}'] if tcc else
unfiltered_results[prefix].get(f'genes{suffix}')
for prefix in prefixes
],
t2g_path=t2g_path,
@@ -2975,7 +3050,7 @@ def update_results_with_suffix(current_results, new_results, suffix):
for prefix in prefixes
],
genes_paths=[
unfiltered_results[prefix][f'txnames{suffix}'] if tcc else
unfiltered_results[prefix][f'ec{suffix}'] if tcc else
unfiltered_results[prefix].get(f'genes{suffix}')
for prefix in prefixes
],