v0.3.0 #189

Closed
wants to merge 110 commits into from
110 commits
be00257
Initial 0.3.0: checkpointing, more input formats, HTML report
sjfleming May 3, 2022
11f48a4
Dockerfile and WDL
sjfleming May 3, 2022
1336bf9
Test suite
sjfleming May 3, 2022
d38184d
gitignore files that might be created by a user
sjfleming May 3, 2022
7ed3f1c
Improve error messages from anndata loading
sjfleming May 25, 2022
1dba16a
Update approx to match paper
sjfleming May 25, 2022
f8d7056
Cohort mode for fpr
sjfleming May 25, 2022
56a9349
Tweaks
sjfleming May 25, 2022
bc6ca71
Update WDL
sjfleming May 25, 2022
7e88d62
Update .gitignore
sjfleming May 25, 2022
d466202
Working on documentation
sjfleming May 25, 2022
b9d2953
Several posterior summarization methods
sjfleming Jun 18, 2022
d1b42a8
Allow FPR=0 for cohort mode
sjfleming Jun 27, 2022
c38f38e
Include empty drop exact simulated values in output
sjfleming Jun 27, 2022
be71f8f
Do not use the "fast" MCKP calc: memory issues
sjfleming Jun 27, 2022
739592f
Add NPZ support for input type, tweaks to hyperparameter finding
sjfleming Jul 12, 2022
01db3c1
Update Dockerfile for torch's jit folder expectations
sjfleming Jul 12, 2022
d35ebd7
Add barcodes_file and genes_file to support NPZ
sjfleming Jul 12, 2022
23ac0a6
Update documentation
sjfleming Jul 26, 2022
100aa54
Fix readthedocs failure to render
sjfleming Aug 10, 2022
cb765a6
Update Dockerfile
sjfleming Aug 11, 2022
bae7db9
MTX reader accept arbitrary length genes and barcodes
sjfleming Sep 22, 2022
0ce6927
Beginning of major refactor for output estimation
sjfleming Sep 23, 2022
297da0b
Posterior regularization
sjfleming Sep 30, 2022
435ca63
Finalizing mean-targeting posterior regularization
sjfleming Oct 19, 2022
5518679
Remove numba; move CDF posterior from pandas to torch
sjfleming Oct 20, 2022
22dfa10
Posterior regularization implemented
sjfleming Dec 9, 2022
824bdf1
Optimizer argument variable change
sjfleming Feb 23, 2023
c7ad8bc
Update test_posterior.py
sjfleming Feb 23, 2023
39a57ce
use kde instead of hist, and other tweaks (#136)
jamestwebber Feb 23, 2023
8f20430
Streamline file loading in io.py (#135)
jamestwebber Feb 23, 2023
f7d4d30
Update io.py
sjfleming Feb 23, 2023
6df0d8f
Allow Peaks as features, and warn rather than raise AssertionError
sjfleming Feb 24, 2023
195d30b
Fix feature exclusion, metadata writing, and new prior estimation
sjfleming Mar 7, 2023
7a46bdf
Update setup.py
sjfleming Mar 9, 2023
8bcf454
Pandas 1.4.0+ breaks estimation.py
sjfleming Mar 9, 2023
6e02777
New matplotlib versions break legend dot sizing
sjfleming Mar 9, 2023
93e827a
Replace pandas at with loc, rescuing v1.4.0+
sjfleming Mar 9, 2023
46a7fa7
Update Dockerfile
sjfleming Mar 9, 2023
7d14238
Enable benchmarking runs on any git commit
sjfleming Mar 9, 2023
1f49c82
WDL to validate pytorch cuda build in a docker image
sjfleming Mar 9, 2023
fdc62ce
Attempt to rescue legend dot sizes
sjfleming Mar 10, 2023
fe78051
Update gitignore
sjfleming Mar 10, 2023
365ae91
Fix for legacy v2 files
sjfleming Mar 16, 2023
d247426
Benchmarking WDLs and scripts
sjfleming Mar 16, 2023
2a8d377
Add BatchNorm to decoder (#186)
sjfleming Mar 23, 2023
e5a96f8
Python packaging
sjfleming Mar 23, 2023
ce74b4a
Bug resulted in off by one, excluding a feature
sjfleming Mar 25, 2023
6949089
Bug fix for analyzed features and downstream and report
sjfleming Mar 25, 2023
c6f058e
Ease up on warnings
sjfleming Mar 25, 2023
563ebd2
Modify the tiny test example
sjfleming Mar 25, 2023
13a149e
Update docs
sjfleming Mar 25, 2023
4823e7d
Create .dockstore.yml
sjfleming Mar 27, 2023
012958c
WDL formatting and documentation
sjfleming Mar 27, 2023
3595b07
Installation via pip; Dockstore WDL; and docker image tags
sjfleming Mar 27, 2023
9d25006
Badges
sjfleming Mar 27, 2023
398fc18
Update README.rst
sjfleming Mar 28, 2023
dbe6715
Create .dockerignore
sjfleming Apr 5, 2023
fe0af29
Add benchmark dataset
sjfleming Apr 5, 2023
447abd4
Add blank arrays to file formats missing inputs
sjfleming Apr 5, 2023
2cc767a
Update downloads badge
sjfleming Apr 5, 2023
e8fb311
Update .dockerignore
sjfleming Apr 5, 2023
0285ad2
Update README.rst
sjfleming Apr 5, 2023
c68ea8f
Merge branch 'master' into sf_dev_0.3.0_postreg
sjfleming Apr 5, 2023
74abce5
Fix packaging
sjfleming Apr 5, 2023
a558364
Massage final_elbo_fail_fraction and epoch_elbo_fail_fraction
sjfleming Apr 5, 2023
7080ebf
Github actions for tests
sjfleming Apr 5, 2023
34ebf6c
Change requirements filenames to lowercase
sjfleming Apr 6, 2023
2474bb5
Include dev requirement scikit-learn
sjfleming Apr 6, 2023
5b34655
Handle all pytest warnings
sjfleming Apr 6, 2023
74ac669
Remove gmm.py, now obsolete
sjfleming Apr 6, 2023
98456de
Delete argparse.py
sjfleming Apr 6, 2023
fa9c4f0
Change default batch_size to an exponent of 2
sjfleming Apr 6, 2023
479f2a1
Skip last minibatch if size is smaller than cutoff
sjfleming Apr 6, 2023
d83e7e7
Remove input cells from html report
sjfleming Apr 6, 2023
0336d58
Fix prior finding bug and add input arguments to force priors (#195)
sjfleming Apr 14, 2023
5cd5403
Address MCKP memory issues (#196)
sjfleming Apr 14, 2023
60daf6e
Docker updates
sjfleming Apr 14, 2023
93ef376
Rename miniwdl check task
sjfleming Apr 14, 2023
b29f21b
Add some time information to log
sjfleming Apr 18, 2023
66b2e3f
Update report.py
sjfleming Apr 18, 2023
736d6df
Fix intermittent scheduler and train_loader bug
sjfleming Apr 19, 2023
434174a
Make some CLI methods static
sjfleming Apr 20, 2023
1ef75ad
Fix bug resulting from len(dataloader)
sjfleming Apr 20, 2023
a2e6022
Full integration test
sjfleming Apr 20, 2023
f995092
Remove epoch argument from train_epoch()
sjfleming Apr 20, 2023
30d84ad
Update test_train.py
sjfleming Apr 20, 2023
cdd504f
Update test_train.py
sjfleming Apr 20, 2023
cbb666f
Update posterior.py
sjfleming Apr 20, 2023
a7d09ae
Shorten lines
sjfleming Apr 20, 2023
cab2ff1
Add multiprocessing option for the MCKP estimator (#201)
sjfleming Apr 21, 2023
9329b47
Add soft constraints to the model for semi-supervision of cell probab…
sjfleming Apr 21, 2023
ae3b89d
Add the mutiprocessing MCKP estimator arg to WDL
sjfleming Apr 24, 2023
e7a6193
Include intermediate updates in log
sjfleming Apr 24, 2023
516dbad
Log message when saving posterior
sjfleming Apr 25, 2023
57b27af
Write posterior as h5, and some estimator improvements (#208)
sjfleming Apr 28, 2023
fdbbe3d
Update Dockerfile
sjfleming Apr 28, 2023
5da72db
Smaller dots for report PCA plot
sjfleming Apr 28, 2023
31414be
Ignore extra args when computing workflow run hash
sjfleming Apr 28, 2023
0edf95a
Packaging fix: version has changed file location
sjfleming Apr 28, 2023
f72facf
Attempt to address #209
sjfleming Apr 28, 2023
d0f2d23
Include report template, closes #209
sjfleming Apr 28, 2023
cb15a6e
Fix edge case bug in report
sjfleming Apr 28, 2023
727253b
Fix plot limits on output PDF learning curve
sjfleming Apr 28, 2023
50cfdab
Appropriately recover posterior from checkpoint file
sjfleming Apr 28, 2023
5936fc9
Fix posterior write test
sjfleming Apr 28, 2023
79e39f4
Line indents
sjfleming May 19, 2023
8d99fbd
Update cuda_check_inputs.json
sjfleming May 19, 2023
d81bada
Add a test to pip install straight from github (#218)
sjfleming May 19, 2023
88857e3
Fix issues with encoder and several bug fixes (#236)
sjfleming Aug 6, 2023
10 changes: 10 additions & 0 deletions .dockerignore
@@ -0,0 +1,10 @@
**/.git
.github
examples
docs
wdl
dist
cellbender.egg-info
.pytest_cache
*.h5
*.tar.gz
6 changes: 6 additions & 0 deletions .dockstore.yml
@@ -0,0 +1,6 @@
version: 1.2

workflows:
  - subclass: WDL
    primaryDescriptorPath: /wdl/cellbender_remove_background.wdl
    name: cellbender_remove_background
17 changes: 17 additions & 0 deletions .github/workflows/miniwdl_check.yml
@@ -0,0 +1,17 @@
name: 'validate WDL'
on: [pull_request]
env:
  MINIWDL_VERSION: 1.8.0
jobs:
  miniwdl-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: '3.7'
      - name: 'Install miniwdl and Run Validation Test'
        run: |
          pip install miniwdl==$MINIWDL_VERSION;
          ./cellbender/remove_background/tests/miniwdl_check_wdl.sh;
        shell: bash
45 changes: 45 additions & 0 deletions .github/workflows/run_packaging_check.yml
@@ -0,0 +1,45 @@
# Package for PyPI

name: 'packaging'

on: pull_request

jobs:
  build:

    runs-on: 'ubuntu-latest'
    strategy:
      matrix:
        python-version: ['3.7']

    steps:
      - name: 'Checkout repo'
        uses: actions/checkout@v3

      - name: 'Set up Python ${{ matrix.python-version }}'
        uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}
          cache: 'pip'

      - name: 'Install packaging software'
        run: pip install --upgrade setuptools build twine

      - name: 'Print package versions'
        run: pip list

      - name: 'Package code using build'
        run: python -m build

      - name: 'Check package using twine'
        run: python -m twine check dist/*

      - name: 'Extract branch name'
        shell: bash
        run: echo "branch=$(echo ${GITHUB_REF#refs/heads/})" >>$GITHUB_OUTPUT
        id: extract_branch

      - name: 'Install from github branch and run a test'
        run: |
          pip install pytest git+https://github.com/broadinstitute/CellBender@${{ steps.extract_branch.outputs.branch }}
          cellbender -v
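
The 'Extract branch name' step in this workflow relies on the shell expansion `${GITHUB_REF#refs/heads/}`, which removes a leading `refs/heads/` when present and otherwise leaves the value untouched. A minimal Python sketch of that behavior follows. The function names `strip_heads_prefix` and `pip_target` are hypothetical and not part of the repository, and the example ref values are illustrative assumptions about what GitHub Actions supplies (branch pushes typically use `refs/heads/<name>`, pull-request events typically use `refs/pull/<number>/merge`).

```python
def strip_heads_prefix(github_ref: str) -> str:
    """Mimic the shell expansion ${GITHUB_REF#refs/heads/}.

    Only a leading 'refs/heads/' is removed; any other ref is returned as-is.
    """
    prefix = "refs/heads/"
    if github_ref.startswith(prefix):
        return github_ref[len(prefix):]
    return github_ref


def pip_target(ref: str,
               repo: str = "https://github.com/broadinstitute/CellBender") -> str:
    """Build the 'git+<repo>@<ref>' string that the install step passes to pip."""
    return f"git+{repo}@{ref}"


if __name__ == "__main__":
    # Illustrative ref values (assumptions, not taken from a real run).
    for ref in ("refs/heads/master", "refs/pull/189/merge"):
        print(ref, "->", pip_target(strip_heads_prefix(ref)))
```

Under these assumptions, a pull-request ref passes through unchanged, so the install step would hand pip the full `refs/pull/...` reference rather than a bare branch name.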
32 changes: 32 additions & 0 deletions .github/workflows/run_pytest.yml
@@ -0,0 +1,32 @@
# Run cellbender's tests

name: 'pytest'

on: pull_request

jobs:
  build:

    runs-on: 'ubuntu-latest'
    strategy:
      matrix:
        python-version: ['3.7']

    steps:
      - name: 'Checkout repo'
        uses: actions/checkout@v3

      - name: 'Set up Python ${{ matrix.python-version }}'
        uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}
          cache: 'pip'

      - name: 'Install package including pytest'
        run: pip install .[dev]

      - name: 'Print package versions'
        run: pip list

      - name: 'Test with pytest'
        run: pytest -v
6 changes: 6 additions & 0 deletions .gitignore
@@ -12,3 +12,9 @@ dist/
.eggs/
.idea/
*.iml
*.h5
*.h5ad
*.tsv
*.csv
*.npz
*.tar.gz
5 changes: 4 additions & 1 deletion MANIFEST.in
@@ -1 +1,4 @@
include README.rst
include README.rst
include LICENSE
include requirements.txt
include requirements-rtd.txt
138 changes: 105 additions & 33 deletions README.rst
@@ -1,11 +1,27 @@
CellBender
==========

.. image:: https://img.shields.io/github/license/broadinstitute/CellBender?color=white
:target: LICENSE
:alt: License

.. image:: https://readthedocs.org/projects/cellbender/badge/?version=latest
:target: https://cellbender.readthedocs.io/en/latest/?badge=latest
:alt: Documentation Status

.. image:: https://github.com/broadinstitute/CellBender/blob/master/docs/source/_static/design/logo_250_185.png
.. image:: https://img.shields.io/pypi/v/CellBender.svg
:target: https://pypi.org/project/CellBender
:alt: PyPI

.. image:: https://static.pepy.tech/personalized-badge/cellbender?period=total&units=international_system&left_color=grey&right_color=blue&left_text=pypi%20downloads
:target: https://pepy.tech/project/CellBender
:alt: Downloads

.. image:: https://img.shields.io/github/stars/broadinstitute/CellBender?color=yellow&logoColor=yellow
:target: https://github.com/broadinstitute/CellBender/stargazers
:alt: Stars

.. image:: docs/source/_static/design/logo_250_185.png
:alt: CellBender Logo

CellBender is a software package for eliminating technical artifacts from
@@ -16,67 +32,123 @@ The current release contains the following modules. More modules will be added i
* ``remove-background``:

This module removes counts due to ambient RNA molecules and random barcode swapping from (raw)
UMI-based scRNA-seq count matrices. At the moment, only the count matrices produced by the
CellRanger ``count`` pipeline is supported. Support for additional tools and protocols will be
added in the future. A quick start tutorial can be found
`here <https://cellbender.readthedocs.io/en/latest/getting_started/remove_background/index.html>`_.
UMI-based scRNA-seq count matrices. Also works for snRNA-seq and CITE-seq.

Please refer to the `documentation <https://cellbender.readthedocs.io/en/latest/>`_ for a quick start tutorial on using CellBender.
Please refer to `the documentation <https://cellbender.readthedocs.io/en/latest/>`_ for a quick start tutorial.

Installation and Usage
----------------------

Manual installation
~~~~~~~~~~~~~~~~~~~
CellBender can be installed via

.. code-block:: console

$ pip install cellbender

(and we recommend installing in its own ``conda`` environment to prevent
conflicts with other software).

CellBender is run as a command-line tool, as in

.. code-block:: console

(cellbender) $ cellbender remove-background \
--cuda \
--input my_raw_count_matrix_file.h5 \
--output my_cellbender_output_file.h5

See `the usage documentation <https://cellbender.readthedocs.io/en/latest/usage/index.html>`_
for details.


Using The Official Docker Image
-------------------------------

A GPU-enabled docker image is available from the Google Container Registry (GCR) as:

``us.gcr.io/broad-dsde-methods/cellbender:latest``

Available image tags track release tags in GitHub, and include ``latest``,
``0.1.0``, ``0.2.0``, ``0.2.1``, ``0.2.2``, and ``0.3.0``.


WDL Users
---------

A workflow written in the
`workflow description language (WDL) <https://github.com/openwdl/wdl>`_
is available for CellBender remove-background.

For `Terra <https://app.terra.bio>`_ users, a workflow called
``cellbender/remove-background`` is
`available from the Broad Methods repository
<https://portal.firecloud.org/#methods/cellbender/remove-background/>`_.

There is also a `version available on Dockstore
<https://dockstore.org/workflows/github.com/broadinstitute/CellBender>`_.

The recommended installation is as follows. Create a conda environment and activate it:

.. code-block:: bash
Advanced installation
---------------------

$ conda create -n cellbender python=3.7
$ source activate cellbender
From source for development
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Create a conda environment and activate it:

.. code-block:: console

$ conda create -n cellbender python=3.7
$ conda activate cellbender

Install the `pytables <https://www.pytables.org>`_ module:

.. code-block:: bash
.. code-block:: console

(cellbender) $ conda install -c anaconda pytables

(cellbender) $ conda install -c anaconda pytables
Install `pytorch <https://pytorch.org>`_ via
`these instructions <https://pytorch.org/get-started/locally/>`_, for example:

Install `pytorch <https://pytorch.org>`_ (shown below for CPU; if you have a CUDA-ready GPU, please skip
this part and follow `these <https://pytorch.org/get-started/locally/>`_ instructions instead):
.. code-block:: console

.. code-block:: bash
(cellbender) $ pip install torch

(cellbender) $ conda install pytorch torchvision -c pytorch
and ensure that your installation is appropriate for your hardware (i.e. that
the relevant CUDA drivers get installed and that ``torch.cuda.is_available()``
returns ``True`` if you have a GPU available).
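
The hardware check described here can be run as a short script. This is a sketch only, not part of the CellBender package: the helper name `pick_device` and its three return strings are hypothetical, and the only PyTorch call used is `torch.cuda.is_available()`, which the README itself mentions.

```python
import importlib.util


def pick_device() -> str:
    """Report whether PyTorch is installed and whether it can see a CUDA GPU.

    Returns one of 'no-torch', 'cpu', or 'cuda' (labels chosen for this sketch).
    """
    # Avoid an ImportError if torch has not been installed yet.
    if importlib.util.find_spec("torch") is None:
        return "no-torch"
    import torch

    # True only when a CUDA-enabled build and a usable GPU are both present.
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(pick_device())
```

If this prints `cpu` on a machine that has a GPU, the installed PyTorch build or the CUDA drivers are a likely place to look before running with `--cuda`.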

Clone this repository and install CellBender:
Clone this repository and install CellBender (in editable ``-e`` mode):

.. code-block:: bash
.. code-block:: console

(cellbender) $ git clone https://github.com/broadinstitute/CellBender.git
(cellbender) $ pip install -e CellBender

Using The Official Docker Image
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A GPU-enabled docker image is available from the Google Container Registry (GCR) as:
From a specific commit
~~~~~~~~~~~~~~~~~~~~~~

``us.gcr.io/broad-dsde-methods/cellbender:latest``
This can be achieved via

Terra Users
~~~~~~~~~~~
.. code-block:: console

For `Terra <https://app.terra.bio>`_ users, a `workflow <https://portal.firecloud.org/#methods/cellbender/remove-background/>`_
is available as:
(cellbender) $ pip install --no-cache-dir -U git+https://github.com/broadinstitute/CellBender.git@<SHA>

``cellbender/remove-background``
where ``<SHA>`` must be replaced by any reference to a particular git commit,
such as a tag, a branch name, or a commit sha.


Citing CellBender
-----------------

If you use CellBender in your research (and we hope you will), please consider
citing `our paper on bioRxiv <https://doi.org/10.1101/791699>`_.
citing our paper in Nature Methods:

Stephen J Fleming, Mark D Chaffin, Alessandro Arduini, Amer-Denis Akkad,
Eric Banks, John C Marioni, Anthony A Philippakis, Patrick T Ellinor,
and Mehrtash Babadi. Unsupervised removal of systematic background noise from
droplet-based single-cell experiments using CellBender.
`Nature Methods` (in press), 2023.

Stephen J Fleming, John C Marioni, and Mehrtash Babadi. CellBender remove-background:
a deep generative model for unsupervised removal of background noise from scRNA-seq
datasets. bioRxiv 791699; doi: `https://doi.org/10.1101/791699 <https://doi.org/10.1101/791699>`_
See also `our preprint on bioRxiv <https://doi.org/10.1101/791699>`_.
9 changes: 0 additions & 9 deletions REQUIREMENTS-DOCKER.txt

This file was deleted.

13 changes: 0 additions & 13 deletions REQUIREMENTS.txt

This file was deleted.

8 changes: 8 additions & 0 deletions build_docker_local.sh
@@ -0,0 +1,8 @@
#!/bin/bash

tag=$(cat cellbender/__init__.py | sed -e 's?__version__ = ??' | sed "s/^'\(.*\)'$/\1/")

docker build \
-t us.gcr.io/broad-dsde-methods/cellbender:${tag} \
-f docker/Dockerfile \
.
10 changes: 10 additions & 0 deletions build_docker_release.sh
@@ -0,0 +1,10 @@
#!/bin/bash

tag=$(cat cellbender/__init__.py | sed -e 's?__version__ = ??' | sed "s/^'\(.*\)'$/\1/")
release=v${tag}

docker build \
-t us.gcr.io/broad-dsde-methods/cellbender:${tag} \
--build-arg GIT_SHA=${release} \
-f docker/DockerfileGit \
.
1 change: 1 addition & 0 deletions cellbender/__init__.py
@@ -0,0 +1 @@
__version__ = '0.3.0'
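
Both docker build scripts in this pull request derive the image tag from this `__version__` line using a `sed` pipeline. For illustration, roughly the same extraction can be sketched in Python. The function names `read_version` and `image_tag` are hypothetical and do not exist in the repository, and the regular expression is an assumption that accepts either quote style, which is slightly more permissive than the `sed` commands.

```python
import re


def read_version(init_text: str) -> str:
    """Extract the version string from the text of an __init__.py file."""
    match = re.search(
        r"^__version__\s*=\s*['\"]([^'\"]+)['\"]",
        init_text,
        flags=re.MULTILINE,
    )
    if match is None:
        raise ValueError("no __version__ assignment found")
    return match.group(1)


def image_tag(version: str,
              registry: str = "us.gcr.io/broad-dsde-methods/cellbender") -> str:
    """Compose the docker image name that the build scripts pass to 'docker build -t'."""
    return f"{registry}:{version}"


if __name__ == "__main__":
    version = read_version("__version__ = '0.3.0'\n")
    print(image_tag(version))
```

Unlike the `sed` pipeline, this sketch raises an error when no version assignment is found, rather than passing unexpected text through as the tag.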