add flake8 pre-commit #1689

Merged
merged 87 commits on Mar 18, 2021

Changes from 15 commits

Commits (87)
40dc2c3
add flake8 pre-commit
Zethson Feb 24, 2021
55737d9
fix pre-commit
Zethson Feb 24, 2021
f653e5a
add E402 to flake8 ignore
Zethson Feb 24, 2021
daf03c9
revert neighbors
Zethson Feb 24, 2021
9a53065
Merge branch 'master' into feature/flake8
Zethson Feb 24, 2021
2b79a88
fix flake8
Zethson Feb 24, 2021
617168f
address review
Zethson Feb 25, 2021
ae43e3d
fix comment character in .flake8
Zethson Feb 25, 2021
7db4e60
fix test
Zethson Feb 25, 2021
48f0648
black
Zethson Feb 25, 2021
e742c66
review round 2
Zethson Feb 25, 2021
a5b1290
review round 3
Zethson Feb 25, 2021
718a06c
readded double comments
Zethson Feb 25, 2021
2a0a19d
Ignoring E262 & reverted comment
Zethson Feb 25, 2021
ebb2b01
using self for obs_tidy
Zethson Feb 25, 2021
d2bb2a9
Restore setup.py
flying-sheep Mar 1, 2021
ecc47a2
rm call of black test (#1690)
Koncopd Feb 24, 2021
f338863
Fix print_versions for python<3.8 (#1691)
ivirshup Feb 25, 2021
ce68cd1
add codecov so we can have a badge to point to (#1693)
ivirshup Feb 25, 2021
b5cc4b6
Attempt server-side search (#1672)
ivirshup Feb 25, 2021
8b0d8f0
Fix paga_path (#1047)
flying-sheep Mar 1, 2021
24d1b2e
Switch to flit
flying-sheep Dec 3, 2020
364f320
add setup.py while leaving it ignored
flying-sheep Jan 15, 2021
8f4f87e
Update install instructions
flying-sheep Jan 14, 2021
d4f7d4c
Circumvent new pip check (see pypa/pip#9628)
flying-sheep Feb 11, 2021
3db4814
Go back to regular pip (#1702)
flying-sheep Mar 2, 2021
6a97d73
codecov comment (#1704)
ivirshup Mar 2, 2021
47af631
Use joblib for parallelism in regress_out (#1695)
ivirshup Mar 3, 2021
6d36c6b
Add sparsificiation step before sparse-dependent Scrublet calls (#1707)
pinin4fjords Mar 3, 2021
c7bd6dc
Fix version on Travis (#1713)
flying-sheep Mar 3, 2021
4eb64c2
`sc.metrics` module (add confusion matrix & Geary's C methods) (#915)
ivirshup Mar 4, 2021
c11c486
Fix clipped images in docs (#1717)
ivirshup Mar 4, 2021
f637c08
Cleanup normalize_total (#1667)
ivirshup Mar 5, 2021
1e814cb
deprecate scvi (#1703)
mjayasur Mar 9, 2021
056d183
updated ecosystem.rst to add triku (#1722)
alexmascension Mar 9, 2021
ade2975
Minor addition to contributing docs (#1726)
ivirshup Mar 10, 2021
5f7f01f
Preserve category order when groupby is a list (#1735)
gokceneraslan Mar 11, 2021
b90e730
Asymmetrical diverging colormaps and vcenter (#1551)
gokceneraslan Mar 14, 2021
8fe2897
add flake8 pre-commit
Zethson Feb 24, 2021
5a144a3
add E402 to flake8 ignore
Zethson Feb 24, 2021
55aee90
revert neighbors
Zethson Feb 24, 2021
fc9d2b6
address review
Zethson Feb 25, 2021
893a034
black
Zethson Feb 25, 2021
53948bd
using self for obs_tidy
Zethson Feb 25, 2021
95958ff
rebased
Zethson Mar 15, 2021
99e1218
rebasing
Zethson Mar 15, 2021
e030ab1
rebasing
Zethson Mar 15, 2021
38e5624
rebasing
Zethson Mar 15, 2021
9bd1f0f
Merge branch 'master' into feature/flake8
Zethson Mar 15, 2021
7529cd3
add flake8 to dev docs
Zethson Mar 15, 2021
c7b9ee4
add autopep8 to pre-commits
Zethson Mar 15, 2021
ad38870
add flake8 ignore docs
Zethson Mar 15, 2021
c968244
add exception todos
Zethson Mar 15, 2021
83e31cf
add ignore directories
Zethson Mar 15, 2021
f8b6b70
reinstated lambdas
Zethson Mar 15, 2021
9e6722a
fix tests
Zethson Mar 15, 2021
207f650
fix tests
Zethson Mar 15, 2021
7fa610e
fix tests
Zethson Mar 15, 2021
976d825
fix tests
Zethson Mar 15, 2021
e3d916c
fix tests
Zethson Mar 15, 2021
5ca8527
Add E741 to allowed flake8 violations.
Zethson Mar 16, 2021
c8b7273
Add F811 flake8 ignore for tests
Zethson Mar 16, 2021
9abc967
Fix mask comparison
Zethson Mar 16, 2021
3a83228
Fix mask comparison
Zethson Mar 16, 2021
e2a4ce7
fix flake8 config file
Zethson Mar 16, 2021
0c69d81
readded autopep8
Zethson Mar 16, 2021
d89105f
import Literal
Zethson Mar 16, 2021
5cdfa9d
revert literal import
Zethson Mar 16, 2021
da412fc
fix scatterplot pca import
Zethson Mar 16, 2021
220ac15
false comparison & unused vars
Zethson Mar 16, 2021
f373a70
Add cleaner level determination
Zethson Mar 16, 2021
5adcfae
Fix comment formatting
Zethson Mar 16, 2021
ce2fb44
Add smoother dev documentation
Zethson Mar 16, 2021
8d7e6e4
fix flake8
Zethson Mar 16, 2021
64f6d7a
Readd long comment
Zethson Mar 16, 2021
32dcf96
Assuming X as array like
Zethson Mar 16, 2021
07cab3d
fix flake8
Zethson Mar 16, 2021
699aaac
fix flake8 config
Zethson Mar 16, 2021
79619ce
reverted rank_genes
Zethson Mar 16, 2021
99a8f2e
fix disp_mean_bin formatting
Zethson Mar 16, 2021
abe0846
fix formatting
Zethson Mar 16, 2021
16a0394
add final todos
Zethson Mar 16, 2021
46f4ca7
boolean checks with is
Zethson Mar 17, 2021
ad418d8
_dpt formatting
Zethson Mar 17, 2021
10e5d76
literal fixes
Zethson Mar 17, 2021
9b1da8c
links to leafs
Zethson Mar 17, 2021
c372f0b
revert paga variable naming
ivirshup Mar 18, 2021
5 changes: 5 additions & 0 deletions .flake8
@@ -0,0 +1,5 @@
# Can't yet be moved to the pyproject.toml due to https://gitlab.com/pycqa/flake8/-/issues/428#note_251982786
[flake8]
max-line-length = 88
# switched off since they conflict with black's standards
ignore = F401, W503, E501, E203, E231, W504, E402, E126, E712, E741, E266, E262
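
For context (not part of the diff), a hedged Python sketch of the kind of code a few of these ignored checks flag — W503 and E203 are the usual conflicts with black's output, max-line-length = 88 matches black's default, and codes such as E712, E741 and E262 are style checks this PR silences globally instead of fixing in place:

# Hypothetical snippet, only to illustrate the ignored flake8 codes above.
items = [1, 2, 3, 4]
total = (items[0]
         + items[1])             # W503: line break before a binary operator (black wraps this way)
tail = items[len(items) // 2 :]  # E203: whitespace before ':' (black's slice style for complex bounds)
check = items[0] == True         # E712: comparison to True with '==' (cf. the "boolean checks with is" commit)
l = len(items)                   # E741: ambiguous one-letter name ('l' reads like '1')
total += l  ## E262: inline comment not starting with '# ' (cf. "readded double comments")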
4 changes: 4 additions & 0 deletions .pre-commit-config.yaml
@@ -3,3 +3,7 @@ repos:
rev: 20.8b1
hooks:
- id: black
- repo: https://gitlab.com/pycqa/flake8
rev: 3.8.4
hooks:
- id: flake8
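
Once the hook is in place, pre-commit runs flake8 automatically on staged files; a minimal sketch (assuming the `pre-commit` package is installed) of checking the whole tree in one go, e.g. from a CI script:

import subprocess

# Run the newly added flake8 hook over every tracked file, not just the
# files staged for the current commit.
result = subprocess.run(["pre-commit", "run", "flake8", "--all-files"], check=False)
# The hook exits non-zero when flake8 reports violations, so inspect the
# return code instead of letting check=True raise.
print("flake8 hook passed" if result.returncode == 0 else "flake8 hook reported issues")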
4 changes: 2 additions & 2 deletions scanpy/_utils.py
@@ -209,7 +209,7 @@ def get_igraph_from_adjacency(adjacency, directed=None):
g.add_edges(list(zip(sources, targets)))
try:
g.es['weight'] = weights
except:
except KeyError:
pass
if g.vcount() != adjacency.shape[0]:
logg.warning(
@@ -551,7 +551,7 @@ def warn_with_traceback(message, category, filename, lineno, file=None, line=Non
import traceback

traceback.print_stack()
log = file if hasattr(file, 'write') else sys.stderr
log = file if hasattr(file, 'write') else sys.stderr # noqa: F841
settings.write(warnings.formatwarning(message, category, filename, lineno, line))


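The two `_utils.py` edits above are recurring patterns when turning flake8 on: narrowing a bare `except:` (E722) and scoping an intentional unused assignment with a line-level `# noqa`. A standalone sketch, not the scanpy code itself:

import sys

def set_edge_weights(edge_attrs, weights):
    # Catch only the exception we expect instead of a bare `except:`,
    # which flake8 reports as E722; mirrors the `g.es['weight']` change above.
    try:
        edge_attrs['weight'] = weights
    except KeyError:
        pass

def pick_log_target(file=None):
    # F841 ("local variable is assigned to but never used") is silenced for
    # this single line rather than added to the project-wide ignore list.
    log = file if hasattr(file, 'write') else sys.stderr  # noqa: F841

set_edge_weights({}, [0.5])
pick_log_target()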
13 changes: 6 additions & 7 deletions scanpy/external/pl.py
@@ -332,15 +332,14 @@ def scrublet_score_distribution(
figsize: Optional[Tuple[float, float]] = (8, 3),
):
"""\
Plot histogram of doublet scores for observed transcriptomes and simulated doublets.
Plot histogram of doublet scores for observed transcriptomes and simulated doublets.

The histogram for simulated doublets is useful for determining the correct doublet score threshold.

The histogram for simulated doublets is useful for determining the correct doublet
score threshold.

Parameters
----------
adata
An annData object resulting from func:`~scanpy.external.scrublet`.
An annData object resulting from func:`~scanpy.external.scrublet`.
scale_hist_obs
Set y axis scale transformation in matplotlib for the plot of observed
transcriptomes (e.g. "linear", "log", "symlog", "logit")
@@ -353,9 +352,9 @@ def scrublet_score_distribution(
See also
--------
:func:`~scanpy.external.pp.scrublet`: Main way of running Scrublet, runs
preprocessing, doublet simulation (this function) and calling.
preprocessing, doublet simulation (this function) and calling.
:func:`~scanpy.external.pp.scrublet_simulate_doublets`: Run Scrublet's doublet
simulation separately for advanced usage.
simulation separately for advanced usage.
"""

threshold = adata.uns['scrublet']['threshold']
62 changes: 31 additions & 31 deletions scanpy/external/pp/_scrublet.py
@@ -1,5 +1,5 @@
from anndata import AnnData
from typing import Collection, Tuple, Optional, Union
from typing import Optional
import numpy as np

from ... import logging as logg
@@ -38,7 +38,7 @@ def scrublet(
and directly call functions of Scrublet(). You may also undertake your own
preprocessing, simulate doublets with
scanpy.external.pp.scrublet_simulate_doublets(), and run the core scrublet
function scanpy.external.pp.scrublet.scrublet().
function scanpy.external.pp.scrublet.scrublet().

.. note::
More information and bug reports `here
@@ -59,7 +59,7 @@ def scrublet(
as adata. This should have been built from adata_obs after
filtering genes and cells and selcting highly-variable genes.
sim_doublet_ratio
Number of doublets to simulate relative to the number of observed
Number of doublets to simulate relative to the number of observed
transcriptomes.
expected_doublet_rate
Where adata_sim not suplied, the estimated doublet rate for the
@@ -71,8 +71,8 @@ def scrublet(
synthetic doublets. If 1.0, each doublet is created by simply adding
the UMI counts from two randomly sampled observed transcriptomes. For
values less than 1, the UMI counts are added and then randomly sampled
at the specified rate.
knn_dist_metric
at the specified rate.
knn_dist_metric
Distance metric used when finding nearest neighbors. For list of
valid values, see the documentation for annoy (if `use_approx_neighbors`
is True) or sklearn.neighbors.NearestNeighbors (if `use_approx_neighbors`
@@ -88,16 +88,16 @@ def scrublet(
If True, center the data such that each gene has a mean of 0.
`sklearn.decomposition.PCA` will be used for dimensionality
reduction.
n_prin_comps
n_prin_comps
Number of principal components used to embed the transcriptomes prior
to k-nearest-neighbor graph construction.
to k-nearest-neighbor graph construction.
use_approx_neighbors
Use approximate nearest neighbor method (annoy) for the KNN
Use approximate nearest neighbor method (annoy) for the KNN
classifier.
get_doublet_neighbor_parents
If True, return (in .uns) the parent transcriptomes that generated the
doublet neighbors of each observed transcriptome. This information can
be used to infer the cell states that generated a given doublet state.
be used to infer the cell states that generated a given doublet state.
n_neighbors
Number of neighbors used to construct the KNN graph of observed
transcriptomes and simulated doublets. If ``None``, this is
@@ -131,7 +131,7 @@ def scrublet(
``adata.uns['scrublet']['doublet_scores_sim']``
Doublet scores for each simulated doublet transcriptome

``adata.uns['scrublet']['doublet_parents']``
``adata.uns['scrublet']['doublet_parents']``
Pairs of ``.obs_names`` used to generate each simulated doublet
transcriptome

@@ -141,9 +141,9 @@ def scrublet(
See also
--------
:func:`~scanpy.external.pp.scrublet_simulate_doublets`: Run Scrublet's doublet
simulation separately for advanced usage.
simulation separately for advanced usage.
:func:`~scanpy.external.pl.scrublet_score_distribution`: Plot histogram of doublet
scores for observed transcriptomes and simulated doublets.
scores for observed transcriptomes and simulated doublets.
"""
try:
import scrublet as sl
@@ -183,7 +183,7 @@ def scrublet(
pp.highly_variable_genes(adata_obs, subset=True)
else:
logged = pp.log1p(adata_obs, copy=True)
hvg = pp.highly_variable_genes(logged)
_ = pp.highly_variable_genes(logged)
adata_obs = adata_obs[:, logged.var['highly_variable']]

# Simulate the doublets based on the raw expressions from the normalised
@@ -255,7 +255,7 @@ def _scrublet_call_doublets(
transcriptomes and simulated doublets. This is a wrapper around the core
functions of `Scrublet <https://github.com/swolock/scrublet>`__ to allow
for flexibility in applying Scanpy filtering operations upstream. Unless
you know what you're doing you should use the main scrublet() function.
you know what you're doing you should use the main scrublet() function.

.. note::
More information and bug reports `here
@@ -291,20 +291,20 @@ def _scrublet_call_doublets(
reduction, unless `mean_center` is True.
n_prin_comps
Number of principal components used to embed the transcriptomes prior
to k-nearest-neighbor graph construction.
to k-nearest-neighbor graph construction.
use_approx_neighbors
Use approximate nearest neighbor method (annoy) for the KNN
Use approximate nearest neighbor method (annoy) for the KNN
classifier.
knn_dist_metric
Distance metric used when finding nearest neighbors. For list of
valid values, see the documentation for annoy (if `use_approx_neighbors`
is True) or sklearn.neighbors.NearestNeighbors (if `use_approx_neighbors`
is False).
get_doublet_neighbor_parents
If True, return the parent transcriptomes that generated the
doublet neighbors of each observed transcriptome. This information can
be used to infer the cell states that generated a given
doublet state.
If True, return the parent transcriptomes that generated the
doublet neighbors of each observed transcriptome. This information can
be used to infer the cell states that generated a given
doublet state.
threshold
Doublet score threshold for calling a transcriptome a doublet. If
`None`, this is set automatically by looking for the minimum between
@@ -314,7 +314,7 @@ def _scrublet_call_doublets(
predicted doublets in a 2-D embedding.
random_state
Initial state for doublet simulation and nearest neighbors.
verbose
verbose
If True, print progress updates.

Returns
@@ -331,7 +331,7 @@ def _scrublet_call_doublets(
``adata.uns['scrublet']['doublet_scores_sim']``
Doublet scores for each simulated doublet transcriptome

``adata.uns['scrublet']['doublet_parents']``
``adata.uns['scrublet']['doublet_parents']``
Pairs of ``.obs_names`` used to generate each simulated doublet transcriptome

``uns['scrublet']['parameters']``
@@ -444,16 +444,16 @@ def scrublet_simulate_doublets(
The annotated data matrix of shape ``n_obs`` × ``n_vars``. Rows
correspond to cells and columns to genes. Genes should have been
filtered for expression and variability, and the object should contain
raw expression of the same dimensions.
raw expression of the same dimensions.
layer
Layer of adata where raw values are stored, or 'X' if values are in .X.
Layer of adata where raw values are stored, or 'X' if values are in .X.
sim_doublet_ratio
Number of doublets to simulate relative to the number of observed
Number of doublets to simulate relative to the number of observed
transcriptomes. If `None`, self.sim_doublet_ratio is used.
synthetic_doublet_umi_subsampling
Rate for sampling UMIs when creating synthetic doublets. If 1.0,
each doublet is created by simply adding the UMIs from two randomly
sampled observed transcriptomes. For values less than 1, the
Rate for sampling UMIs when creating synthetic doublets. If 1.0,
each doublet is created by simply adding the UMIs from two randomly
sampled observed transcriptomes. For values less than 1, the
UMI counts are added and then randomly sampled at the specified
rate.

@@ -462,7 +462,7 @@ def scrublet_simulate_doublets(
adata : anndata.AnnData with simulated doublets in .X
if ``copy=True`` it returns or else adds fields to ``adata``:

``adata.uns['scrublet']['doublet_parents']``
``adata.uns['scrublet']['doublet_parents']``
Pairs of ``.obs_names`` used to generate each simulated doublet transcriptome

``uns['scrublet']['parameters']``
@@ -471,9 +471,9 @@ def scrublet_simulate_doublets(
See also
--------
:func:`~scanpy.external.pp.scrublet`: Main way of running Scrublet, runs
preprocessing, doublet simulation (this function) and calling.
preprocessing, doublet simulation (this function) and calling.
:func:`~scanpy.external.pl.scrublet_score_distribution`: Plot histogram of doublet
scores for observed transcriptomes and simulated doublets.
scores for observed transcriptomes and simulated doublets.
"""
try:
import scrublet as sl
16 changes: 8 additions & 8 deletions scanpy/external/pp/_scvi.py
@@ -33,12 +33,12 @@ def scvi(

Fits scVI model onto raw count data given an anndata object

scVI uses stochastic optimization and deep neural networks to aggregate information
scVI uses stochastic optimization and deep neural networks to aggregate information
across similar cells and genes and to approximate the distributions that underlie
observed expression values, while accounting for batch effects and limited sensitivity.

To use a linear-decoded Variational AutoEncoder model (implementation of [Svensson20]_.),
set linear_decoded = True. Compared to standard VAE, this model is less powerful, but can
set linear_decoded = True. Compared to standard VAE, this model is less powerful, but can
be used to inspect which genes contribute to variation in the dataset. It may also be used
for all scVI tasks, like differential expression, batch correction, imputation, etc.
However, batch correction may be less powerful as it assumes a linear model.
@@ -69,13 +69,13 @@ def scvi(
train_size
The train size, either a float between 0 and 1 or an integer for the number of training samples to use
batch_key
Column name in anndata.obs for batches.
Column name in anndata.obs for batches.
If None, no batch correction is performed
If not None, batch correction is performed per batch category
use_highly_variable_genes
If true, uses only the genes in anndata.var["highly_variable"]
subset_genes
Optional list of indices or gene names to subset anndata.
Optional list of indices or gene names to subset anndata.
If not None, use_highly_variable_genes is ignored
linear_decoder
If true, uses LDVAE model, which is an implementation of [Svensson20]_.
@@ -89,18 +89,18 @@ def scvi(
Extra arguments for UnsupervisedTrainer
model_kwargs
Extra arguments for VAE or LDVAE model

Returns
-------
If `copy` is true, anndata is returned.
If `return_posterior` is true, the posterior object is returned
If both `copy` and `return_posterior` are true,
a tuple of anndata and the posterior are returned in that order.
If both `copy` and `return_posterior` are true,
a tuple of anndata and the posterior are returned in that order.

`adata.obsm['X_scvi']` stores the latent representations
`adata.obsm['X_scvi_denoised']` stores the normalized mean of the negative binomial
`adata.obsm['X_scvi_sample_rate']` stores the mean of the negative binomial

If linear_decoder is true:
`adata.uns['ldvae_loadings']` stores the per-gene weights in the linear decoder as a
genes by n_latent matrix.
2 changes: 1 addition & 1 deletion scanpy/external/tl/_trimap.py
@@ -76,7 +76,7 @@ def trimap(

Example
-------

>>> import scanpy as sc
>>> import scanpy.external as sce
>>> pbmc = sc.datasets.pbmc68k_reduced()
4 changes: 2 additions & 2 deletions scanpy/get/get.py
@@ -96,7 +96,7 @@ def rank_genes_groups_df(
def _check_indices(
dim_df: pd.DataFrame,
alt_index: pd.Index,
dim: "Literal['obs', 'var']",
dim: "Literal['obs', 'var']", # noqa: F821
keys: List[str],
alias_index: Optional[pd.Index] = None,
use_raw: bool = False,
@@ -176,7 +176,7 @@ def _get_array_values(
X,
dim_names: pd.Index,
keys: List[str],
axis: "Literal[0, 1]",
axis: "Literal[0, 1]", # noqa: F821
backed: bool,
):
# TODO: This should be made easier on the anndata side
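The `# noqa: F821` comments in `get.py` deal with pyflakes inspecting names inside string (forward-reference) annotations; a hypothetical illustration of the same situation:

# `Literal` is referenced only inside a quoted annotation and is not imported
# here, so flake8/pyflakes would report F821 ("undefined name") without the
# line-level noqa. The annotation is never evaluated at runtime, so the
# function itself still works.
def pick_axis(axis: "Literal[0, 1]") -> int:  # noqa: F821
    return int(axis)

print(pick_axis(1))  # -> 1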
6 changes: 2 additions & 4 deletions scanpy/neighbors/__init__.py
@@ -15,13 +15,11 @@
from ..tools._utils import _choose_representation, doc_use_rep, doc_n_pcs
from .. import settings


N_DCS = 15 # default number of diffusion components
N_PCS = (
settings.N_PCS
) # Backwards compat, constants should be defined in only one place.


_Method = Literal['umap', 'gauss', 'rapids']
_MetricFn = Callable[[np.ndarray, np.ndarray], float]
# from sklearn.metrics.pairwise_distances.__doc__:
@@ -126,7 +124,7 @@ def neighbors(
**distances** : sparse matrix of dtype `float32`.
Instead of decaying weights, this stores distances for each pair of
neighbors.

Notes
-----
If `method='umap'`, it's highly recommended to install pynndescent ``pip install pynndescent``.
@@ -799,7 +797,7 @@ def compute_neighbors(
try:
if forest:
self._rp_forest = _make_forest_dict(forest)
except:
except Exception:
pass
# write indices as attributes
if write_knn_indices:
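The `compute_neighbors` hunk replaces a bare `except:` with `except Exception:`; a small sketch of why the distinction matters even when the intent is simply to ignore failures:

def safe_forest_dict(forest):
    # `except Exception:` still swallows ordinary errors, but unlike a bare
    # `except:` (flake8 E722) it does not trap KeyboardInterrupt or SystemExit,
    # which derive from BaseException rather than Exception.
    try:
        return dict(forest)
    except Exception:
        return None

print(safe_forest_dict([("a", 1)]))  # {'a': 1}
print(safe_forest_dict(42))          # None (TypeError swallowed)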
2 changes: 1 addition & 1 deletion scanpy/plotting/__init__.py
@@ -77,7 +77,7 @@
Classes
-------

These classes allow fine tuning of visual parameters.
These classes allow fine tuning of visual parameters.

.. autosummary::
:toctree: .
6 changes: 3 additions & 3 deletions scanpy/plotting/_anndata.py
@@ -552,7 +552,7 @@ def ranking(
n_rows, n_cols = 1, n_panels
else:
n_rows, n_cols = 2, int(n_panels / 2 + 0.5)
fig = pl.figure(
_ = pl.figure(
figsize=(
n_cols * rcParams['figure.figsize'][0],
n_rows * rcParams['figure.figsize'][1],
@@ -1474,7 +1474,7 @@ def tracksplot(
ymin, ymax = ax.get_ylim()
ymax = int(ymax)
ax.set_yticks([ymax])
tt = ax.set_yticklabels([str(ymax)], ha='left', va='top')
ax.set_yticklabels([str(ymax)], ha='left', va='top')
ax.spines['right'].set_position(('axes', 1.01))
ax.tick_params(
axis='y',
@@ -1960,7 +1960,7 @@ def _plot_gene_groups_brackets(
va='bottom',
rotation=rotation,
)
except:
except Exception:
pass
else:
top = left
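Several plotting hunks above drop unused assignments — `fig = pl.figure(...)` becomes `_ = pl.figure(...)` and the `tt =` binding is removed — which is how this PR satisfies flake8's unused-variable check (F841) without a noqa. A rough sketch of the convention (assumes matplotlib is installed):

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

def sketch_panel():
    # Binding the figure to `_` (or not binding `set_yticklabels`'s return
    # value at all) marks the result as deliberately discarded.
    _ = plt.figure(figsize=(4, 3))
    ax = plt.gca()
    ax.set_yticks([1])
    ax.set_yticklabels(["1"], ha='left', va='top')
    plt.close('all')

sketch_panel()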