Fixing Scanpy 1.9.3 error for mixed datatype column #135

Open
wants to merge 31 commits into base: develop

31 commits
4ef26a5
Update metadata file in scanpy-scripts-tests.bats
anilthanki Jun 13, 2024
47598b6
changing string to int
anilthanki Jun 13, 2024
feac1bd
change int loop to print single number
anilthanki Jun 13, 2024
226a08d
Adds empty lines
anilthanki Jun 13, 2024
58239a6
fixes empty lines
anilthanki Jun 13, 2024
a22032c
Update scanpy-scripts-tests.bats
anilthanki Jun 14, 2024
5250083
adds functionality to change multiple datatype to string
anilthanki Jun 14, 2024
dfb85b0
satisfying `black` errors
anilthanki Jun 14, 2024
e544e16
Satisfying `black` error
anilthanki Jun 14, 2024
f95d276
Fixes `black` error
anilthanki Jun 14, 2024
37748c1
Update `black` operation in python-package.yml
anilthanki Jun 14, 2024
fb313c9
reverting changes in scanpy-scripts-tests.bats
anilthanki Jun 14, 2024
a47de2c
changes var type and moves code to right column
anilthanki Jul 4, 2024
6e3e0cc
reverting changes
anilthanki Jul 4, 2024
366ca6a
Update _mnn.py
anilthanki Jul 4, 2024
192edbc
fixes anndata version
anilthanki Jul 5, 2024
b70d130
Update __init__.py
anilthanki Jul 5, 2024
d4d7e35
removes Anndata
anilthanki Jul 5, 2024
e0a20ac
fixes lint
anilthanki Jul 5, 2024
88ea254
fixes `black` test
anilthanki Jul 5, 2024
3a0ebf5
reverting `black` command
anilthanki Jul 5, 2024
024d3bf
applying mixed column test
anilthanki Jul 5, 2024
99c9f10
revert scanpy-scripts-tests.bats
anilthanki Jul 5, 2024
0db8224
Update _read.py - fixes mixed data types in anndata
anilthanki Aug 22, 2024
4cacb39
Update _read.py - removes empty lines
anilthanki Aug 22, 2024
2b0ceed
Update _read.py
anilthanki Aug 22, 2024
6b28282
Update python-package.yml - temporary update black to check errs
anilthanki Aug 22, 2024
4683fcd
Update _read.py - fixes black test
anilthanki Aug 22, 2024
193b8d4
Update python-package.yml - reverting black test
anilthanki Aug 22, 2024
609144f
Update scanpy-scripts-tests.bats - to test anndata changes
anilthanki Aug 22, 2024
943ca3c
Update scanpy-scripts-tests.bats - adds additional column for test
anilthanki Aug 22, 2024
2 changes: 1 addition & 1 deletion scanpy-scripts-tests.bats
@@ -180,7 +180,7 @@ setup() {
         skip "$singlet_obs exists"
     fi

-    run rm -rf $batch_obs && echo -e "batch\n$(printf "%0.sbatch1\n" {1..1350})\n$(printf "%0.sbatch2\n" {1..1350})" > $batch_obs
+    run rm -rf $batch_obs && echo -e "batch\tadditional_column\n$(for i in {1..1350}; do echo -e "batch1\tdata$i"; done)\n$(for i in {1..1350}; do echo -e "batch2\tinfo$i"; done)" > $batch_obs

     [ "$status" -eq 0 ]
     [ -f "$batch_obs" ]
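
For readers who want to check the shape of the metadata file the new test line produces, here is a minimal Python sketch that builds the same tab-separated layout. The in-memory string stands in for `$batch_obs`, and parsing it with pandas is only an illustration, not a claim about how the test suite consumes the file.

```python
import io

import pandas as pd

# Header plus 1350 "batch1" rows and 1350 "batch2" rows, tab-separated,
# mirroring the echo/for-loop construction in the bats line.
rows = ["batch\tadditional_column"]
rows += [f"batch1\tdata{i}" for i in range(1, 1351)]
rows += [f"batch2\tinfo{i}" for i in range(1, 1351)]
text = "\n".join(rows) + "\n"

# Parse the text back to confirm the layout (illustrative only).
table = pd.read_csv(io.StringIO(text), sep="\t", header=0)
print(table.shape)
print(table.head(2))
```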
22 changes: 22 additions & 0 deletions scanpy_scripts/lib/_read.py
@@ -42,4 +42,26 @@ def read_10x(
             right_index=True,
             suffixes=(False, False),
         )
+
+    # Convert mixed dtype columns to 'string' type to preserve all information
+    obs_mixed_columns = columns_with_multiple_dtypes(adata.obs)
+
+    for column in obs_mixed_columns:
+        adata.obs[column] = adata.obs[column].astype("str")
+
+    var_mixed_columns = columns_with_multiple_dtypes(adata.var)
+
+    for column in var_mixed_columns:
+        adata.var[column] = adata.var[column].astype("str")
+
     return adata
+
+
+def columns_with_multiple_dtypes(df):
+    mixed_dtype_columns = []
+    for column in df.columns:
+        # Get unique dtypes in the column
+        unique_dtypes = df[column].apply(type).unique()
+        if len(unique_dtypes) > 1:
+            mixed_dtype_columns.append(column)
+    return mixed_dtype_columns
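
To illustrate what the helper flags, the following minimal sketch runs the same per-cell type check on a small DataFrame and then applies the same `astype("str")` cast. The table, its column names, and its values are hypothetical and are not taken from the PR's test data.

```python
import pandas as pd


def columns_with_multiple_dtypes(df):
    """Return names of columns whose cells hold more than one Python type."""
    mixed_dtype_columns = []
    for column in df.columns:
        # Map each cell to its Python type and collect the distinct types
        unique_dtypes = df[column].apply(type).unique()
        if len(unique_dtypes) > 1:
            mixed_dtype_columns.append(column)
    return mixed_dtype_columns


# Hypothetical metadata table: 'sample' mixes int and str, 'batch' is uniform
obs = pd.DataFrame(
    {
        "batch": ["batch1", "batch2", "batch2"],
        "sample": [1, "two", 3],
    },
    index=["cell_a", "cell_b", "cell_c"],
)

mixed = columns_with_multiple_dtypes(obs)

# Cast only the flagged columns to strings, as the diff does for obs and var
for column in mixed:
    obs[column] = obs[column].astype("str")

print(mixed)
print(obs["sample"].tolist())
```

Because the check compares the Python type of every cell, a string column with missing values is also flagged: pandas represents a missing cell in an object column as a float `NaN`, which counts as a second type.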