Multibatchnorm / between library batch normalization #3309

Open
Marwansha opened this issue Oct 23, 2024 · 2 comments
Comments

@Marwansha

What kind of feature would you like to request?

Additional function parameters / changed functionality / changed defaults?

Please describe your wishes

Hi,

Is there an equivalent function to multiBatchNorm in Python, or another method that can perform per-batch normalization?

My goal is to compute pseudobulk profiles per individual. Each individual has replicates that are processed across different libraries.

a- Simply summing the raw counts across replicates would likely introduce bias due to library-specific batch effects.

b- Taking the mean of normalized counts across replicates (scranPY normalized counts) doesn’t account for differences in size factors across the libraries, making normalization inconsistent between batches.

Important note:
Replicates are distributed across different libraries.

Individual x might have replicate 1 in library 1 and replicate 2 in library 3, while
Individual y might have replicate 1 in library 1 but replicate 2 in library 4.
So that's why summing raw / normalized counts directly seems inaccurate.

I’d greatly appreciate any advice.

In R, I’ve previously used multiBatchNorm from the scran package, which normalizes and scales the size factors within each batch to handle such batch effects. However, given the size of my current dataset, using R is not feasible.

@Marwansha Marwansha added Enhancement ✨ Triage 🩺 This issue needs to be triaged by a maintainer labels Oct 23, 2024
@flying-sheep flying-sheep added Area – Preprocessing 🧼 and removed Enhancement ✨ Triage 🩺 This issue needs to be triaged by a maintainer labels Dec 16, 2024
@flying-sheep
Member

We talk a little about batch correction here: https://scanpy.readthedocs.io/en/latest/api/preprocessing.html#batch-effect-correction

@AnnaChristina @Zethson what’s the best practice take on this?

@Marwansha
Author

Marwansha commented Dec 16, 2024

Thanks, will take a look.

I am mainly interested in a multiBatchNorm-like function that scales the scran size factors across batches:

https://rdrr.io/bioc/batchelor/man/multiBatchNorm.html
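
For illustration, the kind of rescaling being asked about could be sketched in plain NumPy roughly as follows. This is a simplified approximation of the idea, not a port of multiBatchNorm and not an existing scanpy function. The function name `rescale_size_factors`, the `min_mean` gene filter, and the choice of the first batch as reference are all assumptions made for this sketch. It centers the size factors within each batch, estimates each batch's coverage relative to the reference as the median ratio of per-gene mean normalized counts, and then inflates the size factors of deeper batches so that dividing by them brings every batch down to the lowest-coverage one.

```python
import numpy as np


def rescale_size_factors(counts, size_factors, batches, min_mean=1.0):
    """Rescale per-cell size factors so batches become comparable (hypothetical helper).

    counts       : dense (cells x genes) array of raw counts
    size_factors : per-cell size factors (e.g. computed within each batch)
    batches      : per-cell batch / library labels
    min_mean     : assumed abundance filter for genes used in the ratio
    """
    counts = np.asarray(counts, dtype=float)
    size_factors = np.asarray(size_factors, dtype=float)
    batches = np.asarray(batches)
    labels = np.unique(batches)

    # Step 1: center size factors to mean 1 within each batch and record
    # the per-gene mean of the normalized counts for that batch.
    centered = np.empty_like(size_factors)
    gene_means = {}
    for b in labels:
        mask = batches == b
        centered[mask] = size_factors[mask] / size_factors[mask].mean()
        gene_means[b] = (counts[mask] / centered[mask][:, None]).mean(axis=0)

    # Step 2: coverage of each batch relative to an arbitrary reference batch,
    # taken as the median per-gene ratio over sufficiently expressed genes.
    ref = gene_means[labels[0]]
    coverage = {}
    for b in labels:
        cur = gene_means[b]
        keep = ((cur + ref) / 2 >= min_mean) & (ref > 0)
        coverage[b] = np.median(cur[keep] / ref[keep])

    # Step 3: inflate size factors of deeper batches so that dividing by them
    # scales every batch down to the lowest-coverage batch.
    lowest = min(coverage.values())
    rescaled = centered.copy()
    for b in labels:
        rescaled[batches == b] *= coverage[b] / lowest
    return rescaled


# Toy example: "lib2" is sequenced twice as deeply as "lib1".
counts = np.array([[10, 20, 30], [20, 40, 60], [40, 80, 120], [20, 40, 60]])
size_factors = np.array([1.0, 2.0, 2.0, 1.0])
batches = np.array(["lib1", "lib1", "lib2", "lib2"])

rescaled = rescale_size_factors(counts, size_factors, batches)
normalized = counts / rescaled[:, None]
```

In the toy data, the normalized profiles of the two libraries end up on the same scale, so replicates of one individual from different libraries could then be aggregated on a common footing. For a sparse AnnData matrix the dense division would need to be replaced by a sparse row scaling.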

I am also wondering whether there is a Python equivalent of the edgeR CPM normalisation adjusted for library sizes?

https://rdrr.io/bioc/edgeR/man/cpm.html
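
As a rough sketch of what an untransformed CPM calculation might look like in NumPy: the helper `cpm` below is hypothetical, written for a (samples x genes) layout as in AnnData rather than edgeR's (genes x samples) layout, and the optional `norm_factors` argument is an assumption standing in for externally computed normalisation factors (for example TMM factors). It does not reproduce edgeR's `log=TRUE` / `prior.count` behaviour.

```python
import numpy as np


def cpm(counts, norm_factors=None):
    """Counts per million for a (samples x genes) matrix (hypothetical helper)."""
    counts = np.asarray(counts, dtype=float)
    # Library size is the total count per sample (row).
    lib_size = counts.sum(axis=1)
    if norm_factors is not None:
        # Effective library size = library size * normalisation factor.
        lib_size = lib_size * np.asarray(norm_factors, dtype=float)
    return counts / lib_size[:, None] * 1e6


# Toy pseudobulk matrix: two samples, three genes.
pseudobulk = np.array([[10, 30, 60], [5, 5, 10]])
cpm_values = cpm(pseudobulk)
```

Within scanpy itself, `sc.pp.normalize_total(adata, target_sum=1e6)` performs a plain total-count scaling to one million, which corresponds to the case without extra normalisation factors.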

Thanks a lot for your time and help
Marwan

2 participants