[BUG] is get_enriched_groups(props, reps) working as intended? #2280
Comments
Hi @kafker, your understanding of
import numpy as np
import anvio
from anvio import utils
props = [1/8,3/13,5/168,329/435,67/857,11/64,0/13,0/14,3/8]
# making sure the fractions match up to the table
props
# [0.125, 0.23076923076923078, 0.02976190476190476, 0.7563218390804598, 0.07817969661610269, 0.171875, 0.0, 0.0, 0.375]
reps = [8,13,168,435,857,64,13,14,8]
# check the overall proportion
overall_portion = np.sum(np.multiply(props, reps)) / np.sum(reps)
overall_portion
# 0.26518987341772154
utils.get_enriched_groups(props,reps)
# array([False, False, False, True, False, False, False, False, True])
As you can see, it returns
So that then begs the question, why are you not seeing this result on your input data? There are a couple of ways I can think of to cause this:
Option 1 would be a bug, and we would need further info from you to help debug it. Option 2 could happen in a couple of ways depending on which enrichment program you are using. I think for
For
Finally, Option 3 is the most likely based on what you've described:
This could happen if you are running
(from here)
@meren, I think this is a bug. Why aren't we using
@kafker, are you running
Hi @ivagljiva, thank you very much for your answer. I did the enrichment analysis with "anvi-compute-functional-enrichment-across-genomes" because we are working with more than 1500 MAGs and at the moment we are not interested in GC. I think it is
Thank you!
I agree that it doesn't seem like the right approach, @kafker. I think that it would be more reasonable to change
Thank you both very much. I'll take a look at this hopefully soon and fix this for
Short description of the problem
Dear Devs,
As far as I understand, the 'associated groups' are decided before the enrichment analysis by the function get_enriched_groups. This function calculates in which groups a specific function has a proportion that is higher than the overall proportion across all groups.
In get_enriched_groups the overall proportion is calculated as follows:
overall_portion = np.sum(np.multiply(props, reps)) / np.sum(reps)
It is the number of genomes with that specific function divided by the total number of genomes. Here is an example:
So the overall_proportion for K02386 should be 419/1580=0.26
However, the associated_groups for K02386 are all groups except group7 and group8, instead of just group4 where 0.75 > 0.26.
Is my understanding of get_enriched_groups correct, or am I missing something?
Thank you
A
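The arithmetic in this description can be checked without anvi'o. The following is a minimal sketch in plain Python, not the anvi'o implementation: the helper name `groups_above_overall` is hypothetical, the per-group counts and sizes are the numerators and denominators of the fractions quoted in the reply on this issue, and the labels group1 to group9 are assumed to follow that order.

```python
# Hypothetical helper for illustration only; this is NOT anvi'o's get_enriched_groups.
def groups_above_overall(counts, sizes):
    """Return the overall proportion and one flag per group.

    counts[i] is the number of genomes in group i that carry the function,
    sizes[i] is the total number of genomes in group i.
    """
    overall = sum(counts) / sum(sizes)
    props = [c / n for c, n in zip(counts, sizes)]
    return overall, [p > overall for p in props]


# Values taken from the fractions in the reply: 1/8, 3/13, 5/168, ...
counts = [1, 3, 5, 329, 67, 11, 0, 0, 3]
sizes = [8, 13, 168, 435, 857, 64, 13, 14, 8]

overall, flags = groups_above_overall(counts, sizes)
print(round(overall, 4))  # 419 / 1580
# Group labels are assumed to follow the order of the fractions.
print([f"group{i + 1}" for i, flag in enumerate(flags) if flag])
```

Under these assumptions the overall proportion is 419/1580, and the groups whose proportion exceeds it are group4 (329/435) and group9 (3/8), which agrees with the boolean array reported in the reply.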
anvi'o version
anvio-dev