summary.nblastfafb subsets results before ordering #30

Open
schlegelp opened this issue Jun 26, 2017 · 3 comments

Comments

@schlegelp

summary.nblastfafb subsets to the top N hits and only then calculates mu_score and reorders by it. This leads to two potential problems:
(a) NBLAST hits with a high mu_score that are not among the top N by forward score will not show up in the summary.
(b) Probably less of a problem: if the user changes the original order (which is based on forward score), the summary will differ. This could of course also be intended.
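
Problem (a) can be illustrated with a small toy example. This is only a sketch on made-up data, not the actual summary.nblastfafb code: the hit names, the scores and the column names `forward`, `reverse` and `mu` are all assumptions chosen for illustration.

```r
# Toy scores for five hypothetical hits (names and values are made up).
scores <- data.frame(
  hit     = c("A", "B", "C", "D", "E"),
  forward = c(0.60, 0.58, 0.55, 0.52, 0.50),
  reverse = c(0.40, 0.45, 0.42, 0.70, 0.30),
  stringsAsFactors = FALSE
)
# Mean score: (forward + reverse) / 2
scores$mu <- (scores$forward + scores$reverse) / 2

n <- 3

# Subset first (top n by forward score), then order by mean score.
top_fwd <- head(scores[order(scores$forward, decreasing = TRUE), ], n)
subset_then_order <- top_fwd[order(top_fwd$mu, decreasing = TRUE), ]

# Order by mean score first, then subset.
order_then_subset <- head(scores[order(scores$mu, decreasing = TRUE), ], n)

subset_then_order$hit  # "B" "A" "C"
order_then_subset$hit  # "D" "B" "A"
```

Hit "D" has the highest mean score but only the fourth-highest forward score, so it is dropped when the subset is taken before ordering.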

@mmc46

mmc46 commented Dec 1, 2017

This is affecting the mean score calculation for FAFB vs FC uPNs: as mentioned in the previous comment, some neurons with a high mean score (but not a high forward score) are being lost from the summary.

library(elmr)
library(flycircuit)
#Collect FAFB dps
fafbdps=fetchdps_fafb('annotation:WTPN2017_olfactory_uPN_right')
#Collect uPNs from FlyCircuit
upns=fc_gene_name(subset(annotation,annotation_class=='NeuronSubType' & grepl('uPN',text))$neuron_idid)
#Keep only the ones in good_images(good registration)
good_images=scan(fc_download_data("http://jefferislab.org/si/nblast/flycircuit/good_images.txt"),what='', quiet = TRUE)
upns=intersect(upns, good_images)
upns=setdiff(upns, c("DvGlutMARCM-F1364_seg1"))
devtools::source_gist("bbaf5d53353b3944c090", filename = "FlyCircuitStartupNat.R")
fcupns=dps[upns]
#calculate forward score
fafbsc=nblast(query = fafbdps, target = fcupns, normalised = TRUE)
#calculate reverse score
fafbscr=nblast(query = fcupns, target = fafbdps, normalised = TRUE)
#calculate mean score: (forward + reverse)/2
fafbscmu=(fafbsc + t(fafbscr))/2
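#sort each query's mean scores in decreasing order
#note: the row names of the resulting matrix come from the first column only
fafbscmu_sort=sapply(names(fafbdps), function(x) sort(fafbscmu[,x], decreasing=TRUE))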

#For skid 16, comparing
head(fafbscmu_sort)[,1]

#to
sc16=nblast_fafb(16)
summary(sc16)

The top hit by mean score in the summary is DvGlutMARCM-F004348_seg001 (0.54579523), but the real top hit, from fafbscmu_sort, is FruMARCM-F000734_seg001 (0.5605835).
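
The mean score step above can be sketched on toy matrices to show why the reverse scores are transposed and how to keep per-query hit names when sorting. The matrices, the names `fwd_sc`, `rev_sc` and `mu`, and all values are assumptions for illustration; the sketch only assumes that score matrices have targets as rows and queries as columns, as the indexing `fafbscmu[,x]` in the code above implies.

```r
# Toy forward scores: rows = targets, columns = queries (values are made up).
fwd_sc <- matrix(c(0.6, 0.5, 0.4,
                   0.3, 0.7, 0.2),
                 nrow = 3,
                 dimnames = list(c("t1", "t2", "t3"), c("q1", "q2")))

# Toy reverse scores: query and target roles are swapped, so rows = q, columns = t.
rev_sc <- matrix(c(0.2, 0.9,
                   0.9, 0.3,
                   0.5, 0.4),
                 nrow = 2,
                 dimnames = list(c("q1", "q2"), c("t1", "t2", "t3")))

# Transpose the reverse scores so both matrices line up, then average.
mu <- (fwd_sc + t(rev_sc)) / 2

# Sort per query; simplify = FALSE returns a list, so every query keeps
# its own hit names instead of inheriting those of the first column.
mu_sorted <- sapply(colnames(mu),
                    function(q) sort(mu[, q], decreasing = TRUE),
                    simplify = FALSE)

names(mu_sorted$q1)  # "t2" "t3" "t1"
names(mu_sorted$q2)  # "t1" "t2" "t3"
```

For query q1 the best hit by forward score is t1, while the best hit by mean score is t2, which mirrors the discrepancy reported for skid 16.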

@jefferis
Collaborator

jefferis commented Dec 1, 2017 via email

@mmc46

mmc46 commented Dec 1, 2017

Yes, at n=17

The forward score for FruMARCM-F000734_seg001 is 0.5272133
