\appendix \clearpage
Top nine solutions and benchmark. This table lists the nine top-ranked solutions, the language and algorithm each used, and the average per-plate speedup relative to the k-means benchmark.
rank | handle | language | method | category | speedup |
---|---|---|---|---|---|
1 | gardn999 | Java | random forest regressor | DTR | 17x |
2 | Ardavel | C++ | Gaussian mixture model | GMM | 62x |
3 | mkagenius | C++ | modified k-means | k-means | 24x |
4 | Ramzes2 | Python/C++ | ConvNet | CNN | 10x |
5 | vladaburian | Python/C++ | Gaussian mixture model | GMM | 7x |
6 | balajipro | Python/C++ | modified k-means | k-means | 21x |
7 | AliGebily | Python | boosted tree regressor | DTR | 5x |
8 | LastEmperor | Python | modified k-means | k-means | 7x |
9 | mvaudel | Java | other | other | 55x |
benchmark | benchmark | Matlab | k-means | k-means | 1x |
\clearpage
Compound perturbagen descriptives. This table shows compound perturbagen names (pert_iname), unique ID (pert_id), time of treatment (pert_itime), dose (pert_idose), and number of replicates (num_replicates).
pert_iname | pert_id | pert_itime | pert_idose | num_replicates |
---|---|---|---|---|
abiraterone(cb-7598) | BRD-K50071428 | 24 h | 10 um | 11 |
acalabrutinib | BRD-K64034691 | 24 h | 10 um | 11 |
afatinib | BRD-K66175015 | 24 h | 10 um | 11 |
artesunate | BRD-K54634444 | 24 h | 10 um | 11 |
azithromycin | BRD-K74501079 | 24 h | 10 um | 11 |
betamethasone dipropionate (diprolene) | BRD-K58148589 | 24 h | 10 um | 11 |
CGS-21680 | BRD-A81866333 | 24 h | 10 um | 11 |
chelidonine | BRD-K32828673 | 24 h | 10 um | 11 |
clobetasol | BRD-K84443303 | 24 h | 10 um | 11 |
digoxin | BRD-A91712064 | 24 h | 10 um | 11 |
disulfiram | BRD-K32744045 | 24 h | 10 um | 10 |
emetine hcl | BRD-A77414132 | 24 h | 10 um | 10 |
eplerenone | BRD-K19761926 | 24 h | 10 um | 11 |
epothilone-a | BRD-K71823332 | 24 h | 10 um | 9 |
flumetasone | BRD-K61496577 | 24 h | 10 um | 11 |
fluocinolone | BRD-K94353609 | 24 h | 10 um | 11 |
genipin | BRD-K28824103 | 24 h | 10 um | 11 |
hydrocortisone | BRD-K93568044 | 24 h | 10 um | 10 |
hyoscyamine | BRD-K40530731 | 24 h | 10 um | 11 |
indirubin | BRD-K17894950 | 24 h | 10 um | 10 |
L-745870 | BRD-K05528470 | 24 h | 10 um | 10 |
nTZDpa | BRD-K54708045 | 24 h | 10 um | 11 |
oligomycin-a | BRD-A81541225 | 24 h | 10 um | 11 |
PRIMA1 | BRD-K15318909 | 24 h | 10 um | 11 |
RITA | BRD-K00317371 | 24 h | 10 um | 11 |
spironolactone | BRD-K90027355 | 24 h | 10 um | 11 |
tanespimycin | BRD-K81473043 | 24 h | 10 um | 11 |
tretinoin | BRD-K71879491 | 24 h | 10 um | 10 |
UB-165 | BRD-A14574269 | 24 h | 10 um | 11 |
ursolic-acid | BRD-K68185022 | 24 h | 10 um | 11 |
WAY-161503 | BRD-A62021152 | 24 h | 10 um | 11 |
ZM-39923 | BRD-K40624912 | 24 h | 10 um | 11 |
\clearpage
Short-hairpin RNA (shRNA) perturbagen descriptives. This table shows shRNA perturbagen names (pert_iname), unique ID (pert_id), and number of replicates (num_replicates).
pert_iname | pert_id | num_replicates |
---|---|---|
ABCB6 | TRCN0000060320 | 4 |
ADI1 | TRCN0000052275 | 4 |
ALDOA | TRCN0000052504 | 4 |
ANXA7 | TRCN0000056304 | 4 |
ARHGAP1 | TRCN0000307776 | 4 |
ASAH1 | TRCN0000029402 | 4 |
ATMIN | TRCN0000141397 | 4 |
ATP2C1 | TRCN0000043279 | 4 |
B3GNT1 | TRCN0000035909 | 4 |
BAX | TRCN0000033471 | 4 |
BIRC5 | TRCN0000073718 | 4 |
BLCAP | TRCN0000161355 | 4 |
BLVRA | TRCN0000046391 | 4 |
BNIP3L | TRCN0000007847 | 4 |
CALU | TRCN0000053792 | 4 |
CCDC85B | TRCN0000242754 | 4 |
CCND1 | TRCN0000040038 | 4 |
CD97 | TRCN0000008234 | 4 |
CHMP4A | TRCN0000150154 | 4 |
CNOT4 | TRCN0000015216 | 4 |
DDR1 | TRCN0000000618 | 4 |
DDX10 | TRCN0000218747 | 4 |
DECR1 | TRCN0000046516 | 4 |
DNM1L | TRCN0000001097 | 3 |
ECH1 | TRCN0000052455 | 4 |
EIF4EBP1 | TRCN0000040206 | 4 |
EMPTY_VECTOR | TRCN0000208001 | 15 |
ETFB | TRCN0000064432 | 4 |
FDFT1 | TRCN0000036327 | 4 |
GALE | TRCN0000049461 | 4 |
GFP | TRCN0000072181 | 16 |
GRN | TRCN0000115978 | 4 |
GTPBP8 | TRCN0000343727 | 4 |
HDGFRP3 | TRCN0000107348 | 4 |
HIST1H2BK | TRCN0000106710 | 4 |
IKBKAP | TRCN0000037871 | 4 |
INPP4B | TRCN0000230838 | 4 |
INSIG1 | TRCN0000134159 | 4 |
ITFG1 | TRCN0000343702 | 3 |
JMJD6 | TRCN0000063340 | 4 |
LBR | TRCN0000060460 | 4 |
LGMN | TRCN0000029255 | 4 |
LPGAT1 | TRCN0000116066 | 4 |
LSM6 | TRCN0000074719 | 4 |
MAPKAPK2 | TRCN0000002285 | 4 |
MAPKAPK3 | TRCN0000006154 | 4 |
MAPKAPK5 | TRCN0000000684 | 4 |
MIF | TRCN0000056818 | 4 |
MRPL12 | TRCN0000072655 | 4 |
NT5DC2 | TRCN0000350758 | 4 |
NUP88 | TRCN0000145079 | 4 |
PARP2 | TRCN0000007933 | 4 |
PLCB3 | TRCN0000000431 | 4 |
POLE2 | TRCN0000233181 | 4 |
PPIE | TRCN0000049371 | 4 |
PRKAG2 | TRCN0000003146 | 4 |
PSMB10 | TRCN0000010833 | 4 |
PTPN6 | TRCN0000011052 | 4 |
RAB11FIP2 | TRCN0000322640 | 4 |
RALB | TRCN0000072956 | 4 |
RHEB | TRCN0000010425 | 3 |
RNF167 | TRCN0000004100 | 4 |
RPN1 | TRCN0000072588 | 4 |
SLC25A4 | TRCN0000044967 | 4 |
SNX11 | TRCN0000127684 | 4 |
STK25 | TRCN0000006270 | 4 |
STUB1 | TRCN0000007525 | 4 |
STXBP1 | TRCN0000147480 | 4 |
SYPL1 | TRCN0000059926 | 4 |
TATDN2 | TRCN0000049828 | 4 |
TM9SF3 | TRCN0000059371 | 4 |
TMEM110 | TRCN0000127960 | 4 |
TMEM50A | TRCN0000129223 | 4 |
trcn0000014632 | TRCN0000014632 | 4 |
trcn0000040123 | TRCN0000040123 | 4 |
trcn0000220641 | TRCN0000220641 | 4 |
trcn0000221408 | TRCN0000221408 | 4 |
trcn0000221644 | TRCN0000221644 | 4 |
TSKU | TRCN0000005222 | 4 |
UGDH | TRCN0000028108 | 4 |
USP14 | TRCN0000007428 | 4 |
USP6NL | TRCN0000253832 | 4 |
VAT1 | TRCN0000038193 | 4 |
VDAC1 | TRCN0000029126 | 4 |
WIPF2 | TRCN0000029833 | 4 |
YME1L1 | TRCN0000073864 | 4 |
ZW10 | TRCN0000155335 | 4 |
\clearpage
- The contest data are available in the Clue.io data library: https://clue.io/data/CT#CT_DPEAK.
- The source code of the solutions, along with Docker containers that include all the dependencies needed to run the code, is available in the CMap GitHub repository: https://github.com/cmap/gene_deconvolution_challenge.
- A Docker container used for converting the deconvolution data to differential expression values is available on Docker Hub: https://hub.docker.com/r/cmap/sig_2to4_tool.
- A collection of R scripts used to generate the tables and figures is available in the CMap GitHub repository: https://github.com/cmap/deconv.
\clearpage
Scoring function. This appendix describes the scoring function used in the contest to evaluate the performance of the competitors' submissions.
Submissions were scored based on a scoring function that combines measures of accuracy and computational speed. Accuracy measures were obtained by comparing the contestant's predictions, which were derived from
The scoring function combines two measures of accuracy: correlation and AUC, which are applied to deconvoluted (
The first accuracy component is based on the Spearman rank correlation between the predicted
For a given dataset
The second component of the scoring function is based on the Area Under the receiver operating characteristic Curve (AUC) that uses the competitor's DE values at various thresholds to predict the UNI's DE values being higher than 2 ("high") or lower than -2 ("low").
For a given dataset
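As an illustration of the AUC component described above, the following is a minimal sketch, not the contest's scoring code. It assumes NumPy is available and that the competitor's and UNI's DE values are given as arrays of the same shape; the function names `auc_from_scores` and `de_auc` are hypothetical. The "high" and "low" AUCs are returned separately because the way the contest combined them is not reproduced here.

```python
import numpy as np


def auc_from_scores(scores, labels):
    """Rank-based AUC (Mann-Whitney U statistic); tied scores get average ranks."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    n_pos = int(labels.sum())
    n_neg = int(labels.size - n_pos)
    if n_pos == 0 or n_neg == 0:
        return float("nan")  # AUC is undefined when one class is empty

    order = np.argsort(scores, kind="mergesort")
    sorted_scores = scores[order]
    ranks = np.empty(scores.size, dtype=float)
    i = 0
    while i < scores.size:
        j = i
        # Extend j over the run of values tied with sorted_scores[i].
        while j + 1 < scores.size and sorted_scores[j + 1] == sorted_scores[i]:
            j += 1
        ranks[order[i:j + 1]] = 0.5 * (i + j) + 1.0  # average 1-based rank
        i = j + 1

    u = ranks[labels].sum() - n_pos * (n_pos + 1) / 2.0
    return float(u / (n_pos * n_neg))


def de_auc(pred_de, uni_de, threshold=2.0):
    """AUCs for predicting UNI DE > threshold ("high") and < -threshold ("low")."""
    pred = np.asarray(pred_de, dtype=float).ravel()
    uni = np.asarray(uni_de, dtype=float).ravel()
    auc_high = auc_from_scores(pred, uni > threshold)
    # For the "low" class, more negative predictions should rank higher.
    auc_low = auc_from_scores(-pred, uni < -threshold)
    return auc_high, auc_low
```

Sweeping a threshold over the competitor's DE values and integrating the resulting ROC curve is equivalent to this rank-based formula, which avoids constructing the curve explicitly.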
These accuracy components were integrated into a single aggregate score:
$$
\text{SCORE} =
\text{SCORE}_{\text{max}} \cdot (\max(\text{COR}_{p}, 0))^2
\cdot \text{AUC}_{p}
\cdot \exp(- T_{\text{solution}} / (3 \cdot T_{\text{benchmark}})),
$$
where
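The aggregate score can be sketched in a few lines of Python. This is an illustrative transcription of the formula, not the contest's scoring code. The function and argument names are hypothetical, and the value of the maximum score is left as a required argument rather than assumed.

```python
import math


def aggregate_score(cor, auc, t_solution, t_benchmark, score_max):
    """Combine accuracy and speed into one score, following the formula above.

    cor         -- correlation-based accuracy component (COR_p)
    auc         -- AUC-based accuracy component (AUC_p)
    t_solution  -- run time of the submitted solution
    t_benchmark -- run time of the benchmark, in the same units
    score_max   -- maximum attainable score (SCORE_max)
    """
    # Negative correlations are clipped to zero before squaring.
    accuracy = max(cor, 0.0) ** 2 * auc
    # Speed penalty: decays exponentially with run time relative to the benchmark.
    speed = math.exp(-t_solution / (3.0 * t_benchmark))
    return score_max * accuracy * speed
```

Under this formula, a solution that runs three times longer than the benchmark keeps a factor of exp(-1), roughly 37%, of its accuracy-based score, and any solution with a non-positive correlation scores zero regardless of speed.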
\clearpage
L1000 experimental scheme. The L1000 assay uses Luminex bead-based fluorescent scanners to detect gene expression changes resulting from treating cultured human cells with chemical or genetic perturbations [Subramanian 2017]. Experiments are performed in 384-well plate format, where each well contains an independent sample. The Luminex scanner is able to distinguish between 500 different bead types, or colors, which CMap uses to measure the expression levels of 978 landmark genes using two detection approaches.
In the first detection mode, called
By contrast, in the