colRowWeightedMeans

hb edited this page Oct 27, 2015 · 6 revisions

matrixStats: Benchmark report


colWeightedMeans() and rowWeightedMeans() benchmarks

This report benchmarks the performance of colWeightedMeans() and rowWeightedMeans() against alternative methods.

Alternative methods

  • apply() + weighted.mean()
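
For reference, the alternative method listed above calls weighted.mean() once per column (or row). The following is a minimal base-R sketch of what that computes, assuming a small made-up matrix and weight vector; the names X_demo and w_demo are illustrative only and are not part of this report or of matrixStats. It also shows one vectorized way to obtain the same values, which is not necessarily how colWeightedMeans() is implemented.

```r
# Illustrative data (hypothetical; not the benchmark data used in this report)
set.seed(42)
X_demo <- matrix(runif(12, min = -100, max = 100), nrow = 4, ncol = 3)
w_demo <- runif(nrow(X_demo))  # one weight per row

# Alternative method from the report: one weighted.mean() call per column
by_apply <- apply(X_demo, MARGIN = 2, FUN = weighted.mean, w = w_demo)

# Vectorized equivalent: w_demo is recycled down each column of X_demo
by_colsums <- colSums(X_demo * w_demo) / sum(w_demo)

# Both approaches should agree up to floating-point error
stopifnot(isTRUE(all.equal(by_apply, by_colsums)))
```

The row-wise case is the same computation applied to the transposed matrix, which is why the benchmarks below call t(X) before timing rowWeightedMeans().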

Data

> rmatrix <- function(nrow, ncol, mode = c("logical", 
+     "double", "integer", "index"), range = c(-100, +100), naProb = 0) {
+     mode <- match.arg(mode)
+     n <- nrow * ncol
+     if (mode == "logical") {
+         X <- sample(c(FALSE, TRUE), size = n, replace = TRUE)
+     }
+     else if (mode == "index") {
+         X <- seq_len(n)
+         mode <- "integer"
+     }
+     else {
+         X <- runif(n, min = range[1], max = range[2])
+     }
+     storage.mode(X) <- mode
+     if (naProb > 0) 
+         X[sample(n, size = naProb * n)] <- NA
+     dim(X) <- c(nrow, ncol)
+     X
+ }
> rmatrices <- function(scale = 10, seed = 1, ...) {
+     set.seed(seed)
+     data <- list()
+     data[[1]] <- rmatrix(nrow = scale * 1, ncol = scale * 1, 
+         ...)
+     data[[2]] <- rmatrix(nrow = scale * 10, ncol = scale * 10, 
+         ...)
+     data[[3]] <- rmatrix(nrow = scale * 100, ncol = scale * 1, 
+         ...)
+     data[[4]] <- t(data[[3]])
+     data[[5]] <- rmatrix(nrow = scale * 10, ncol = scale * 100, 
+         ...)
+     data[[6]] <- t(data[[5]])
+     names(data) <- sapply(data, FUN = function(x) paste(dim(x), 
+         collapse = "x"))
+     data
+ }
> data <- rmatrices(mode = "double")
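
The list returned by rmatrices() is indexed by name in the sections below (for example data[["10x10"]]). As a small sketch of where those names come from, assuming the default scale = 10 used above, the same paste(dim(x), collapse = "x") rule can be applied to the six dimension pairs directly; the variables dims and keys are illustrative names introduced here.

```r
# Dimensions of the six matrices created by rmatrices() when scale = 10,
# in the order they are stored (entries 4 and 6 are transposes of 3 and 5)
dims <- list(c(10L, 10L), c(100L, 100L), c(1000L, 10L),
             c(10L, 1000L), c(100L, 1000L), c(1000L, 100L))

# Same naming rule as in rmatrices(): join the dimensions with "x"
keys <- sapply(dims, FUN = function(d) paste(d, collapse = "x"))
print(keys)
```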

Results

10x10 matrix

> X <- data[["10x10"]]
> w <- runif(nrow(X))
> gc()
          used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 1685324 90.1    2637877 140.9  2637877 140.9
Vcells 1709204 13.1    3344261  25.6 46816319 357.2
> colStats <- microbenchmark(colWeightedMeans = colWeightedMeans(X, 
+     w = w, na.rm = FALSE), `apply+weigthed.mean` = apply(X, MARGIN = 2, 
+     FUN = weighted.mean, w = w, na.rm = FALSE), unit = "ms")
> X <- t(X)
> gc()
          used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 1683908 90.0    2637877 140.9  2637877 140.9
Vcells 1704230 13.1    3344261  25.6 46816319 357.2
> rowStats <- microbenchmark(rowWeightedMeans = rowWeightedMeans(X, 
+     w = w, na.rm = FALSE), `apply+weigthed.mean` = apply(X, MARGIN = 1, 
+     FUN = weighted.mean, w = w, na.rm = FALSE), unit = "ms")

Table: Benchmarking of colWeightedMeans() and apply+weigthed.mean() on 10x10 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

expr                      min         lq       mean     median         uq       max
colWeightedMeans     0.037502  0.0388740  0.0444442  0.0401330  0.0409645  0.486712
apply+weigthed.mean  0.279034  0.2836725  0.2902293  0.2855485  0.2878485  0.660192

expr                     min       lq    mean    median        uq       max
colWeightedMeans     1.00000  1.00000  1.0000  1.000000  1.000000  1.000000
apply+weigthed.mean  7.44051  7.29723  6.5302  7.115055  7.026779  1.356433

Table: Benchmarking of rowWeightedMeans() and apply+weigthed.mean() on 10x10 data (transposed). The top panel shows times in milliseconds and the bottom panel shows relative times.

expr                      min         lq       mean     median         uq       max
rowWeightedMeans     0.078984  0.0812575  0.0900678  0.0832145  0.0846085  0.762360
apply+weigthed.mean  0.279135  0.2834985  0.2877658  0.2858550  0.2890780  0.402149

expr                     min       lq      mean    median        uq        max
rowWeightedMeans     1.00000  1.00000  1.000000  1.000000  1.000000  1.0000000
apply+weigthed.mean  3.53407  3.48889  3.194989  3.435159  3.416654  0.5275054

Figure: Benchmarking of colWeightedMeans() and apply+weigthed.mean() on 10x10 data as well as rowWeightedMeans() and apply+weigthed.mean() on the same data transposed. Outliers are displayed as crosses. Times are in milliseconds.
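
The relative-time panels are the timings of each expression divided by the timings of the first (reference) expression. A short sketch of that arithmetic, using the median values reported for the 10x10 column benchmark; the variable names are illustrative.

```r
# Median times in milliseconds, copied from the 10x10 column benchmark table
median_col   <- 0.0401330  # colWeightedMeans (reference row)
median_apply <- 0.2855485  # apply+weigthed.mean

# Relative time: alternative method divided by the reference method
rel <- median_apply / median_col
print(round(rel, 6))  # compare with the 7.115055 reported in the relative panel
```

A relative time below 1 in the max column (as in some tables of this report) therefore means only that the reference expression had the larger single worst-case timing, not that it was slower overall.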

Table: Benchmarking of colWeightedMeans() and rowWeightedMeans() on 10x10 data (original and transposed). The top panel shows times in microseconds and the bottom panel shows relative times.

expr                 min       lq      mean   median       uq      max
colWeightedMeans  37.502  38.8740  44.44416  40.1330  40.9645  486.712
rowWeightedMeans  78.984  81.2575  90.06785  83.2145  84.6085  762.360

expr                   min        lq     mean    median       uq       max
colWeightedMeans  1.000000  1.000000  1.00000  1.000000  1.00000  1.000000
rowWeightedMeans  2.106128  2.090279  2.02654  2.073468  2.06541  1.566347

Figure: Benchmarking of colWeightedMeans() and rowWeightedMeans() on 10x10 data (original and transposed). Outliers are displayed as crosses. Times are in milliseconds.

100x100 matrix

> X <- data[["100x100"]]
> w <- runif(nrow(X))
> gc()
          used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 1684461 90.0    2637877 140.9  2637877 140.9
Vcells 1706199 13.1    3344261  25.6 46816319 357.2
> colStats <- microbenchmark(colWeightedMeans = colWeightedMeans(X, 
+     w = w, na.rm = FALSE), `apply+weigthed.mean` = apply(X, MARGIN = 2, 
+     FUN = weighted.mean, w = w, na.rm = FALSE), unit = "ms")
> X <- t(X)
> gc()
          used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 1684455 90.0    2637877 140.9  2637877 140.9
Vcells 1716242 13.1    3344261  25.6 46816319 357.2
> rowStats <- microbenchmark(rowWeightedMeans = rowWeightedMeans(X, 
+     w = w, na.rm = FALSE), `apply+weigthed.mean` = apply(X, MARGIN = 1, 
+     FUN = weighted.mean, w = w, na.rm = FALSE), unit = "ms")

Table: Benchmarking of colWeightedMeans() and apply+weigthed.mean() on 100x100 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

expr                      min        lq       mean     median         uq       max
colWeightedMeans     0.234753  0.240696  0.2448626  0.2426665  0.2451275   0.30900
apply+weigthed.mean  3.397504  3.426096  3.7220759  3.4367345  3.4508525  10.56193

expr                      min        lq      mean    median        uq       max
colWeightedMeans      1.00000   1.00000   1.00000   1.00000   1.00000   1.00000
apply+weigthed.mean  14.47268  14.23412  15.20067  14.16238  14.07779  34.18101

Table: Benchmarking of rowWeightedMeans() and apply+weigthed.mean() on 100x100 data (transposed). The top panel shows times in milliseconds and the bottom panel shows relative times.

expr                      min        lq       mean     median         uq       max
rowWeightedMeans     0.411241  0.415599  0.4947316  0.4190985  0.4286955   7.25107
apply+weigthed.mean  3.409846  3.426607  3.6565061  3.4353470  3.4518560  10.54385

expr                    min        lq      mean    median     uq      max
rowWeightedMeans     1.0000  1.000000  1.000000  1.000000  1.000  1.00000
apply+weigthed.mean  8.2916  8.244985  7.390889  8.196992  8.052  1.45411

Figure: Benchmarking of colWeightedMeans() and apply+weigthed.mean() on 100x100 data as well as rowWeightedMeans() and apply+weigthed.mean() on the same data transposed. Outliers are displayed as crosses. Times are in milliseconds.

Table: Benchmarking of colWeightedMeans() and rowWeightedMeans() on 100x100 data (original and transposed). The top panel shows times in microseconds and the bottom panel shows relative times.

expr                  min       lq      mean    median        uq      max
colWeightedMeans  234.753  240.696  244.8626  242.6665  245.1275   309.00
rowWeightedMeans  411.241  415.599  494.7316  419.0985  428.6955  7251.07

expr                   min        lq      mean    median        uq       max
colWeightedMeans  1.000000  1.000000  1.000000  1.000000  1.000000   1.00000
rowWeightedMeans  1.751803  1.726655  2.020445  1.727055  1.748867  23.46625

Figure: Benchmarking of colWeightedMeans() and rowWeightedMeans() on 100x100 data (original and transposed). Outliers are displayed as crosses. Times are in milliseconds.

1000x10 matrix

> X <- data[["1000x10"]]
> w <- runif(nrow(X))
> gc()
          used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 1684494 90.0    2637877 140.9  2637877 140.9
Vcells 1707333 13.1    3344261  25.6 46816319 357.2
> colStats <- microbenchmark(colWeightedMeans = colWeightedMeans(X, 
+     w = w, na.rm = FALSE), `apply+weigthed.mean` = apply(X, MARGIN = 2, 
+     FUN = weighted.mean, w = w, na.rm = FALSE), unit = "ms")
> X <- t(X)
> gc()
          used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 1684488 90.0    2637877 140.9  2637877 140.9
Vcells 1717376 13.2    3344261  25.6 46816319 357.2
> rowStats <- microbenchmark(rowWeightedMeans = rowWeightedMeans(X, 
+     w = w, na.rm = FALSE), `apply+weigthed.mean` = apply(X, MARGIN = 1, 
+     FUN = weighted.mean, w = w, na.rm = FALSE), unit = "ms")

Table: Benchmarking of colWeightedMeans() and apply+weigthed.mean() on 1000x10 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

expr                      min        lq       mean     median         uq       max
colWeightedMeans     0.354732  0.384952  0.4614925  0.4144925  0.4433215  5.168358
apply+weigthed.mean  1.350711  1.372848  1.5383052  1.3961560  1.4192780  6.129751

expr                      min        lq      mean    median        uq       max
colWeightedMeans     1.000000  1.000000  1.000000  1.000000  1.000000  1.000000
apply+weigthed.mean  3.807694  3.566284  3.333326  3.368351  3.201464  1.186015

Table: Benchmarking of rowWeightedMeans() and apply+weigthed.mean() on 1000x10 data (transposed). The top panel shows times in milliseconds and the bottom panel shows relative times.

expr                      min         lq       mean     median         uq       max
rowWeightedMeans     0.529684  0.5359695  0.6477361  0.5411305  0.5466305  5.698665
apply+weigthed.mean  1.358664  1.3779810  1.4929919  1.4011190  1.4105980  6.131218

expr                      min        lq      mean    median        uq       max
rowWeightedMeans     1.000000  1.000000  1.000000  1.000000  1.000000  1.000000
apply+weigthed.mean  2.565046  2.571006  2.304939  2.589244  2.580533  1.075904

Figure: Benchmarking of colWeightedMeans() and apply+weigthed.mean() on 1000x10 data as well as rowWeightedMeans() and apply+weigthed.mean() on the same data transposed. Outliers are displayed as crosses. Times are in milliseconds.

Table: Benchmarking of colWeightedMeans() and rowWeightedMeans() on 1000x10 data (original and transposed). The top panel shows times in microseconds and the bottom panel shows relative times.

expr                  min        lq      mean    median        uq       max
colWeightedMeans  354.732  384.9520  461.4925  414.4925  443.3215  5168.358
rowWeightedMeans  529.684  535.9695  647.7361  541.1305  546.6305  5698.665

expr                   min        lq      mean    median        uq       max
colWeightedMeans  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000
rowWeightedMeans  1.493195  1.392302  1.403568  1.305525  1.233034  1.102607

Figure: Benchmarking of colWeightedMeans() and rowWeightedMeans() on 1000x10 data (original and transposed). Outliers are displayed as crosses. Times are in milliseconds.

10x1000 matrix

> X <- data[["10x1000"]]
> w <- runif(nrow(X))
> gc()
          used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 1684535 90.0    2637877 140.9  2637877 140.9
Vcells 1706993 13.1    3344261  25.6 46816319 357.2
> colStats <- microbenchmark(colWeightedMeans = colWeightedMeans(X, 
+     w = w, na.rm = FALSE), `apply+weigthed.mean` = apply(X, MARGIN = 2, 
+     FUN = weighted.mean, w = w, na.rm = FALSE), unit = "ms")
> X <- t(X)
> gc()
          used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 1684523 90.0    2637877 140.9  2637877 140.9
Vcells 1717026 13.1    3344261  25.6 46816319 357.2
> rowStats <- microbenchmark(rowWeightedMeans = rowWeightedMeans(X, 
+     w = w, na.rm = FALSE), `apply+weigthed.mean` = apply(X, MARGIN = 1, 
+     FUN = weighted.mean, w = w, na.rm = FALSE), unit = "ms")

Table: Benchmarking of colWeightedMeans() and apply+weigthed.mean() on 10x1000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

expr                       min          lq        mean      median          uq        max
colWeightedMeans      0.232424   0.2505285   0.2693935   0.2650665   0.2845045   0.336411
apply+weigthed.mean  22.826529  22.9997595  23.4901163  23.1119560  23.2810565  28.134061

expr                      min        lq      mean    median        uq       max
colWeightedMeans      1.00000   1.00000   1.00000   1.00000   1.00000   1.00000
apply+weigthed.mean  98.21072  91.80496  87.19631  87.19305  81.83019  83.63003

Table: Benchmarking of rowWeightedMeans() and apply+weigthed.mean() on 10x1000 data (transposed). The top panel shows times in milliseconds and the bottom panel shows relative times.

expr                       min         lq        mean     median         uq        max
rowWeightedMeans      0.401787   0.407845   0.4465477   0.441786   0.488857   0.537061
apply+weigthed.mean  22.916100  23.055289  23.5189257  23.129144  23.276941  28.187177

expr                      min        lq      mean    median        uq       max
rowWeightedMeans      1.00000   1.00000   1.00000   1.00000   1.00000   1.00000
apply+weigthed.mean  57.03544  56.52954  52.66834  52.35373  47.61503  52.48413

Figure: Benchmarking of colWeightedMeans() and apply+weigthed.mean() on 10x1000 data as well as rowWeightedMeans() and apply+weigthed.mean() on the same data transposed. Outliers are displayed as crosses. Times are in milliseconds.

Table: Benchmarking of colWeightedMeans() and rowWeightedMeans() on 10x1000 data (original and transposed). The top panel shows times in microseconds and the bottom panel shows relative times.

expr                  min        lq      mean    median        uq      max
colWeightedMeans  232.424  250.5285  269.3935  265.0665  284.5045  336.411
rowWeightedMeans  401.787  407.8450  446.5477  441.7860  488.8570  537.061

expr                   min        lq      mean    median        uq       max
colWeightedMeans  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000
rowWeightedMeans  1.728681  1.627938  1.657604  1.666699  1.718275  1.596443

Figure: Benchmarking of colWeightedMeans() and rowWeightedMeans() on 10x1000 data (original and transposed). Outliers are displayed as crosses. Times are in milliseconds.

100x1000 matrix

> X <- data[["100x1000"]]
> w <- runif(nrow(X))
> gc()
          used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 1684565 90.0    2637877 140.9  2637877 140.9
Vcells 1707460 13.1    3344261  25.6 46816319 357.2
> colStats <- microbenchmark(colWeightedMeans = colWeightedMeans(X, 
+     w = w, na.rm = FALSE), `apply+weigthed.mean` = apply(X, MARGIN = 2, 
+     FUN = weighted.mean, w = w, na.rm = FALSE), unit = "ms")
> X <- t(X)
> gc()
          used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 1684559 90.0    2637877 140.9  2637877 140.9
Vcells 1807503 13.8    3344261  25.6 46816319 357.2
> rowStats <- microbenchmark(rowWeightedMeans = rowWeightedMeans(X, 
+     w = w, na.rm = FALSE), `apply+weigthed.mean` = apply(X, MARGIN = 1, 
+     FUN = weighted.mean, w = w, na.rm = FALSE), unit = "ms")

Table: Benchmarking of colWeightedMeans() and apply+weigthed.mean() on 100x1000 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

expr                      min         lq       mean     median         uq        max
colWeightedMeans      1.96248   2.004568   5.188455   2.038473   2.063834  285.44672
apply+weigthed.mean  33.65308  34.192895  37.962480  40.347068  40.905260   41.85109

expr                      min        lq      mean    median        uq        max
colWeightedMeans      1.00000   1.00000  1.000000   1.00000   1.00000  1.0000000
apply+weigthed.mean  17.14824  17.05749  7.316722  19.79279  19.82003  0.1466161

Table: Benchmarking of rowWeightedMeans() and apply+weigthed.mean() on 100x1000 data (transposed). The top panel shows times in milliseconds and the bottom panel shows relative times.

expr                       min         lq       mean     median         uq        max
rowWeightedMeans      3.388423   3.405931   4.227947   3.512784   3.532478   10.84297
apply+weigthed.mean  34.055173  35.467314  41.433275  40.285399  42.163057  319.03177

expr                      min       lq      mean    median        uq       max
rowWeightedMeans      1.00000   1.0000  1.000000   1.00000   1.00000   1.00000
apply+weigthed.mean  10.05045  10.4134  9.799856  11.46822  11.93583  29.42291

Figure: Benchmarking of colWeightedMeans() and apply+weigthed.mean() on 100x1000 data as well as rowWeightedMeans() and apply+weigthed.mean() on the same data transposed. Outliers are displayed as crosses. Times are in milliseconds.

Table: Benchmarking of colWeightedMeans() and rowWeightedMeans() on 100x1000 data (original and transposed). The top panel shows times in milliseconds and the bottom panel shows relative times.

expr                   min        lq      mean    median        uq        max
colWeightedMeans  1.962480  2.004568  5.188455  2.038473  2.063834  285.44672
rowWeightedMeans  3.388423  3.405931  4.227947  3.512784  3.532478   10.84297

expr                   min        lq      mean    median        uq       max
colWeightedMeans  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000
rowWeightedMeans  1.726603  1.699085  0.814876  1.723243  1.711609  0.037986

Figure: Benchmarking of colWeightedMeans() and rowWeightedMeans() on 100x1000 data (original and transposed). Outliers are displayed as crosses. Times are in milliseconds.

1000x100 matrix

> X <- data[["1000x100"]]
> w <- runif(nrow(X))
> gc()
          used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 1684608 90.0    2637877 140.9  2637877 140.9
Vcells 1708915 13.1    3344261  25.6 46816319 357.2
> colStats <- microbenchmark(colWeightedMeans = colWeightedMeans(X, 
+     w = w, na.rm = FALSE), `apply+weigthed.mean` = apply(X, MARGIN = 2, 
+     FUN = weighted.mean, w = w, na.rm = FALSE), unit = "ms")
> X <- t(X)
> gc()
          used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 1684596 90.0    2637877 140.9  2637877 140.9
Vcells 1808948 13.9    3344261  25.6 46816319 357.2
> rowStats <- microbenchmark(rowWeightedMeans = rowWeightedMeans(X, 
+     w = w, na.rm = FALSE), `apply+weigthed.mean` = apply(X, MARGIN = 1, 
+     FUN = weighted.mean, w = w, na.rm = FALSE), unit = "ms")

Table: Benchmarking of colWeightedMeans() and apply+weigthed.mean() on 1000x100 data. The top panel shows times in milliseconds and the bottom panel shows relative times.

expr                       min         lq       mean     median        uq         max
colWeightedMeans      2.373089   2.570216   3.185409   2.648289   2.74999    7.402211
apply+weigthed.mean  13.268391  13.467519  21.270577  13.640463  18.07039  295.995929

expr                     min       lq      mean   median        uq      max
colWeightedMeans     1.00000  1.00000  1.000000  1.00000  1.000000   1.0000
apply+weigthed.mean  5.59119  5.23984  6.677503  5.15067  6.571076  39.9875

Table: Benchmarking of rowWeightedMeans() and apply+weigthed.mean() on 1000x100 data (transposed). The top panel shows times in milliseconds and the bottom panel shows relative times.

expr                       min         lq       mean     median         uq         max
rowWeightedMeans      3.516543   3.538791   4.137653   3.617999   3.643841    8.818421
apply+weigthed.mean  13.661668  13.903900  19.079959  18.181791  18.508519  296.375817

expr                      min        lq    mean    median        uq       max
rowWeightedMeans     1.000000  1.000000  1.0000  1.000000  1.000000   1.00000
apply+weigthed.mean  3.884971  3.928997  4.6113  5.025371  5.079398  33.60872

Figure: Benchmarking of colWeightedMeans() and apply+weigthed.mean() on 1000x100 data as well as rowWeightedMeans() and apply+weigthed.mean() on the same data transposed. Outliers are displayed as crosses. Times are in milliseconds.

Table: Benchmarking of colWeightedMeans() and rowWeightedMeans() on 1000x100 data (original and transposed). The top panel shows times in milliseconds and the bottom panel shows relative times.

expr                   min        lq      mean    median        uq       max
colWeightedMeans  2.373089  2.570216  3.185409  2.648289  2.749990  7.402211
rowWeightedMeans  3.516543  3.538791  4.137653  3.617999  3.643841  8.818421

expr                   min        lq      mean    median        uq       max
colWeightedMeans  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000
rowWeightedMeans  1.481842  1.376846  1.298939  1.366165  1.325038  1.191323

Figure: Benchmarking of colWeightedMeans() and rowWeightedMeans() on 1000x100 data (original and transposed). Outliers are displayed as crosses. Times are in milliseconds.

Appendix

Session information

R version 3.2.2 Patched (2015-10-26 r69575)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
[1] C

attached base packages:
[1] methods   stats     graphics  grDevices utils     datasets  base     

other attached packages:
[1] markdown_0.7.7       microbenchmark_1.4-2 matrixStats_0.15.0  
[4] ggplot2_1.0.1        knitr_1.11           R.devices_2.13.1    
[7] R.utils_2.1.0        R.oo_1.19.0          R.methodsS3_1.7.0   

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.1          plyr_1.8.3           highr_0.5.1         
 [4] base64enc_0.1-3      tools_3.2.2          digest_0.6.8        
 [7] annotate_1.48.0      RSQLite_1.0.0        gtable_0.1.2        
[10] R.cache_0.11.0       DBI_0.3.1            parallel_3.2.2      
[13] proto_0.3-10         R.rsp_0.20.0         genefilter_1.52.0   
[16] stringr_1.0.0        S4Vectors_0.8.0      IRanges_2.4.1       
[19] stats4_3.2.2         grid_3.2.2           Biobase_2.30.0      
[22] AnnotationDbi_1.32.0 survival_2.38-3      XML_3.98-1.3        
[25] reshape2_1.4.1       magrittr_1.5         splines_3.2.2       
[28] scales_0.3.0         MASS_7.3-44          BiocGenerics_0.16.0 
[31] mime_0.4             colorspace_1.2-6     xtable_1.7-4        
[34] labeling_0.3         stringi_1.0-1        munsell_0.4.2       

Total processing time was 33.86 secs.

Reproducibility

To reproduce this report, do:

html <- matrixStats:::benchmark('colWeightedMeans')

Copyright Henrik Bengtsson. Last updated on 2015-10-27 11:58:30 (-0700 UTC). Powered by RSP.
