Skip to content

ℹ️ Performance evaluation metrics for supervised and unsupervised machine learning, statistical learning and artificial intelligence applications in R.

License

Notifications You must be signed in to change notification settings

serkor1/SLmetrics

Repository files navigation

{SLmetrics}: Machine learning performance evaluation on steroids

CRAN status CRAN RStudio mirror downloads Lifecycle: experimental R-CMD-check codecov CodeFactor

{SLmetrics} is a lightweight R package written in C++ and {Rcpp} for memory-efficient and lightning-fast machine learning performance evaluation; it’s like using a supercharged {yardstick} but without the risk of soft to super-hard deprecations. {SLmetrics} covers both regression and classification metrics and provides (almost) the same array of metrics as {scikit-learn} and {PyTorch} all without {reticulate} and the Python compile-run-(crash)-debug cylce.

Depending on the mood and alignment of planets {SLmetrics} stands for Supervised Learning metrics, or Statistical Learning metrics. If {SLmetrics} catches on, the latter will be the core philosophy and include unsupervised learning metrics. If not, then it will remain a {pkg} for Supervised Learning metrics, and a sandbox for me to develop my C++ skills.

📚 Table of Contents

🚀 Gettting Started

Below you’ll find instructions to install {SLmetrics} and get started with your first metric, the Root Mean Squared Error (RMSE).

🛡️ Installation

## install stable release
devtools::install_github(
  repo = 'https://github.com/serkor1/SLmetrics@*release',
  ref  = 'main'
)

📚 Basic Usage

Below is a minimal example demonstrating how to compute both unweighted and weighted RMSE.

library(SLmetrics)

actual    <- c(10.2, 12.5, 14.1)
predicted <- c(9.8, 11.5, 14.2)
weights   <- c(0.2, 0.5, 0.3)

cat(
  "Root Mean Squared Error", rmse(
    actual    = actual,
    predicted = predicted,
  ),
  "Root Mean Squared Error (weighted)", weighted.rmse(
    actual    = actual,
    predicted = predicted,
    w         = weights
  ),
  sep = "\n"
)
#> Root Mean Squared Error
#> 0.6244998
#> Root Mean Squared Error (weighted)
#> 0.7314369

That’s all! Now you can explore the rest of this README for in-depth usage, performance comparisons, and more details about {SLmetrics}.

ℹ️ Why?

Machine learning can be a complicated task; the steps from feature engineering to model deployment require carefully measured actions and decisions. One low-hanging fruit to simplify this process is performance evaluation.

At its core, performance evaluation is essentially just comparing two vectors — a programmatically and, at times, mathematically trivial step in the machine learning pipeline, but one that can become complicated due to:

  1. Dependencies and potential deprecations
  2. Needlessly complex or repetitive arguments
  3. Performance and memory bottlenecks at scale

{SLmetrics} solves these issues by being:

  1. Fast: Powered by C++ and Rcpp
  2. Memory-efficient: Everything is structured around pointers and references
  3. Lightweight: Only depends on Rcpp, RcppEigen, and lattice
  4. Simple: S3-based, minimal overhead, and flexible inputs

Performance evaluation should be plug-and-play and “just work” out of the box — there’s no need to worry about quasiquations, dependencies, deprecations, or variations of the same functions relative to their arguments when using {SLmetrics}.

⚡ Performance Comparison

One, obviously, can’t build an R-package on C++ and {Rcpp} without a proper pissing contest at the urinals - below is a comparison in execution time and memory efficiency of two simple cases that any {pkg} should be able to handle gracefully; computing a 2 x 2 confusion matrix and computing the RMSE1.

⏩ Speed comparison

As shown in the chart, {SLmetrics} maintains consistently low(er) execution times across different sample sizes.

💾 Memory-efficiency

Below are the results for garbage collections and total memory allocations when computing a 2×2 confusion matrix (N = 1e7) and RMSE (N = 1e7). Notice that {SLmetrics} requires no GC calls for these operations.

Iterations Garbage Collections [gc()] gc() pr. second Memory Allocation (MB)
{SLmetrics} 100 0 0.00 0
{yardstick} 100 190 4.44 381
{MLmetrics} 100 186 4.50 381
{mlr3measures} 100 371 3.93 916

2 x 2 Confusion Matrix (N = 1e7)

Iterations Garbage Collections [gc()] gc() pr. second Memory Allocation (MB)
{SLmetrics} 100 0 0.00 0
{yardstick} 100 149 4.30 420
{MLmetrics} 100 15 2.00 76
{mlr3measures} 100 12 1.29 76

RMSE (N = 1e7)

In both tasks, {SLmetrics} remains extremely memory-efficient, even at large sample sizes.

Important

From {bench} documentation: Total amount of memory allocated by R while running the expression. Memory allocated outside the R heap, e.g. by malloc() or new directly is not tracked, take care to avoid misinterpreting the results if running code that may do this.

ℹ️ Basic usage

In its simplest form, {SLmetrics}-functions work directly with pairs of <numeric> vectors (for regression) or <factor> vectors (for classification). Below we demonstrate this on two well-known datasets, mtcars (regression) and iris (classification).

📚 Regression

We first fit a linear model to predict mpg in the mtcars dataset, then compute the in-sample RMSE:

# Evaluate a linear model on mpg (mtcars)
model <- lm(mpg ~ ., data = mtcars)
rmse(mtcars$mpg, fitted(model))
#> [1] 2.146905

📚 Classification

Now we recode the iris dataset into a binary problem (“virginica” vs. “others”) and fit a logistic regression. Then we generate predicted classes, compute the confusion matrix and summarize it.

# 1) recode iris
# to binary problem
iris$species_num <- as.numeric(
  iris$Species == "virginica"
)

# 2) fit the logistic
# regression
model <- glm(
  formula = species_num ~ Sepal.Length + Sepal.Width,
  data    = iris,
  family  = binomial(
    link = "logit"
  )
)

# 3) generate predicted
# classes
predicted <- factor(
  as.numeric(
    predict(model, type = "response") > 0.5
  ),
  levels = c(1,0),
  labels = c("Virginica", "Others")
)

# 4) generate actual
# values as factor
actual <- factor(
  x = iris$species_num,
  levels = c(1,0),
  labels = c("Virginica", "Others")
)
# 4) generate
# confusion matrix
summary(
  confusion_matrix <-  cmatrix(
    actual    = actual,
    predicted = predicted
  )
)
#> Confusion Matrix (2 x 2) 
#> ================================================================================
#>           Virginica Others
#> Virginica        35     15
#> Others           14     86
#> ================================================================================
#> Overall Statistics (micro average)
#>  - Accuracy:          0.81
#>  - Balanced Accuracy: 0.78
#>  - Sensitivity:       0.81
#>  - Specificity:       0.81
#>  - Precision:         0.81

ℹ️ Enable OpenMP

Important

OpenMP support in {SLmetrics} is experimental. Use it with caution, as performance gains and stability may vary based on your system configuration and workload.

You can control OpenMP usage within {SLmetrics} using the setUseOpenMP function. Below are examples demonstrating how to enable and disable OpenMP:

# enable OpenMP
SLmetrics::setUseOpenMP(TRUE)
#> OpenMP usage set to: enabled

# disable OpenMP
SLmetrics::setUseOpenMP(FALSE)
#> OpenMP usage set to: disabled

To illustrate the impact of OpenMP on performance, consider the following benchmarks for calculating entropy on a 1,000,000 x 200 matrix over 100 iterations2.

📚 Entropy without OpenMP

Iterations Runtime (sec) Garbage Collections [gc()] gc() pr. second Memory Allocation (MB)
100 2.5 0 0 0

1e6 x 200 matrix without OpenMP

📚 Entropy with OpenMP

Iterations Runtime (sec) Garbage Collections [gc()] gc() pr. second Memory Allocation (MB)
100 0.64 0 0 0

1e6 x 200 matrix with OpenMP

🛡️ Stable version

## install stable release
devtools::install_github(
  repo = 'https://github.com/serkor1/SLmetrics@*release',
  ref  = 'main'
)

🛠️ Development version

## install development version
devtools::install_github(
  repo = 'https://github.com/serkor1/SLmetrics',
  ref  = 'development'
)

ℹ️ Code of Conduct

Please note that the {SLmetrics} project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

Footnotes

  1. The source code for these benchmarks is available here.

  2. The source code for these benchmarks is available here.