Documentation Overhaul 📚 (#21)

* [DOCUMENTATION] Streamlined Examples 📚
  * All examples are named according to their S3-method file. This should make it easier to navigate the repository.
  * All examples are wrapped in `cat()` and look more user-friendly.
  * All examples are streamlined and look uniform across functions.
* [DOCUMENTATION] Package-level documentation 📚
  * Added package-level documentation that describes the handling of missing values and the general idea of the package.
* [DOCUMENTATION] Added @family-tag 📚
  * All functions now have a @family tag based on whether they are supervised or unsupervised. NOTE: This has the unintended consequence of adding regression metrics to classification metrics...
* [DOCUMENTATION] Detailed `na.rm` behaviour 📚
  * The `na.rm` argument is now properly documented 📚
  * The package documentation needed escaped brackets; these have been added to retain the {pkg}-format.
* [DOCUMENTATION] Updated function documentation 📚
  * All functions now have a (short) description of their weighted versions where applicable.
  * Functions that were missing a calculation section have had it added.
  * Refactored documentation on default values. It is now of the form (default: value).
* [DOCUMENTATION] Updated performance tests 📚
  * The performance tests have been updated and now use the mean instead of the median.
  * The number of samples has been reduced, but the sample size is larger.
* Rendered README.
serkor1 · Dec 20, 2024 · 9aec082
1 parent 48d707a
Showing 133 changed files with 3,513 additions and 2,474 deletions.
diff --git a/DESCRIPTION b/DESCRIPTION
@@ -34,5 +34,5 @@ Imports:
     Rcpp
 VignetteBuilder: knitr
 Depends: 
-    R (>= 2.10)
+    R (>= 3.5)
 URL: https://serkor1.github.io/SLmetrics/
diff --git a/NAMESPACE b/NAMESPACE
@@ -8,6 +8,7 @@ S3method(baccuracy,factor)
 S3method(ccc,numeric)
 S3method(ckappa,cmatrix)
 S3method(ckappa,factor)
+S3method(cmatrix,factor)
 S3method(csi,cmatrix)
 S3method(csi,factor)
 S3method(dor,cmatrix)

diff --git a/NEWS.Rmd b/NEWS.Rmd
@@ -23,6 +23,8 @@ set.seed(1903)
 
 ## Improvements
 
+* **documentation:** The documentation has gotten some extra love, and now all functions have their formulas embedded, the details section have been freed from a general description of [factor] creation. This will make room for future expansions on the various functions where more details are required.
+
 * **weighted classification metrics:** The `cmatrix()`-function now accepts the argument `w` which is the sample weights; if passed the respective method will return the weighted metric. Below is an example using sample weights for the confusion matrix,
 
 ```{r}

diff --git a/NEWS.md b/NEWS.md
@@ -7,6 +7,12 @@
 
 ## Improvements
 
+- **documentation:** The documentation has gotten some extra love, and
+  now all functions have their formulas embedded, the details section
+  have been freed from a general description of \[factor\] creation.
+  This will make room for future expansions on the various functions
+  where more details are required.
+
 - **weighted classification metrics:** The `cmatrix()`-function now
   accepts the argument `w` which is the sample weights; if passed the
   respective method will return the weighted metric. Below is an example

diff --git a/R/RcppExports.R b/R/RcppExports.R
@@ -64,50 +64,10 @@ ckappa.cmatrix <- function(x, beta = 0.0, ...) {
     .Call(`_SLmetrics_ckappa_cmatrix`, x, beta)
 }
 
-#' Confusion Matrix
-#'
-#' @description
-#'
-#' The [cmatrix()]-function uses cross-classifying factors to build
-#' a confusion matrix of the counts at each combination of the [factor] levels.
-#' Each row of the [matrix] represents the actual [factor] levels, while each
-#' column represents the predicted [factor] levels.
-#'
-#' @usage
-#' cmatrix(
-#'   actual,
-#'   predicted,
-#'   w
-#' )
-#'
-#' @param actual A <[factor]>-vector of [length] \eqn{n}, and \eqn{k} levels.
-#' @param predicted A <[factor]>-vector of [length] \eqn{n}, and \eqn{k} levels.
-#' @param w A <[numeric]>--vector of [length] \eqn{n}. [NULL] by default. If passed it will return a weighted confusion matrix.
-#'
-#' @example man/examples/scr_confusionmatrix.R
-#' @example man/examples/scr_wconfusionmatrix.R
-#' @family classification
-#'
-#' @inherit specificity details
-#'
-#' @section Dimensions:
-#'
-#' There is no robust defensive measure against misspecififying
-#' the confusion matrix. If the arguments are correctly specified, the resulting
-#' confusion matrix is on the form:
-#'
-#' |            | A (Predicted) | B (Predicted) |
-#' | :----------|:-------------:| -------------:|
-#' | A (Actual) | Value         | Value         |
-#' | B (Actual) | Value         | Value         |
-#'
-#'
-#' @returns
-#'
-#' A named \eqn{k} x \eqn{k} <[matrix]> of [class] <cmatrix>
-#'
+#' @rdname cmatrix
+#' @method cmatrix factor
 #' @export
-cmatrix <- function(actual, predicted, w = NULL) {
+cmatrix.factor <- function(actual, predicted, w = NULL, ...) {
     .Call(`_SLmetrics_cmatrix`, actual, predicted, w)
 }
 
@@ -638,7 +598,7 @@ rsq.numeric <- function(actual, predicted, k = 0.0, ...) {
     .Call(`_SLmetrics_rsq`, actual, predicted, k)
 }
 
-#' @rdname weighted.rsq
+#' @rdname rsq
 #' @method weighted.rsq numeric
 #' @export
 weighted.rsq.numeric <- function(actual, predicted, w, k = 0.0, ...) {

diff --git a/R/S3_Accuracy.R b/R/S3_Accuracy.R
@@ -5,11 +5,18 @@
 # script start;
 
 #' Compute the \eqn{\text{accuracy}}
-#'
-#' The [accuracy()]-function computes the [accuracy](https://en.wikipedia.org/wiki/Precision_and_recall) between two
-#' vectors of predicted and observed [factor()] values.
-#'
-#'
+#' 
+#' @description
+#'
+#' The [accuracy()] function computes the [accuracy](https://en.wikipedia.org/wiki/Precision_and_recall) between two
+#' vectors of predicted and observed [factor()] values. The [weighted.accuracy()] function computes the weighted accuracy.
+#'
+#' @param actual A vector of <[factor]>- of [length] \eqn{n}, and \eqn{k} levels
+#' @param predicted A vector of <[factor]>-vector of [length] \eqn{n}, and \eqn{k} levels
+#' @param w A <[numeric]>-vector of [length] \eqn{n}. [NULL] by default
+#' @param x A confusion matrix created [cmatrix()]
+#' @param ... Arguments passed into other methods
+#' 
 #' @inherit specificity
 #'
 #' @section Calculation:
@@ -26,10 +33,10 @@
 #'
 #' A <[numeric]>-vector of [length] 1
 #'
-#' @example man/examples/scr_accuracy.R
-#'
-#' @family classification
+#' @example man/examples/scr_Accuracy.R
 #'
+#' @family Classification
+#' @family Supervised Learning
 #' @export
 accuracy <- function(...) {
   UseMethod(

diff --git a/R/S3_BalancedAccuracy.R b/R/S3_BalancedAccuracy.R
@@ -7,11 +7,11 @@
 #' Compute the \eqn{\text{balanced}} \eqn{\text{accuracy}}
 #'
 #' The [baccuracy()]-function computes the [balanced accuracy](https://neptune.ai/blog/balanced-accuracy) between two
-#' vectors of predicted and observed [factor()] values.
+#' vectors of predicted and observed [factor()] values. The [weighted.baccuracy()] function computes the weighted balanced accuracy.
 #'
 #'
-#' @inherit specificity
-#' @param adjust A [logical] value. [FALSE] by default. If [TRUE] the metric is adjusted for random change \eqn{\frac{1}{k}}
+#' @inherit accuracy
+#' @param adjust A [logical] value (default: [FALSE]). If [TRUE] the metric is adjusted for random chance \eqn{\frac{1}{k}}
 #'
 #' @section Calculation:
 #'
@@ -27,9 +27,10 @@
 #'
 #' A [numeric]-vector of [length] 1
 #'
-#' @example man/examples/scr_baccuracy.R
+#' @example man/examples/scr_BalancedAccuracy.R
 #'
-#' @family classification
+#' @family Classification
+#' @family Supervised Learning
 #'
 #' @export
 baccuracy <- function(...) {

diff --git a/R/S3_CoefficientOfDetermination.R b/R/S3_CoefficientOfDetermination.R
@@ -11,7 +11,7 @@
 #' and predicted <[numeric]> vectors. By default [rsq()] returns the unadjusted \eqn{R^2}. For adjusted \eqn{R^2} set \eqn{k = \kappa - 1}, where \eqn{\kappa} is the number of parameters.
 #'
 #' @inherit huberloss
-#' @param k A <[numeric]>-vector of [length] 1. 0 by default. If \eqn{k>0}
+#' @param k A <[numeric]>-vector of [length] 1 (default: 0). If \eqn{k>0}
 #' the function returns the adjusted \eqn{R^2}.
 #'
 #' @section Calculation:
@@ -24,7 +24,10 @@
 #'
 #' Where \eqn{\text{SSE}} is the sum of squared errors, \eqn{\text{SST}} is total sum of squared errors, \eqn{n} is the number of observations, and \eqn{k} is the number of non-constant parameters.
 #'
-#' @family regression
+#' @example man/examples/scr_CoefficientOfDetermination.R
+#' 
+#' @family Regression
+#' @family Supervised Learning
 #' @export
 rsq <- function(...) {
   UseMethod(

diff --git a/R/S3_CohensKappa.R b/R/S3_CohensKappa.R
@@ -8,22 +8,30 @@
 #'
 #' @description
 #' The [kappa()]-function computes [Cohen's \eqn{\kappa}](https://en.wikipedia.org/wiki/Cohen%27s_kappa), a statistic that measures inter-rater agreement for categorical items between
-#' two vectors of predicted and observed [factor()] values.
+#' two vectors of predicted and observed [factor()] values. The [weighted.ckappa()] function computes the weighted \eqn{\kappa}-statistic.
 #'
 #' If \eqn{\beta \neq 0} the off-diagonals of the confusion matrix are penalized with a factor of
 #' \eqn{(y_{+} - y_{i,-})^\beta}. See below for further details.
 #'
-#'
-#' @example man/examples/scr_kappa.R
-#'
 #' @inherit specificity
 #'
-#' @inheritParams specificity
-#' @param beta A <[numeric]> value of [length] 1. 0 by default. If set to a value different from zero, the off-diagonal confusion matrix will be penalized.
-#'
-#'
-#' @section Calculation
-#' @family classification
+#' @inheritParams accurracy
+#' @param beta A <[numeric]> value of [length] 1 (default: 0). If set to a value different from zero, the off-diagonal confusion matrix will be penalized.
+#'
+#' @example man/examples/scr_CohensKappa.R
+#' 
+#' @section Calculation:
+#' 
+#' \deqn{
+#'   \frac{\rho_p - \rho_e}{1-\rho_e}
+#' }
+#' 
+#' where \eqn{\rho_p} is the empirical probability of agreement between predicted and actual values, and \eqn{\rho_e} is the expected probability of agreement under random chance.
+#' 
+#' 
+#' @family Classification
+#' @family Supervised Learning
+#' 
 #' @export
 ckappa <- function(...) {
   UseMethod(
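
The calculation section added above can be checked by hand. The following is a minimal sketch in base R, assuming a hypothetical 2 x 2 confusion matrix (rows are actual, columns are predicted) and the unpenalized case; it does not call the package and is not the package's implementation.

```r
# Hypothetical confusion matrix: rows = actual, columns = predicted
cm <- matrix(
  c(20,  5,
    10, 15),
  nrow = 2,
  byrow = TRUE,
  dimnames = list(actual = c("a", "b"), predicted = c("a", "b"))
)

n <- sum(cm)

# rho_p: empirical probability of agreement (share of the diagonal)
rho_p <- sum(diag(cm)) / n

# rho_e: expected agreement under random chance, from the marginals
rho_e <- sum(rowSums(cm) * colSums(cm)) / n^2

# Cohen's kappa as in the documented formula
kappa_hat <- (rho_p - rho_e) / (1 - rho_e)
print(kappa_hat)
```

Here `rho_p` is 35/50 and `rho_e` is 1250/2500, so the statistic is 0.2/0.5.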

diff --git a/R/S3_ConcordanceCorrelationCoefficient.R b/R/S3_ConcordanceCorrelationCoefficient.R
@@ -8,14 +8,15 @@
 #'
 #' @description
 #' The [ccc()]-function computes the simple and weighted [concordance correlation coefficient](https://en.wikipedia.org/wiki/Concordance_correlation_coefficient) between
-#' the two vectors of predicted and observed <[numeric]> values. If `w` is not [NULL], the function returns the weighted [concordance correlation coefficient](https://en.wikipedia.org/wiki/Concordance_correlation_coefficient).
-#'
+#' the two vectors of predicted and observed <[numeric]> values.  The [weighted.ccc()] function computes the weighted Concordance Correlation Coefficient. 
 #' If `correction` is [TRUE] \eqn{\sigma^2} is adjusted by \eqn{\frac{1-n}{n}} in the intermediate steps.
+#' 
 #' @inherit huberloss
-#' @param correction A <[logical]> vector of [length] 1. [FALSE] by default. If [TRUE] the variance and covariance
+#' 
+#' @param correction A <[logical]> vector of [length] \eqn{1} (default: [FALSE]). If [TRUE] the variance and covariance
 #' will be adjusted with \eqn{\frac{1-n}{n}}
 #'
-#' @example man/examples/scr_ccc.R
+#' @example man/examples/scr_ConcordanceCorrelationCoefficient.R
 #'
 #' @section Calculation:
 #'
@@ -26,10 +27,10 @@
 #' }
 #'
 #' Where \eqn{\rho} is the \eqn{\text{pearson correlation coefficient}}, \eqn{\sigma} is the \eqn{\text{standard deviation}} and \eqn{\mu} is the simple mean of `actual` and `predicted`.
+#' 
 #'
-#' If `w` is not [NULL], all calculations are based on the weighted measures.
-#'
-#' @family regression
+#' @family Regression
+#' @family Supervised Learning
 #' @export
 ccc <- function(...) {
   UseMethod(

diff --git a/R/S3_ConfusionMatrix.R b/R/S3_ConfusionMatrix.R
@@ -5,6 +5,52 @@
 # objective:
 # script start;
 
+#' Confusion Matrix
+#'
+#' @description
+#'
+#' The [cmatrix()]-function uses cross-classifying factors to build
+#' a confusion matrix of the counts at each combination of the [factor] levels.
+#' Each row of the [matrix] represents the actual [factor] levels, while each
+#' column represents the predicted [factor] levels.
+#'
+#' @param actual A <[factor]>-vector of [length] \eqn{n}, and \eqn{k} levels.
+#' @param predicted A <[factor]>-vector of [length] \eqn{n}, and \eqn{k} levels.
+#' @param w A <[numeric]>-vector of [length] \eqn{n} (default: [NULL]) If passed it will return a weighted confusion matrix.
+#' @param ... Arguments passed into other methods.
+#'
+#' @example man/examples/scr_ConfusionMatrix.R
+#' 
+#' @family Classification
+#' @family Supervised Learning
+#'
+#' @inherit specificity details
+#'
+#' @section Dimensions:
+#'
+#' There is no robust defensive measure against misspecififying
+#' the confusion matrix. If the arguments are correctly specified, the resulting
+#' confusion matrix is on the form:
+#'
+#' |            | A (Predicted) | B (Predicted) |
+#' | :----------|:-------------:| -------------:|
+#' | A (Actual) | Value         | Value         |
+#' | B (Actual) | Value         | Value         |
+#'
+#'
+#' @returns
+#'
+#' A named \eqn{k} x \eqn{k} <[matrix]> of [class] <cmatrix>
+#'
+#' @export
+cmatrix <- function(...) {
+  UseMethod(
+    generic = "cmatrix",
+    object  = ..1 
+  )
+}
+
+
 #' @export
 print.cmatrix <- function(
     x,
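
The new generic dispatches on the class of its first argument via `..1`. The sketch below reproduces that pattern with hypothetical names (`toy_cmatrix`, `toy_cmatrix.factor`) and a plain R loop standing in for the compiled routine, so it only illustrates the dispatch and the role of `w`; it assumes both factors share the same levels.

```r
# Hypothetical generic mirroring the dispatch pattern in the diff
toy_cmatrix <- function(...) {
  UseMethod(generic = "toy_cmatrix", object = ..1)
}

# Hypothetical factor method; a stand-in for the compiled implementation
toy_cmatrix.factor <- function(actual, predicted, w = NULL, ...) {
  # Without weights every observation counts once
  if (is.null(w)) w <- rep(1, length(actual))

  out <- matrix(
    0,
    nrow = nlevels(actual),
    ncol = nlevels(predicted),
    dimnames = list(levels(actual), levels(predicted))
  )

  # Accumulate the weight of each observation in its (actual, predicted) cell
  for (i in seq_along(actual)) {
    r <- as.integer(actual[i])
    c <- as.integer(predicted[i])
    out[r, c] <- out[r, c] + w[i]
  }

  out
}

actual    <- factor(c("a", "b", "a", "b"), levels = c("a", "b"))
predicted <- factor(c("a", "b", "b", "b"), levels = c("a", "b"))

unweighted <- toy_cmatrix(actual, predicted)
weighted   <- toy_cmatrix(actual, predicted, w = c(1, 2, 3, 4))
print(unweighted)
print(weighted)
```

Because the generic takes only `...`, each method is free to define its own named arguments, which is presumably why `cmatrix.factor` gained a trailing `...` in this commit.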

diff --git a/R/S3_FBetaScore.R b/R/S3_FBetaScore.R
@@ -10,35 +10,26 @@
 #'
 #' @description
 #' The [fbeta()]-function computes the [\eqn{F_\beta} score](https://en.wikipedia.org/wiki/F1_score), the weighted harmonic mean of [precision()] and [recall()], between
-#' two vectors of predicted and observed [factor()] values. The parameter \eqn{\beta} determines the weight of precision and recall in the combined score.
+#' two vectors of predicted and observed [factor()] values. The parameter \eqn{\beta} determines the weight of precision and recall in the combined score. The [weighted.fbeta()] function computes the weighted \eqn{F_\beta} score.
 #'
-#' When `aggregate = TRUE`, the function returns the micro-average \eqn{F_\beta} score across all classes \eqn{k}. By default, it returns the class-wise \eqn{F_\beta} score.
-#'
-#'
-#' @example man/examples/scr_fbeta.R
+#' @example man/examples/scr_FBetaScore.R
 #'
 #' @inherit specificity
-#' @param beta A <[numeric]> vector of length 1. 1 by default, see calculations.
+#' @param beta A <[numeric]> vector of [length] \eqn{1} (default: \eqn{1}).
 #'
 #' @section Calculation:
 #'
 #' The metric is calculated for each class \eqn{k} as follows,
 #'
-#'
 #' \deqn{
 #'   (1 + \beta^2) \frac{\text{Precision}_k \cdot \text{Recall}_k}{(\beta^2 \cdot \text{Precision}_k) + \text{Recall}_k}
 #' }
 #'
 #' Where precision is \eqn{\frac{\#TP_k}{\#TP_k + \#FP_k}} and recall (sensitivity) is \eqn{\frac{\#TP_k}{\#TP_k + \#FN_k}}, and \eqn{\beta} determines the weight of precision relative to recall.
-#'
-#' When `aggregate = TRUE`, the `micro`-average \eqn{F_\beta} score is calculated,
-#'
-#' \deqn{
-#'   (1 + \beta^2) \frac{\sum_{k=1}^K \text{Precision}_k \cdot \sum_{k=1}^K \text{Recall}_k}{(\beta^2 \cdot \sum_{k=1}^K \text{Precision}_k) + \sum_{k=1}^K \text{Recall}_k}
-#' }
-#'
-#'
-#' @family classification
+#' 
+#' @family Classification
+#' @family Supervised Learning
+#' 
 #' @export
 fbeta <- function(...) {
   UseMethod(
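
The class-wise formula retained in the calculation section can be evaluated directly. This is a minimal sketch in base R with hypothetical per-class counts (`tp`, `fp`, `fn`); it does not use the package's `fbeta()` and makes no claim about how the package computes the score internally.

```r
# Hypothetical per-class counts for two classes; not taken from the package
tp <- c(a = 8, b = 6)
fp <- c(a = 2, b = 4)
fn <- c(a = 8, b = 2)

beta <- 1

# Class-wise precision and recall as defined in the documentation
precision <- tp / (tp + fp)
recall    <- tp / (tp + fn)

# Class-wise F-beta score
fbeta_k <- (1 + beta^2) * (precision * recall) / (beta^2 * precision + recall)
print(fbeta_k)
```

With `beta = 1` this reduces to the harmonic mean of precision and recall for each class.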

diff --git a/R/S3_FalseDiscoveryRate.R b/R/S3_FalseDiscoveryRate.R
@@ -7,12 +7,11 @@
 #' Compute the \eqn{\text{false}} \eqn{\text{discovery}} \eqn{\text{rate}}
 #'
 #' @description
+#' 
 #' The [fdr()]-function computes the [false discovery rate](https://en.wikipedia.org/wiki/False_discovery_rate) (FDR), the proportion of false positives among the predicted positives, between
-#' two vectors of predicted and observed [factor()] values.
+#' two vectors of predicted and observed [factor()] values. The [weighted.fdr()] function computes the weighted false discovery rate.
 #'
-#' When `aggregate = TRUE`, the function returns the micro-average FDR across all classes \eqn{k}. By default, it returns the class-wise FDR.
-#'
-#' @example man/examples/scr_fdr.R
+#' @example man/examples/scr_FalseDiscoveryRate.R
 #'
 #' @inherit specificity
 #'
@@ -25,14 +24,10 @@
 #' }
 #'
 #' Where \eqn{\#TP_k} and \eqn{\#FP_k} is the number of true psotives and false positives, respectively, for each class \eqn{k}.
-#'
-#' When `aggregate = TRUE` the `micro`-average is calculated,
-#'
-#' \deqn{
-#'  \frac{\sum_{k=1}^k \#FP_k}{\sum_{k=1}^k \#TP_k + \sum_{k=1}^k \#FP_k}
-#' }
-#'
-#' @family classification
+#' 
+#' @family Classification
+#' @family Supervised Learning
+#' 
 #' @export
 fdr <- function(...) {
   UseMethod(