Documentation Overhaul 📚 (#21)
* [DOCUMENTATION] Streamlined Examples 📚

* All examples are named according to their S3-method file. This should make it easier to navigate through the repository.

* All examples are wrapped in `cat()` and look more user-friendly.

* All examples are streamlined and look uniform across functions.

* [DOCUMENTATION] Package-level documentation 📚

* Added package-level documentation that describes the handling of missing values and the general idea of the package.

* [DOCUMENTATION] Added @family-tag 📚

* All functions now have a @family tag based on whether they are supervised or unsupervised.

NOTE: This has the unintended consequence of adding regression metrics to classification metrics...

* [DOCUMENTATION] Detailed `na.rm` behaviour 📚

* The `na.rm` argument is now properly documented 📚

* The package documentation needed escaped brackets; these have been added to retain the {pkg}-format.

* [DOCUMENTATION] Updated function documentations 📚

* All functions now have a (short) description of their weighted versions where applicable.

* Functions that were missing a calculation section have had it added.

* Refactored documentation on default values. It is now of the form (default: value).

* [DOCUMENTATION] Updated performance tests 📚

* The performance tests have been updated and now use the mean instead of the median.

* The number of samples has been reduced, but the sample size is larger.

* Rendered README.
serkor1 authored Dec 20, 2024
1 parent 48d707a commit 9aec082
Showing 133 changed files with 3,513 additions and 2,474 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -34,5 +34,5 @@ Imports:
Rcpp
VignetteBuilder: knitr
Depends:
R (>= 2.10)
R (>= 3.5)
URL: https://serkor1.github.io/SLmetrics/
1 change: 1 addition & 0 deletions NAMESPACE
@@ -8,6 +8,7 @@ S3method(baccuracy,factor)
S3method(ccc,numeric)
S3method(ckappa,cmatrix)
S3method(ckappa,factor)
S3method(cmatrix,factor)
S3method(csi,cmatrix)
S3method(csi,factor)
S3method(dor,cmatrix)
2 changes: 2 additions & 0 deletions NEWS.Rmd
@@ -23,6 +23,8 @@ set.seed(1903)
## Improvements

* **documentation:** The documentation has gotten some extra love: all functions now have their formulas embedded, and the details section has been freed from a general description of [factor] creation. This makes room for future expansions of the functions where more details are required.

* **weighted classification metrics:** The `cmatrix()`-function now accepts the argument `w` which is the sample weights; if passed the respective method will return the weighted metric. Below is an example using sample weights for the confusion matrix,

```{r}
6 changes: 6 additions & 0 deletions NEWS.md
@@ -7,6 +7,12 @@
## Improvements

- **documentation:** The documentation has gotten some extra love: all
functions now have their formulas embedded, and the details section
has been freed from a general description of \[factor\] creation.
This makes room for future expansions of the functions where more
details are required.

- **weighted classification metrics:** The `cmatrix()`-function now
accepts the argument `w` which is the sample weights; if passed the
respective method will return the weighted metric. Below is an example
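The NEWS entry above describes the new `w` argument of `cmatrix()`. As a rough illustration of what a weighted confusion matrix means, here is a minimal base R sketch; it does not call SLmetrics, and the toy data and weights are made up for the example.

```r
# Toy data; the labels and weights are invented for illustration.
actual    <- factor(c("a", "a", "b", "b", "b"), levels = c("a", "b"))
predicted <- factor(c("a", "b", "b", "b", "a"), levels = c("a", "b"))
w         <- c(1, 2, 1, 0.5, 3)

# Unweighted: every observation adds 1 to its (actual, predicted) cell.
unweighted <- table(actual, predicted)

# Weighted: every observation adds its weight to its cell instead.
weighted <- xtabs(w ~ actual + predicted)

print(unweighted)
print(weighted)
```

The weighted cells sum to `sum(w)` rather than to the number of observations, which is why metrics derived from the weighted matrix differ from their unweighted counterparts.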
48 changes: 4 additions & 44 deletions R/RcppExports.R
@@ -64,50 +64,10 @@ ckappa.cmatrix <- function(x, beta = 0.0, ...) {
.Call(`_SLmetrics_ckappa_cmatrix`, x, beta)
}

#' Confusion Matrix
#'
#' @description
#'
#' The [cmatrix()]-function uses cross-classifying factors to build
#' a confusion matrix of the counts at each combination of the [factor] levels.
#' Each row of the [matrix] represents the actual [factor] levels, while each
#' column represents the predicted [factor] levels.
#'
#' @usage
#' cmatrix(
#' actual,
#' predicted,
#' w
#' )
#'
#' @param actual A <[factor]>-vector of [length] \eqn{n}, and \eqn{k} levels.
#' @param predicted A <[factor]>-vector of [length] \eqn{n}, and \eqn{k} levels.
#' @param w A <[numeric]>--vector of [length] \eqn{n}. [NULL] by default. If passed it will return a weighted confusion matrix.
#'
#' @example man/examples/scr_confusionmatrix.R
#' @example man/examples/scr_wconfusionmatrix.R
#' @family classification
#'
#' @inherit specificity details
#'
#' @section Dimensions:
#'
#' There is no robust defensive measure against misspecififying
#' the confusion matrix. If the arguments are correctly specified, the resulting
#' confusion matrix is on the form:
#'
#' | | A (Predicted) | B (Predicted) |
#' | :----------|:-------------:| -------------:|
#' | A (Actual) | Value | Value |
#' | B (Actual) | Value | Value |
#'
#'
#' @returns
#'
#' A named \eqn{k} x \eqn{k} <[matrix]> of [class] <cmatrix>
#'
#' @rdname cmatrix
#' @method cmatrix factor
#' @export
cmatrix <- function(actual, predicted, w = NULL) {
cmatrix.factor <- function(actual, predicted, w = NULL, ...) {
.Call(`_SLmetrics_cmatrix`, actual, predicted, w)
}

@@ -638,7 +598,7 @@ rsq.numeric <- function(actual, predicted, k = 0.0, ...) {
.Call(`_SLmetrics_rsq`, actual, predicted, k)
}

#' @rdname weighted.rsq
#' @rdname rsq
#' @method weighted.rsq numeric
#' @export
weighted.rsq.numeric <- function(actual, predicted, w, k = 0.0, ...) {
23 changes: 15 additions & 8 deletions R/S3_Accuracy.R
@@ -5,11 +5,18 @@
# script start;

#' Compute the \eqn{\text{accuracy}}
#'
#' The [accuracy()]-function computes the [accuracy](https://en.wikipedia.org/wiki/Precision_and_recall) between two
#' vectors of predicted and observed [factor()] values.
#'
#'
#'
#' @description
#'
#' The [accuracy()] function computes the [accuracy](https://en.wikipedia.org/wiki/Precision_and_recall) between two
#' vectors of predicted and observed [factor()] values. The [weighted.accuracy()] function computes the weighted accuracy.
#'
#' @param actual A <[factor]>-vector of [length] \eqn{n}, and \eqn{k} levels
#' @param predicted A <[factor]>-vector of [length] \eqn{n}, and \eqn{k} levels
#' @param w A <[numeric]>-vector of [length] \eqn{n}. [NULL] by default
#' @param x A confusion matrix created by [cmatrix()]
#' @param ... Arguments passed into other methods
#'
#' @inherit specificity
#'
#' @section Calculation:
@@ -26,10 +33,10 @@
#'
#' A <[numeric]>-vector of [length] 1
#'
#' @example man/examples/scr_accuracy.R
#'
#' @family classification
#' @example man/examples/scr_Accuracy.R
#'
#' @family Classification
#' @family Supervised Learning
#' @export
accuracy <- function(...) {
UseMethod(
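The documentation in this file distinguishes `accuracy()` from `weighted.accuracy()`. A minimal base R sketch of the two quantities follows, assuming the weighted version is the weight share of correctly classified observations; the data are made up and SLmetrics itself is not called.

```r
# Toy data; the labels and weights are invented for illustration.
actual    <- factor(c("a", "a", "b", "b", "b"), levels = c("a", "b"))
predicted <- factor(c("a", "b", "b", "b", "a"), levels = c("a", "b"))
w         <- c(1, 2, 1, 0.5, 3)

# TRUE where the prediction matches the observed class.
hit <- actual == predicted

# Unweighted accuracy: share of correct predictions.
acc <- mean(hit)

# Weighted accuracy (assumed definition): weight share of correct predictions.
wacc <- sum(w * hit) / sum(w)

print(c(accuracy = acc, weighted = wacc))
```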
11 changes: 6 additions & 5 deletions R/S3_BalancedAccuracy.R
@@ -7,11 +7,11 @@
#' Compute the \eqn{\text{balanced}} \eqn{\text{accuracy}}
#'
#' The [baccuracy()]-function computes the [balanced accuracy](https://neptune.ai/blog/balanced-accuracy) between two
#' vectors of predicted and observed [factor()] values.
#' vectors of predicted and observed [factor()] values. The [weighted.baccuracy()] function computes the weighted balanced accuracy.
#'
#'
#' @inherit specificity
#' @param adjust A [logical] value. [FALSE] by default. If [TRUE] the metric is adjusted for random change \eqn{\frac{1}{k}}
#' @inherit accuracy
#' @param adjust A [logical] value (default: [FALSE]). If [TRUE] the metric is adjusted for random chance \eqn{\frac{1}{k}}
#'
#' @section Calculation:
#'
@@ -27,9 +27,10 @@
#'
#' A [numeric]-vector of [length] 1
#'
#' @example man/examples/scr_baccuracy.R
#' @example man/examples/scr_BalancedAccuracy.R
#'
#' @family classification
#' @family Classification
#' @family Supervised Learning
#'
#' @export
baccuracy <- function(...) {
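The `adjust` parameter documented above corrects balanced accuracy for random chance \eqn{\frac{1}{k}}. The sketch below shows one common way to do this in base R. The rescaling `(score - 1/k) / (1 - 1/k)` is an assumption borrowed from the usual chance-adjusted convention; the hunk does not show the package's own formula.

```r
# Toy data, invented for illustration.
actual    <- factor(c("a", "a", "b", "b", "b"), levels = c("a", "b"))
predicted <- factor(c("a", "b", "b", "b", "a"), levels = c("a", "b"))

# Rows are actual classes, columns are predicted classes.
cm <- table(actual, predicted)

# Class-wise recall, then its simple average across the k classes.
recall <- diag(cm) / rowSums(cm)
bacc   <- mean(recall)

# Chance adjustment (assumed form): 0 at chance level, 1 for a perfect classifier.
k        <- nlevels(actual)
bacc_adj <- (bacc - 1 / k) / (1 - 1 / k)

print(c(balanced = bacc, adjusted = bacc_adj))
```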
7 changes: 5 additions & 2 deletions R/S3_CoefficientOfDetermination.R
@@ -11,7 +11,7 @@
#' and predicted <[numeric]> vectors. By default [rsq()] returns the unadjusted \eqn{R^2}. For adjusted \eqn{R^2} set \eqn{k = \kappa - 1}, where \eqn{\kappa} is the number of parameters.
#'
#' @inherit huberloss
#' @param k A <[numeric]>-vector of [length] 1. 0 by default. If \eqn{k>0}
#' @param k A <[numeric]>-vector of [length] 1 (default: 0). If \eqn{k>0}
#' the function returns the adjusted \eqn{R^2}.
#'
#' @section Calculation:
@@ -24,7 +24,10 @@
#'
#' Where \eqn{\text{SSE}} is the sum of squared errors, \eqn{\text{SST}} is the total sum of squares, \eqn{n} is the number of observations, and \eqn{k} is the number of non-constant parameters.
#'
#' @family regression
#' @example man/examples/scr_CoefficientOfDetermination.R
#'
#' @family Regression
#' @family Supervised Learning
#' @export
rsq <- function(...) {
UseMethod(
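The role of `k` in `rsq()` can be checked against `lm()`. The helper `rsq_sketch()` below is hypothetical and uses the textbook adjusted \eqn{R^2} formula; the hunk does not show the package's formula, so this is an assumption about how `k` enters.

```r
# Hypothetical helper (not part of SLmetrics): textbook (adjusted) R^2.
rsq_sketch <- function(actual, predicted, k = 0) {
  n   <- length(actual)
  sse <- sum((actual - predicted)^2)     # sum of squared errors
  sst <- sum((actual - mean(actual))^2)  # total sum of squares
  # With k = 0 the correction factor is 1 and this is the plain R^2.
  1 - (sse / sst) * ((n - 1) / (n - (k + 1)))
}

# Two non-constant parameters (wt and hp), so k = 2 for the adjusted version.
fit  <- lm(mpg ~ wt + hp, data = mtcars)
r2   <- rsq_sketch(mtcars$mpg, fitted(fit))
r2_a <- rsq_sketch(mtcars$mpg, fitted(fit), k = 2)

print(c(unadjusted = r2, adjusted = r2_a))
```

Both values should agree with `summary(fit)$r.squared` and `summary(fit)$adj.r.squared`.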
28 changes: 18 additions & 10 deletions R/S3_CohensKappa.R
@@ -8,22 +8,30 @@
#'
#' @description
#' The [ckappa()]-function computes [Cohen's \eqn{\kappa}](https://en.wikipedia.org/wiki/Cohen%27s_kappa), a statistic that measures inter-rater agreement for categorical items between
#' two vectors of predicted and observed [factor()] values.
#' two vectors of predicted and observed [factor()] values. The [weighted.ckappa()] function computes the weighted \eqn{\kappa}-statistic.
#'
#' If \eqn{\beta \neq 0} the off-diagonals of the confusion matrix are penalized with a factor of
#' \eqn{(y_{+} - y_{i,-})^\beta}. See below for further details.
#'
#'
#' @example man/examples/scr_kappa.R
#'
#' @inherit specificity
#'
#' @inheritParams specificity
#' @param beta A <[numeric]> value of [length] 1. 0 by default. If set to a value different from zero, the off-diagonal confusion matrix will be penalized.
#'
#'
#' @section Calculation
#' @family classification
#' @inheritParams accuracy
#' @param beta A <[numeric]> value of [length] 1 (default: 0). If set to a value different from zero, the off-diagonal confusion matrix will be penalized.
#'
#' @example man/examples/scr_CohensKappa.R
#'
#' @section Calculation:
#'
#' \deqn{
#' \frac{\rho_p - \rho_e}{1-\rho_e}
#' }
#'
#' where \eqn{\rho_p} is the empirical probability of agreement between predicted and actual values, and \eqn{\rho_e} is the expected probability of agreement under random chance.
#'
#'
#' @family Classification
#' @family Supervised Learning
#'
#' @export
ckappa <- function(...) {
UseMethod(
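The calculation section added above defines \eqn{\kappa} through \eqn{\rho_p} and \eqn{\rho_e}. A minimal base R sketch of the unpenalized case (\eqn{\beta = 0}) follows; the data are made up, and the penalized variant is not covered.

```r
# Toy data, invented for illustration.
actual    <- factor(c("a", "a", "b", "b", "b"), levels = c("a", "b"))
predicted <- factor(c("a", "b", "b", "b", "a"), levels = c("a", "b"))

cm <- table(actual, predicted)
n  <- sum(cm)

# Empirical probability of agreement: share of observations on the diagonal.
rho_p <- sum(diag(cm)) / n

# Expected agreement under chance: product of the marginal shares, summed over classes.
rho_e <- sum(rowSums(cm) * colSums(cm)) / n^2

kappa_hat <- (rho_p - rho_e) / (1 - rho_e)
print(kappa_hat)
```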
15 changes: 8 additions & 7 deletions R/S3_ConcordanceCorrelationCoefficient.R
@@ -8,14 +8,15 @@
#'
#' @description
#' The [ccc()]-function computes the simple and weighted [concordance correlation coefficient](https://en.wikipedia.org/wiki/Concordance_correlation_coefficient) between
#' the two vectors of predicted and observed <[numeric]> values. If `w` is not [NULL], the function returns the weighted [concordance correlation coefficient](https://en.wikipedia.org/wiki/Concordance_correlation_coefficient).
#'
#' the two vectors of predicted and observed <[numeric]> values. The [weighted.ccc()] function computes the weighted Concordance Correlation Coefficient.
#' If `correction` is [TRUE] \eqn{\sigma^2} is adjusted by \eqn{\frac{n-1}{n}} in the intermediate steps.
#'
#' @inherit huberloss
#' @param correction A <[logical]> vector of [length] 1. [FALSE] by default. If [TRUE] the variance and covariance
#'
#' @param correction A <[logical]> vector of [length] \eqn{1} (default: [FALSE]). If [TRUE] the variance and covariance
#' will be adjusted with \eqn{\frac{n-1}{n}}
#'
#' @example man/examples/scr_ccc.R
#' @example man/examples/scr_ConcordanceCorrelationCoefficient.R
#'
#' @section Calculation:
#'
@@ -26,10 +27,10 @@
#' }
#'
#' Where \eqn{\rho} is the \eqn{\text{pearson correlation coefficient}}, \eqn{\sigma} is the \eqn{\text{standard deviation}} and \eqn{\mu} is the simple mean of `actual` and `predicted`.
#'
#'
#' If `w` is not [NULL], all calculations are based on the weighted measures.
#'
#' @family regression
#' @family Regression
#' @family Supervised Learning
#' @export
ccc <- function(...) {
UseMethod(
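The concordance correlation coefficient described above combines correlation, spread and location. The helper `ccc_sketch()` below is hypothetical and follows Lin's standard definition. It assumes that `correction = TRUE` rescales the sample variance and covariance by \eqn{\frac{n-1}{n}}, the usual factor that turns sample moments into population moments.

```r
# Hypothetical helper (not part of SLmetrics): Lin's concordance correlation coefficient.
ccc_sketch <- function(actual, predicted, correction = FALSE) {
  n   <- length(actual)
  adj <- if (correction) (n - 1) / n else 1  # assumed form of the correction

  var_a  <- var(actual) * adj
  var_p  <- var(predicted) * adj
  cov_ap <- cov(actual, predicted) * adj     # equals rho * sigma_a * sigma_p

  2 * cov_ap / (var_a + var_p + (mean(actual) - mean(predicted))^2)
}

x <- c(1, 2, 3, 4, 5)
print(ccc_sketch(x, x))      # perfect concordance
print(ccc_sketch(x, x + 1))  # perfectly correlated, but shifted
```

A constant shift leaves the Pearson correlation at 1 but lowers the concordance, which is the point of the metric.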
46 changes: 46 additions & 0 deletions R/S3_ConfusionMatrix.R
@@ -5,6 +5,52 @@
# objective:
# script start;

#' Confusion Matrix
#'
#' @description
#'
#' The [cmatrix()]-function uses cross-classifying factors to build
#' a confusion matrix of the counts at each combination of the [factor] levels.
#' Each row of the [matrix] represents the actual [factor] levels, while each
#' column represents the predicted [factor] levels.
#'
#' @param actual A <[factor]>-vector of [length] \eqn{n}, and \eqn{k} levels.
#' @param predicted A <[factor]>-vector of [length] \eqn{n}, and \eqn{k} levels.
#' @param w A <[numeric]>-vector of [length] \eqn{n} (default: [NULL]). If passed it will return a weighted confusion matrix.
#' @param ... Arguments passed into other methods.
#'
#' @example man/examples/scr_ConfusionMatrix.R
#'
#' @family Classification
#' @family Supervised Learning
#'
#' @inherit specificity details
#'
#' @section Dimensions:
#'
#' There is no robust defensive measure against misspecifying
#' the confusion matrix. If the arguments are correctly specified, the resulting
#' confusion matrix is of the form:
#'
#' | | A (Predicted) | B (Predicted) |
#' | :----------|:-------------:| -------------:|
#' | A (Actual) | Value | Value |
#' | B (Actual) | Value | Value |
#'
#'
#' @returns
#'
#' A named \eqn{k} x \eqn{k} <[matrix]> of [class] <cmatrix>
#'
#' @export
cmatrix <- function(...) {
UseMethod(
generic = "cmatrix",
object = ..1
)
}


#' @export
print.cmatrix <- function(
x,
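This commit turns `cmatrix()` into an S3 generic that dispatches on the first element of `...` via `..1`. The sketch below reproduces that pattern with a hypothetical generic, `toy_metric()`, and two hypothetical methods, to show how one entry point can serve both raw factors and a precomputed table.

```r
# Hypothetical generic mirroring the dispatch pattern used by cmatrix() above.
toy_metric <- function(...) {
  UseMethod(
    generic = "toy_metric",
    object  = ..1  # dispatch on the class of the first argument in `...`
  )
}

# Method for raw factors: the arguments are matched to the method's own formals.
toy_metric.factor <- function(actual, predicted, ...) {
  mean(actual == predicted)
}

# Method for a precomputed contingency table.
toy_metric.table <- function(x, ...) {
  sum(diag(x)) / sum(x)
}

actual    <- factor(c("a", "a", "b", "b", "b"), levels = c("a", "b"))
predicted <- factor(c("a", "b", "b", "b", "a"), levels = c("a", "b"))

from_factors <- toy_metric(actual, predicted)
from_table   <- toy_metric(table(actual, predicted))
print(c(from_factors, from_table))
```

Using `...` in the generic keeps its signature neutral, so each method can name its own arguments (`actual`/`predicted` versus `x`).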
23 changes: 7 additions & 16 deletions R/S3_FBetaScore.R
@@ -10,35 +10,26 @@
#'
#' @description
#' The [fbeta()]-function computes the [\eqn{F_\beta} score](https://en.wikipedia.org/wiki/F1_score), the weighted harmonic mean of [precision()] and [recall()], between
#' two vectors of predicted and observed [factor()] values. The parameter \eqn{\beta} determines the weight of precision and recall in the combined score.
#' two vectors of predicted and observed [factor()] values. The parameter \eqn{\beta} determines the weight of precision and recall in the combined score. The [weighted.fbeta()] function computes the weighted \eqn{F_\beta} score.
#'
#' When `aggregate = TRUE`, the function returns the micro-average \eqn{F_\beta} score across all classes \eqn{k}. By default, it returns the class-wise \eqn{F_\beta} score.
#'
#'
#' @example man/examples/scr_fbeta.R
#' @example man/examples/scr_FBetaScore.R
#'
#' @inherit specificity
#' @param beta A <[numeric]> vector of length 1. 1 by default, see calculations.
#' @param beta A <[numeric]> vector of [length] \eqn{1} (default: \eqn{1}).
#'
#' @section Calculation:
#'
#' The metric is calculated for each class \eqn{k} as follows,
#'
#'
#' \deqn{
#' (1 + \beta^2) \frac{\text{Precision}_k \cdot \text{Recall}_k}{(\beta^2 \cdot \text{Precision}_k) + \text{Recall}_k}
#' }
#'
#' Where precision is \eqn{\frac{\#TP_k}{\#TP_k + \#FP_k}} and recall (sensitivity) is \eqn{\frac{\#TP_k}{\#TP_k + \#FN_k}}, and \eqn{\beta} determines the weight of precision relative to recall.
#'
#' When `aggregate = TRUE`, the `micro`-average \eqn{F_\beta} score is calculated,
#'
#' \deqn{
#' (1 + \beta^2) \frac{\sum_{k=1}^K \text{Precision}_k \cdot \sum_{k=1}^K \text{Recall}_k}{(\beta^2 \cdot \sum_{k=1}^K \text{Precision}_k) + \sum_{k=1}^K \text{Recall}_k}
#' }
#'
#'
#' @family classification
#'
#' @family Classification
#' @family Supervised Learning
#'
#' @export
fbeta <- function(...) {
UseMethod(
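The class-wise \eqn{F_\beta} formula in the calculation section above can be traced by hand from a confusion matrix. This is a minimal base R sketch with made-up data; `fbeta_sketch()` is a hypothetical helper, not the package function.

```r
# Toy data, invented for illustration.
actual    <- factor(c("a", "a", "b", "b", "b"), levels = c("a", "b"))
predicted <- factor(c("a", "b", "b", "b", "a"), levels = c("a", "b"))

# Rows are actual classes, columns are predicted classes.
cm <- table(actual, predicted)
tp <- diag(cm)

precision <- tp / colSums(cm)  # TP / (TP + FP), per class
recall    <- tp / rowSums(cm)  # TP / (TP + FN), per class

# Hypothetical helper implementing the class-wise formula.
fbeta_sketch <- function(precision, recall, beta = 1) {
  (1 + beta^2) * (precision * recall) / (beta^2 * precision + recall)
}

f1 <- fbeta_sketch(precision, recall)
f2 <- fbeta_sketch(precision, recall, beta = 2)
print(rbind(f1, f2))
```

In this toy example precision equals recall in each class, so every \eqn{F_\beta} score collapses to that common value regardless of `beta`.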
19 changes: 7 additions & 12 deletions R/S3_FalseDiscoveryRate.R
@@ -7,12 +7,11 @@
#' Compute the \eqn{\text{false}} \eqn{\text{discovery}} \eqn{\text{rate}}
#'
#' @description
#'
#' The [fdr()]-function computes the [false discovery rate](https://en.wikipedia.org/wiki/False_discovery_rate) (FDR), the proportion of false positives among the predicted positives, between
#' two vectors of predicted and observed [factor()] values.
#' two vectors of predicted and observed [factor()] values. The [weighted.fdr()] function computes the weighted false discovery rate.
#'
#' When `aggregate = TRUE`, the function returns the micro-average FDR across all classes \eqn{k}. By default, it returns the class-wise FDR.
#'
#' @example man/examples/scr_fdr.R
#' @example man/examples/scr_FalseDiscoveryRate.R
#'
#' @inherit specificity
#'
@@ -25,14 +24,10 @@
#' }
#'
#' Where \eqn{\#TP_k} and \eqn{\#FP_k} are the number of true positives and false positives, respectively, for each class \eqn{k}.
#'
#' When `aggregate = TRUE` the `micro`-average is calculated,
#'
#' \deqn{
#' \frac{\sum_{k=1}^k \#FP_k}{\sum_{k=1}^k \#TP_k + \sum_{k=1}^k \#FP_k}
#' }
#'
#' @family classification
#'
#' @family Classification
#' @family Supervised Learning
#'
#' @export
fdr <- function(...) {
UseMethod(
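The class-wise false discovery rate described above is the share of false positives among the predicted positives of each class. A minimal base R sketch with made-up data follows; SLmetrics is not called.

```r
# Toy data, invented for illustration.
actual    <- factor(c("a", "a", "b", "b", "b"), levels = c("a", "b"))
predicted <- factor(c("a", "b", "b", "b", "a"), levels = c("a", "b"))

# Rows are actual classes, columns are predicted classes.
cm <- table(actual, predicted)

tp <- diag(cm)          # correctly predicted, per class
fp <- colSums(cm) - tp  # predicted as class k, but actually another class

# FDR per class: FP / (TP + FP), which equals 1 - precision.
fdr_k <- fp / (tp + fp)
print(fdr_k)
```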