From 8cffcd684233a480669cf268e7e3e6c8689acb8f Mon Sep 17 00:00:00 2001 From: serkor1 <77464572+serkor1@users.noreply.github.com> Date: Mon, 6 Jan 2025 05:55:42 +0100 Subject: [PATCH] Revert "Updated .gitignore to exclude _freeze" This reverts commit 9e37665a5efc63d4af0ff923d5e96c1a70674296. --- .gitignore | 2 -- docs/_freeze/intro/execute-results/html.json | 15 --------------- docs/_freeze/site_libs/clipboard/clipboard.min.js | 7 ------- docs/_freeze/summary/execute-results/html.json | 15 --------------- 4 files changed, 39 deletions(-) delete mode 100644 docs/_freeze/intro/execute-results/html.json delete mode 100644 docs/_freeze/site_libs/clipboard/clipboard.min.js delete mode 100644 docs/_freeze/summary/execute-results/html.json diff --git a/.gitignore b/.gitignore index 2ad86fc..6d03c21 100644 --- a/.gitignore +++ b/.gitignore @@ -58,8 +58,6 @@ src/.idea /.quarto/ docs/*/ -!docs/_freeze/ - # html-files # (from rendering markdown/quarto) *.html diff --git a/docs/_freeze/intro/execute-results/html.json b/docs/_freeze/intro/execute-results/html.json deleted file mode 100644 index a6cb192..0000000 --- a/docs/_freeze/intro/execute-results/html.json +++ /dev/null @@ -1,15 +0,0 @@ -{ - "hash": "6f8a663956d151f936ec0be4aa2f61eb", - "result": { - "engine": "knitr", - "markdown": "::: {.callout-note}\nThe discussion in this section is academic; I have the utmost respect for all the developers, contributors and users of the {pkgs}. We are, after all, united in our love for programming, data science and `R`.\n:::\n\n# Introduction\n\nThere are currently three {pkgs} that are developed with machine learning performance evaluation in mind: [{MLmetrics}](https://github.com/yanyachen/MLmetrics), [{yardstick}](https://github.com/tidymodels/yardstick) and [{mlr3measures}](https://github.com/mlr-org/mlr3measures). These {pkgs} have historically bridged the gap between `R` and `Python` in terms of machine learning and data science.\n\n## The status quo of {pkgs}\n\n[{MLmetrics}](https://github.com/yanyachen/MLmetrics) can be considered *the* legacy code when it comes to performance evaluation, and it served as a backend in [{yardstick}](https://github.com/tidymodels/yardstick) up to [version 0.0.2](https://yardstick.tidymodels.org/news/index.html#yardstick-002). It is built entirely on base R and has been stable since its inception almost 10 years ago.\n\nHowever, it appears that the development has peaked and is currently dormant - see, for example, this stale [PR](https://github.com/yanyachen/MLmetrics/pull/3) related to this [issue](https://github.com/yanyachen/MLmetrics/issues/2). Micro- and macro-averages have been implemented in [{scikit-learn}](https://github.com/scikit-learn/scikit-learn) for many years, and [{MLmetrics}](https://github.com/yanyachen/MLmetrics) simply didn't keep up with that development.\n\n[{yardstick}](https://github.com/tidymodels/yardstick), on the other hand, carried the torch forward and implemented these modern features. 
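As an illustration - a minimal sketch, assuming the current [{yardstick}](https://github.com/tidymodels/yardstick) interface and hypothetical data - macro-averaged precision is available through both the data.frame-based and the vector-based functions:\n\n```r\nlibrary(yardstick)\n\n# hypothetical three-class data, not part of the benchmarks below\nset.seed(1903)\nactual <- factor(sample(letters[1:3], size = 1e5, replace = TRUE))\npredicted <- factor(sample(letters[1:3], size = 1e5, replace = TRUE))\n\n# vector interface: macro-averaged precision\nprecision_vec(truth = actual, estimate = predicted, estimator = \"macro\")\n\n# data.frame interface: same metric, different call\nDF <- data.frame(actual = actual, predicted = predicted)\nprecision(DF, truth = actual, estimate = predicted, estimator = \"macro\")\n```\n\n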
[{yardstick}](https://github.com/tidymodels/yardstick) closely follows the syntax, naming and functionality of [{scikit-learn}](https://github.com/scikit-learn/scikit-learn), but is built with [{tidyverse}](https://github.com/tidyverse) tools; although the source code is nice to look at, this tooling does introduce serious overhead and carries the risk of deprecations.\n\nFurthermore, it complicates a simple application with its verbose function naming - see, for example, the `metric()`-function for data frames and the `metric_vec()`-function for vectors: the output is the same, but the call is different. In addition, [{yardstick}](https://github.com/tidymodels/yardstick) can't handle more than one positive class at a time, so the end-user is forced to run the same function more than once to get performance metrics for the remaining classes.\n\n### Summary\n\nIn short, the existing {pkgs} are outdated, inefficient and insufficient for modern large-scale machine learning applications.\n\n## Why {SLmetrics}?\n\nAs the name suggests, [{SLmetrics}](https://github.com/serkor1/SLmetrics) closely resembles [{MLmetrics}](https://github.com/yanyachen/MLmetrics) in its *simple* and *low-level* implementation of machine learning metrics. The resemblance ends there, however.\n\n[{SLmetrics}](https://github.com/serkor1/SLmetrics) is developed with three things in mind: *speed*, *efficiency* and *scalability*. It therefore addresses the shortcomings of the status quo by construction - the {pkg} is built on `c++` and [{Rcpp}](https://github.com/RcppCore/Rcpp) from the ground up. See @tbl-rmse-speed, where the RMSE is computed on two vectors of 10 million elements each.\n\n\n\n\n\n\n\n\n::: {#tbl-rmse-speed .cell tbl-cap='Calculating RMSE on 1e7 vectors' messages='false' warnings='false'}\n\n```{.r .cell-code code-fold=\"true\"}\nset.seed(1903)\nactual <- rnorm(1e7)\npredicted <- actual + rnorm(1e7)\n\nbench::mark(\n `{SLmetrics}` = SLmetrics::rmse(actual, predicted),\n `{MLmetrics}` = MLmetrics::RMSE(predicted, actual),\n iterations = 100\n)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 2 × 6\n expression min median `itr/sec` mem_alloc `gc/sec`\n <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>\n1 {SLmetrics} 81ms 82.5ms 12.1 2.44KB 0 \n2 {MLmetrics} 68.2ms 68.5ms 14.6 76.37MB 350.\n```\n\n\n:::\n:::\n\n\n\n\n\n\n\n\nThis shows that well-written `R` code is hard to beat speed-wise. [{MLmetrics}](https://github.com/yanyachen/MLmetrics) is roughly 20\\% faster - but uses 30,000 times more memory. How about constructing a confusion matrix?\n\n\n\n\n\n\n\n\n::: {#tbl-confusion_matrix-speed .cell tbl-cap='Computing a 3x3 confusion matrix on 1e7 vectors'}\n\n```{.r .cell-code code-fold=\"true\"}\nset.seed(1903)\nactual <- factor(sample(letters[1:3], size = 1e7, replace = TRUE))\npredicted <- factor(sample(letters[1:3], size = 1e7, replace = TRUE))\n\nbench::mark(\n `{SLmetrics}` = SLmetrics::cmatrix(actual, predicted),\n `{MLmetrics}` = MLmetrics::ConfusionMatrix(actual, predicted),\n check = FALSE,\n iterations = 100\n)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 2 × 6\n expression min median `itr/sec` mem_alloc `gc/sec`\n <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>\n1 {SLmetrics} 6.08ms 6.12ms 163. 
2.51KB 0 \n2 {MLmetrics} 303.5ms 324.09ms 2.95 381.64MB 7.63\n```\n\n\n:::\n:::\n\n\n\n\n\n\n\n\n[{SLmetrics}](https://github.com/serkor1/SLmetrics) uses roughly 1/50th of the time of [{MLmetrics}](https://github.com/yanyachen/MLmetrics), and its memory usage is comparable to the previous example - still significantly less than that of [{MLmetrics}](https://github.com/yanyachen/MLmetrics).\n\n### Summary\n\n[{SLmetrics}](https://github.com/serkor1/SLmetrics) is, in the worst-case scenario, on par with low-level `R` implementations of equivalent metrics, and it is many times more memory-efficient than *any* of the {pkgs}. A detailed benchmark can be found here.\n\n## Key takeaways\n\n", - "supporting": [], - "filters": [ - "rmarkdown/pagebreak.lua" - ], - "includes": {}, - "engineDependencies": {}, - "preserve": {}, - "postProcess": true - } -} \ No newline at end of file diff --git a/docs/_freeze/site_libs/clipboard/clipboard.min.js b/docs/_freeze/site_libs/clipboard/clipboard.min.js deleted file mode 100644 index 1103f81..0000000 --- a/docs/_freeze/site_libs/clipboard/clipboard.min.js +++ /dev/null @@ -1,7 +0,0 @@ -/*! - * clipboard.js v2.0.11 - * https://clipboardjs.com/ - * - * Licensed MIT © Zeno Rocha - */ -!function(t,e){"object"==typeof exports&&"object"==typeof module?module.exports=e():"function"==typeof define&&define.amd?define([],e):"object"==typeof exports?exports.ClipboardJS=e():t.ClipboardJS=e()}(this,function(){return n={686:function(t,e,n){"use strict";n.d(e,{default:function(){return b}});var e=n(279),i=n.n(e),e=n(370),u=n.n(e),e=n(817),r=n.n(e);function c(t){try{return document.execCommand(t)}catch(t){return}}var a=function(t){t=r()(t);return c("cut"),t};function o(t,e){var n,o,t=(n=t,o="rtl"===document.documentElement.getAttribute("dir"),(t=document.createElement("textarea")).style.fontSize="12pt",t.style.border="0",t.style.padding="0",t.style.margin="0",t.style.position="absolute",t.style[o?"right":"left"]="-9999px",o=window.pageYOffset||document.documentElement.scrollTop,t.style.top="".concat(o,"px"),t.setAttribute("readonly",""),t.value=n,t);return e.container.appendChild(t),e=r()(t),c("copy"),t.remove(),e}var f=function(t){var e=1