Spring Cleaning 2023 #124

merged 8 commits into from
Mar 7, 2023
50 changes: 24 additions & 26 deletions → .github/
@@ -6,8 +6,8 @@ We as members, contributors, and leaders pledge to make participation in our
community a harassment-free experience for everyone, regardless of age, body
size, visible or invisible disability, ethnicity, sex characteristics, gender
identity and expression, level of experience, education, socio-economic status,
nationality, personal appearance, race, religion, or sexual identity and
nationality, personal appearance, race, caste, color, religion, or sexual
identity and orientation.

We pledge to act and interact in ways that contribute to an open, welcoming,
diverse, inclusive, and healthy community.
@@ -21,25 +21,25 @@ community include:
* Being respectful of differing opinions, viewpoints, and experiences
* Giving and gracefully accepting constructive feedback
* Accepting responsibility and apologizing to those affected by our mistakes,
and learning from the experience
and learning from the experience
* Focusing on what is best not just for us as individuals, but for the overall

Examples of unacceptable behavior include:

* The use of sexualized language or imagery, and sexual attention or
advances of any kind
* The use of sexualized language or imagery, and sexual attention or advances of
any kind
* Trolling, insulting or derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or email
address, without their explicit permission
* Publishing others' private information, such as a physical or email address,
without their explicit permission
* Other conduct which could reasonably be considered inappropriate in a
professional setting
professional setting

## Enforcement Responsibilities

Community leaders are responsible for clarifying and enforcing our standards
of acceptable behavior and will take appropriate and fair corrective action in
Community leaders are responsible for clarifying and enforcing our standards of
acceptable behavior and will take appropriate and fair corrective action in
response to any behavior that they deem inappropriate, threatening, offensive,
or harmful.

@@ -50,17 +50,17 @@ decisions when appropriate.

## Scope

This Code of Conduct applies within all community spaces, and also applies
when an individual is officially representing the community in public spaces.
Examples of representing our community include using an official e-mail
address, posting via an official social media account, or acting as an appointed
This Code of Conduct applies within all community spaces, and also applies when
an individual is officially representing the community in public spaces.
Examples of representing our community include using an official e-mail address,
posting via an official social media account, or acting as an appointed
representative at an online or offline event.

## Enforcement

Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported to the community leaders responsible for enforcement at [INSERT CONTACT
METHOD]. All complaints will be reviewed and investigated promptly and fairly.
reported to the community leaders responsible for enforcement at [email protected].
All complaints will be reviewed and investigated promptly and fairly.

All community leaders are obligated to respect the privacy and security of the
reporter of any incident.
@@ -114,15 +114,13 @@ community.
## Attribution

This Code of Conduct is adapted from the [Contributor Covenant][homepage],
version 2.0,
available at
version 2.1, available at

Community Impact Guidelines were inspired by [Mozilla's code of conduct
enforcement ladder](

Community Impact Guidelines were inspired by
[Mozilla's code of conduct enforcement ladder][].

For answers to common questions about this code of conduct, see the FAQ at Translations are available at https://
<>. Translations are available at <>.

17 changes: 9 additions & 8 deletions .github/workflows/R-CMD-check.yaml
@@ -22,25 +22,26 @@ jobs:
fail-fast: false
- {os: macOS-latest, r: 'release'}
- {os: macos-latest, r: 'release'}

- {os: windows-latest, r: 'release'}
# Use 3.6 to trigger usage of RTools35
- {os: windows-latest, r: '3.6'}
# use 4.1 to check with rtools40's older compiler
- {os: windows-latest, r: '4.1'}

# Use older ubuntu to maximise backward compatibility
- {os: ubuntu-20.04, r: 'devel', http-user-agent: 'release'}
- {os: ubuntu-20.04, r: 'release'}
- {os: ubuntu-20.04, r: 'oldrel-1'}
- {os: ubuntu-20.04, r: 'oldrel-2'}
- {os: ubuntu-20.04, r: 'oldrel-3'}
- {os: ubuntu-latest, r: 'devel', http-user-agent: 'release'}
- {os: ubuntu-latest, r: 'release'}
- {os: ubuntu-latest, r: 'oldrel-1'}
- {os: ubuntu-latest, r: 'oldrel-2'}
- {os: ubuntu-latest, r: 'oldrel-3'}


- uses: actions/checkout@v2
- uses: actions/checkout@v3

- uses: r-lib/actions/setup-pandoc@v2

4 changes: 2 additions & 2 deletions .github/workflows/pkgdown.yaml
@@ -20,7 +20,7 @@ jobs:
- uses: actions/checkout@v2
- uses: actions/checkout@v3

- uses: r-lib/actions/setup-pandoc@v2

@@ -39,7 +39,7 @@ jobs:

- name: Deploy to GitHub pages 🚀
if: github.event_name != 'pull_request'
uses: JamesIves/[email protected].4
uses: JamesIves/github-pages-deploy-action@v4.4.1
with:
  clean: false
  branch: gh-pages
4 changes: 2 additions & 2 deletions .github/workflows/pr-commands.yaml
@@ -14,7 +14,7 @@ jobs:
- uses: actions/checkout@v2
- uses: actions/checkout@v3

- uses: r-lib/actions/pr-fetch@v2
@@ -51,7 +51,7 @@ jobs:
- uses: actions/checkout@v2
- uses: actions/checkout@v3

- uses: r-lib/actions/pr-fetch@v2
23 changes: 21 additions & 2 deletions .github/workflows/test-coverage.yaml
@@ -15,7 +15,7 @@ jobs:

- uses: actions/checkout@v2
- uses: actions/checkout@v3

- uses: r-lib/actions/setup-r@v2
@@ -27,5 +27,24 @@ jobs:
needs: coverage

- name: Test coverage
run: covr::codecov(quiet = FALSE)
run: |
  covr::codecov(
    quiet = FALSE,
    clean = FALSE,
    install_path = file.path(Sys.getenv("RUNNER_TEMP"), "package")
  )
shell: Rscript {0}

- name: Show testthat output
if: always()
run: |
## --------------------------------------------------------------------
find ${{ runner.temp }}/package -name 'testthat.Rout*' -exec cat '{}' \; || true
shell: bash

- name: Upload test results
if: failure()
uses: actions/upload-artifact@v3
with:
  name: coverage-test-failures
  path: ${{ runner.temp }}/package
8 changes: 5 additions & 3 deletions DESCRIPTION
@@ -1,9 +1,11 @@
Package: themis
Title: Extra Recipes Steps for Dealing with Unbalanced Data
person("Emil", "Hvitfeldt", , "[email protected]", role = c("aut", "cre"),
comment = c(ORCID = "0000-0002-0679-1945"))
Authors@R: c(
person("Emil", "Hvitfeldt", , "[email protected]", role = c("aut", "cre"),
comment = c(ORCID = "0000-0002-0679-1945")),
person(given = "Posit Software, PBC", role = c("cph", "fnd"))
)
Description: A dataset with an uneven number of cases in each class is
said to be unbalanced. Many models produce a subpar performance on
unbalanced datasets. A dataset can be balanced by increasing the
4 changes: 2 additions & 2 deletions LICENSE
@@ -1,2 +1,2 @@
YEAR: 2020
YEAR: 2023
COPYRIGHT HOLDER: themis authors
2 changes: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
# MIT License

Copyright (c) 2020 Emil Hvitfeldt
Copyright (c) 2023 themis authors

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
9 changes: 7 additions & 2 deletions README.Rmd
@@ -38,8 +38,8 @@ install.packages("themis")
Install the development version from GitHub with:

``` r
# install.packages("remotes")
# install.packages("pak")
```

## Example
@@ -73,6 +73,7 @@ ds_rec %>%
Below is some unbalanced data, used for the examples later.

#| fig-alt: "Bar chart with 5 columns. class on the x-axis and count on the y-axis. Class a has height 10, b has 20, c has 30, d has 40, and e has 50."
example_data <- data.frame(class = letters[rep(1:5, 1:5 * 10)],
x = rnorm(150))

@@ -99,6 +100,7 @@ The following methods all share the tuning parameter `over_ratio`, which is the
By setting `over_ratio = 1` you bring the number of samples of all minority classes equal to 100% of the majority class.

#| fig-alt: "Bar chart with 5 columns. class on the x-axis and count on the y-axis. class a, b, c, d, and e all have a height of 50."
recipe(~., example_data) %>%
step_upsample(class, over_ratio = 1) %>%
prep() %>%
@@ -110,6 +112,7 @@ recipe(~., example_data) %>%
and by setting `over_ratio = 0.5` we upsample any minority class with fewer samples than 50% of the majority up to 50% of the majority.

#| fig-alt: "Bar chart with 5 columns. class on the x-axis and count on the y-axis. Class a has height 25, b has 25, c has 30, d has 40, and e has 50."
recipe(~., example_data) %>%
step_upsample(class, over_ratio = 0.5) %>%
prep() %>%
@@ -131,6 +134,7 @@ Most of the following methods share the tuning parameter `under_ratio`,
By setting `under_ratio = 1` you bring the number of samples of all majority classes equal to 100% of the minority class.

#| fig-alt: "Bar chart with 5 columns. class on the x-axis and count on the y-axis. Class a, b, c, d, and e all have a height of 10."
recipe(~., example_data) %>%
step_downsample(class, under_ratio = 1) %>%
prep() %>%
@@ -142,6 +146,7 @@ recipe(~., example_data) %>%
and by setting `under_ratio = 2` we downsample any majority class with more than 200% samples of the minority class down to 200% samples of the minority.

#| fig-alt: "Bar chart with 5 columns. class on the x-axis and count on the y-axis. Class a has height 10, b, c, d, and e have a height of 20."
recipe(~., example_data) %>%
step_downsample(class, under_ratio = 2) %>%
prep() %>%
38 changes: 19 additions & 19 deletions
@@ -34,8 +34,8 @@ install.packages("themis")
Install the development version from GitHub with:

``` r
# install.packages("remotes")
# install.packages("pak")
```

## Example
@@ -93,7 +93,7 @@ example_data %>%

<img src="man/figures/README-unnamed-chunk-2-1.png" width="100%" />
<img src="man/figures/README-unnamed-chunk-2-1.png" alt="Bar chart with 5 columns. class on the x-axis and count on the y-axis. Class a has height 10, b has 20, c has 30, d has 40, and e has 50." width="100%" />

### Upsample / Over-sampling

@@ -121,7 +121,7 @@ recipe(~., example_data) %>%

<img src="man/figures/README-unnamed-chunk-3-1.png" width="100%" />
<img src="man/figures/README-unnamed-chunk-3-1.png" alt="Bar chart with 5 columns. class on the x-axis and count on the y-axis. class a, b, c, d, and e all have a height of 50." width="100%" />

and by setting `over_ratio = 0.5` we upsample any minority class with
fewer samples than 50% of the majority up to 50% of the majority.
@@ -135,7 +135,7 @@ recipe(~., example_data) %>%

<img src="man/figures/README-unnamed-chunk-4-1.png" width="100%" />
<img src="man/figures/README-unnamed-chunk-4-1.png" alt="Bar chart with 5 columns. class on the x-axis and count on the y-axis. Class a has height 25, b has 25, c has 30, d has 40, and e has 50." width="100%" />
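
The class sizes described in the two upsampling charts follow from simple arithmetic. Below is a minimal sketch in base R, assuming the target size is `over_ratio` times the majority count (rounded down) and that classes already at or above the target are left untouched. `upsample_targets()` is a hypothetical helper written for illustration; it is not a themis function.

``` r
# Class counts from the example data: a = 10, b = 20, c = 30, d = 40, e = 50.
counts <- c(a = 10, b = 20, c = 30, d = 40, e = 50)

# Hypothetical helper (not part of themis): the class sizes implied by a
# given over_ratio, under the assumptions stated above.
upsample_targets <- function(counts, over_ratio) {
  target <- floor(max(counts) * over_ratio)
  # Classes below the target are raised to it; larger classes are unchanged.
  pmax(counts, target)
}

upsample_targets(counts, over_ratio = 1)   # every class is raised to 50
upsample_targets(counts, over_ratio = 0.5) # a and b are raised to 25
```

This reproduces only the resulting class sizes, not which rows `step_upsample()` replicates to reach them.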

### Downsample / Under-sampling

@@ -161,7 +161,7 @@ recipe(~., example_data) %>%

<img src="man/figures/README-unnamed-chunk-5-1.png" width="100%" />
<img src="man/figures/README-unnamed-chunk-5-1.png" alt="Bar chart with 5 columns. class on the x-axis and count on the y-axis. Class a, b, c, d, and e all have a height of 10." width="100%" />

and by setting `under_ratio = 2` we downsample any majority class with
more than 200% samples of the minority class down to 200%
@@ -176,26 +176,26 @@ recipe(~., example_data) %>%

<img src="man/figures/README-unnamed-chunk-6-1.png" width="100%" />
<img src="man/figures/README-unnamed-chunk-6-1.png" alt="Bar chart with 5 columns. class on the x-axis and count on the y-axis. Class a has height 10, b, c, d, and e have a height of 20." width="100%" />
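
The downsampled class sizes can be checked the same way. Below is a minimal sketch in base R, assuming the cap is `under_ratio` times the minority count (rounded down) and that classes already at or below the cap are left untouched. `downsample_targets()` is a hypothetical helper written for illustration; it is not a themis function.

``` r
# Class counts from the example data: a = 10, b = 20, c = 30, d = 40, e = 50.
counts <- c(a = 10, b = 20, c = 30, d = 40, e = 50)

# Hypothetical helper (not part of themis): the class sizes implied by a
# given under_ratio, under the assumptions stated above.
downsample_targets <- function(counts, under_ratio) {
  cap <- floor(min(counts) * under_ratio)
  # Classes above the cap are reduced to it; smaller classes are unchanged.
  pmin(counts, cap)
}

downsample_targets(counts, under_ratio = 1) # every class is reduced to 10
downsample_targets(counts, under_ratio = 2) # c, d and e are reduced to 20
```

As with the upsampling sketch, this covers only the resulting class sizes, not which rows `step_downsample()` keeps.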

## Contributing

This project is released with a [Contributor Code of
By contributing to this project, you agree to abide by its terms.

- For questions and discussions about tidymodels packages, modeling,
and machine learning, [join us on RStudio
- For questions and discussions about tidymodels packages, modeling, and
machine learning, [join us on RStudio

- If you think you have encountered a bug, please [submit an
- If you think you have encountered a bug, please [submit an

- Either way, learn how to create and share a
(a minimal, reproducible example), to clearly communicate about your
- Either way, learn how to create and share a
(a minimal, reproducible example), to clearly communicate about your

- Check out further details on [contributing guidelines for tidymodels
packages]( and [how to get
- Check out further details on [contributing guidelines for tidymodels
packages]( and [how to get
Binary file modified man/figures/logo.png
7 changes: 6 additions & 1 deletion man/themis-package.Rd
