Tweaks to docs + wrap long lines + add function index #559

Merged
merged 20 commits into from
Jan 25, 2024
Changes from 16 commits
4 changes: 2 additions & 2 deletions NEWS.md
Original file line number Diff line number Diff line change
@@ -22,11 +22,11 @@ These are all minor breaking changes resulting from enhancements and are not exp

* `get_one_to_one()` no longer errors with near-equal values that become identical factor levels (fix #543, thanks to @olivroy for reporting)

# Refactoring
## Refactoring

* Remove dplyr verbs superseded in dplyr 1.0.0 (#547, @olivroy)

* Restyle the package and vignettes according to the [tidyverse style guide](style.tidyverse.org) (#548, olivroy)
* Restyle the package and vignettes according to the [tidyverse style guide](https://style.tidyverse.org) (#548, olivroy)

# janitor 2.2.0 (2023-02-02)

25 changes: 18 additions & 7 deletions R/adorn_ns.R
@@ -1,14 +1,25 @@
#' Add underlying Ns to a tabyl displaying percentages.
#'
#' This function adds back the underlying Ns to a `tabyl` whose percentages were calculated using `adorn_percentages()`, to display the Ns and percentages together. You can also call it on a non-tabyl data.frame to which you wish to append Ns.
#' This function adds back the underlying Ns to a `tabyl` whose percentages were
#' calculated using [adorn_percentages()], to display the Ns and percentages together.
#' You can also call it on a non-tabyl data.frame to which you wish to append Ns.
#'
#' @param dat a data.frame of class `tabyl` that has had `adorn_percentages` and/or `adorn_pct_formatting` called on it. If given a list of data.frames, this function will apply itself to each data.frame in the list (designed for 3-way `tabyl` lists).
#' @param position should the N go in the front, or in the rear, of the percentage?
#' @param ns the Ns to append. The default is the "core" attribute of the input tabyl `dat`, where the original Ns of a two-way `tabyl` are stored. However, if your Ns are stored somewhere else, or you need to customize them beyond what can be done with `format_func`, you can supply them here.
#' @param format_func a formatting function to run on the Ns. Consider defining with [base::format()].
#' @param ... columns to adorn. This takes a tidyselect specification. By default, all columns are adorned except for the first column and columns not of class `numeric`, but this allows you to manually specify which columns should be adorned, for use on a data.frame that does not result from a call to `tabyl`.
#' @param dat A data.frame of class `tabyl` that has had `adorn_percentages` and/or
#' `adorn_pct_formatting` called on it. If given a list of data.frames,
#' this function will apply itself to each data.frame in the list (designed for 3-way `tabyl` lists).
#' @param position Should the N go in the front, or in the rear, of the percentage?
#' @param ns The Ns to append. The default is the "core" attribute of the input tabyl
#' `dat`, where the original Ns of a two-way `tabyl` are stored. However, if your Ns
#' are stored somewhere else, or you need to customize them beyond what can be done
#' with `format_func`, you can supply them here.
#' @param format_func A formatting function to run on the Ns. Consider defining
#' with [base::format()].
#' @param ... Columns to adorn. This takes a tidyselect specification. By default,
#' all columns are adorned except for the first column and columns not of class
#' `numeric`, but this allows you to manually specify which columns should be adorned,
#' for use on a data.frame that does not result from a call to `tabyl`.
#'
#' @return a data.frame with Ns appended
#' @return A `data.frame` with Ns appended
#' @export
#' @examples
#' mtcars %>%
18 changes: 13 additions & 5 deletions R/adorn_percentages.R
@@ -1,13 +1,21 @@
#' Convert a data.frame of counts to percentages.
#'
#' This function defaults to excluding the first column of the input data.frame, assuming that it contains a descriptive variable, but this can be overridden by specifying the columns to adorn in the `...` argument.
#' This function defaults to excluding the first column of the input data.frame,
#' assuming that it contains a descriptive variable, but this can be overridden
#' by specifying the columns to adorn in the `...` argument.
#'
#' @param dat a `tabyl` or other data.frame with a tabyl-like layout. If given a list of data.frames, this function will apply itself to each data.frame in the list (designed for 3-way `tabyl` lists).
#' @param denominator the direction to use for calculating percentages. One of "row", "col", or "all".
#' @param dat A `tabyl` or other data.frame with a tabyl-like layout.
#' If given a list of data.frames, this function will apply itself to each
#' `data.frame` in the list (designed for 3-way `tabyl` lists).
#' @param denominator The direction to use for calculating percentages.
#' One of "row", "col", or "all".
#' @param na.rm should missing values (including NaN) be omitted from the calculations?
#' @param ... columns to adorn. This takes a tidyselect specification. By default, all numeric columns (besides the initial column, if numeric) are adorned, but this allows you to manually specify which columns should be adorned, for use on a data.frame that does not result from a call to `tabyl`.
#' @param ... columns to adorn. This takes a <[`tidy-select`][dplyr::dplyr_tidy_select]>
#' specification. By default, all numeric columns (besides the initial column, if numeric)
#' are adorned, but this allows you to manually specify which columns should
#' be adorned, for use on a `data.frame` that does not result from a call to [tabyl()].
#'
#' @return Returns a data.frame of percentages, expressed as numeric values between 0 and 1.
#' @return A `data.frame` of percentages, expressed as numeric values between 0 and 1.
#' @export
#' @examples
#'
31 changes: 23 additions & 8 deletions R/adorn_rounding.R
@@ -1,16 +1,29 @@
#' Round the numeric columns in a data.frame.
#'
#' @description
#' Can run on any data.frame with at least one numeric column. This function defaults to excluding the first column of the input data.frame, assuming that it contains a descriptive variable, but this can be overridden by specifying the columns to round in the `...` argument.
#' Can run on any `data.frame` with at least one numeric column.
#' This function defaults to excluding the first column of the input data.frame,
#' assuming that it contains a descriptive variable, but this can be overridden by
#' specifying the columns to round in the `...` argument.
#'
#' If you're formatting percentages, e.g., the result of `adorn_percentages()`, use `adorn_pct_formatting()` instead. This is a more flexible variant for ad-hoc usage. Compared to `adorn_pct_formatting()`, it does not multiply by 100 or pad the numbers with spaces for alignment in the results data.frame. This function retains the class of numeric input columns.
#' If you're formatting percentages, e.g., the result of [adorn_percentages()],
#' use [adorn_pct_formatting()] instead. This is a more flexible variant for ad-hoc usage.
#' Compared to `adorn_pct_formatting()`, it does not multiply by 100 or pad the
#' numbers with spaces for alignment in the results `data.frame`.
#' This function retains the class of numeric input columns.
#'
#' @param dat a `tabyl` or other data.frame with similar layout. If given a list of data.frames, this function will apply itself to each data.frame in the list (designed for 3-way `tabyl` lists).
#' @param digits how many digits should be displayed after the decimal point?
#' @param rounding method to use for rounding - either "half to even", the base R default method, or "half up", where 14.5 rounds up to 15.
#' @param ... columns to adorn. This takes a tidyselect specification. By default, all numeric columns (besides the initial column, if numeric) are adorned, but this allows you to manually specify which columns should be adorned, for use on a data.frame that does not result from a call to `tabyl`.
#' @param dat A `tabyl` or other `data.frame` with similar layout.
#' If given a list of data.frames, this function will apply itself to each
#' `data.frame` in the list (designed for 3-way `tabyl` lists).
#' @param digits How many digits should be displayed after the decimal point?
#' @param rounding Method to use for rounding - either "half to even"
#' (the base R default method), or "half up", where 14.5 rounds up to 15.
#' @param ... Columns to adorn. This takes a tidyselect specification.
#' By default, all numeric columns (besides the initial column, if numeric)
#' are adorned, but this allows you to manually specify which columns should
#' be adorned, for use on a data.frame that does not result from a call to `tabyl`.
#'
#' @return Returns the data.frame with rounded numeric columns.
#' @return The `data.frame` with rounded numeric columns.
#' @export
#' @examples
#'
@@ -54,7 +67,9 @@ adorn_rounding <- function(dat, digits = 1, rounding = "half to even", ...) {
}
numeric_cols <- which(vapply(dat, is.numeric, logical(1)))
non_numeric_cols <- setdiff(1:ncol(dat), numeric_cols)
numeric_cols <- setdiff(numeric_cols, 1) # assume 1st column should not be included so remove it from numeric_cols. Moved up to this line so that if only 1st col is numeric, the function errors
# assume 1st column should not be included so remove it from numeric_cols.
# Moved up to this line so that if only 1st col is numeric, the function errors
numeric_cols <- setdiff(numeric_cols, 1)

if (rlang::dots_n(...) == 0) {
cols_to_round <- numeric_cols
55 changes: 41 additions & 14 deletions R/adorn_title.R
@@ -1,13 +1,30 @@
#' @title Add column name to the top of a two-way tabyl.
#' Add column name to the top of a two-way tabyl.
#'
#' @description
#' This function adds the column variable name to the top of a `tabyl` for a complete display of information. This makes the tabyl prettier, but renders the data.frame less useful for further manipulation.
#' This function adds the column variable name to the top of a `tabyl` for a
#' complete display of information. This makes the tabyl prettier, but renders
#' the `data.frame` less useful for further manipulation.
#'
#' @param dat a data.frame of class `tabyl` or other data.frame with a tabyl-like layout. If given a list of data.frames, this function will apply itself to each data.frame in the list (designed for 3-way `tabyl` lists).
#' @param placement whether the column name should be added to the top of the tabyl in an otherwise-empty row `"top"` or appended to the already-present row name variable (`"combined"`). The formatting in the `"top"` option has the look of base R's `table()`; it also wipes out the other column names, making it hard to further use the data.frame besides formatting it for reporting. The `"combined"` option is more conservative in this regard.
#' @param row_name (optional) default behavior is to pull the row name from the attributes of the input `tabyl` object. If you wish to override that text, or if your input is not a `tabyl`, supply a string here.
#' @param col_name (optional) default behavior is to pull the column_name from the attributes of the input `tabyl` object. If you wish to override that text, or if your input is not a `tabyl`, supply a string here.
#' @return the input tabyl, augmented with the column title. Non-tabyl inputs that are of class `tbl_df` are downgraded to basic data.frames so that the title row prints correctly.
#' The `placement` argument indicates whether the column name should be added to
#' the `top` of the tabyl in an otherwise-empty row `"top"` or appended to the
#' already-present row name variable (`"combined"`). The formatting in the `"top"`
#' option has the look of base R's `table()`; it also wipes out the other column
#' names, making it hard to further use the `data.frame` besides formatting it for reporting.
#' The `"combined"` option is more conservative in this regard.
#'
#' @param dat A `data.frame` of class `tabyl` or other `data.frame` with a tabyl-like layout.
#' If given a list of data.frames, this function will apply itself to each `data.frame`
#' in the list (designed for 3-way `tabyl` lists).
#' @param placement The title placement, one of `"top"`, or `"combined"`.
#' See **Details** for more information.
#' @param row_name (optional) default behavior is to pull the row name from the
#' attributes of the input `tabyl` object. If you wish to override that text,
#' or if your input is not a `tabyl`, supply a string here.
#' @param col_name (optional) default behavior is to pull the column_name from
#' the attributes of the input `tabyl` object. If you wish to override that text,
#' or if your input is not a `tabyl`, supply a string here.
#' @return The input `tabyl`, augmented with the column title. Non-tabyl inputs
#' that are of class `tbl_df` are downgraded to basic data.frames so that the
#' title row prints correctly.
#'
#' @export
#' @examples
@@ -38,12 +55,18 @@ adorn_title <- function(dat, placement = "top", row_name, col_name) {

if (inherits(dat, "tabyl")) {
if (attr(dat, "tabyl_type") == "one_way") {
warning("adorn_title is meant for two-way tabyls, calling it on a one-way tabyl may not yield a meaningful result")
warning(c(
"adorn_title is meant for two-way tabyls, ",
"calling it on a one-way tabyl may not yield a meaningful result"
))
}
}
if (missing(col_name)) {
if (!inherits(dat, "tabyl")) {
stop("When input is not a data.frame of class tabyl, a value must be specified for the col_name argument")
stop(c(
"When input is not a data.frame of class tabyl, ",
"a value must be specified for the col_name argument"
))
}
col_var <- attr(dat, "var_names")$col
} else {
@@ -63,13 +86,15 @@ adorn_title <- function(dat, placement = "top", row_name, col_name) {
if (inherits(dat, "tabyl")) {
row_var <- attr(dat, "var_names")$row
} else {
row_var <- names(dat)[1] # for non-tabyl input, if no row_name supplied, use first existing name
# for non-tabyl input, if no row_name supplied, use first existing name
row_var <- names(dat)[1]
}
}


if (placement == "top") {
dat[, ] <- lapply(dat[, ], as.character) # to handle factors, problematic in first column and at bind_rows.
# to handle factors, problematic in first column and at bind_rows.
dat[, ] <- lapply(dat[, ], as.character)
# Can't use mutate_all b/c it strips attributes
top <- dat[1, ]

@@ -82,8 +107,10 @@ adorn_title <- function(dat, placement = "top", row_name, col_name) {
out <- dat
names(out)[1] <- paste(row_var, col_var, sep = "/")
}
if (inherits(out, "tbl_df")) { # "top" text doesn't print if input (and thus the output) is a tibble
out <- as.data.frame(out) # but this prints row numbers, so don't apply to non-tbl_dfs like tabyls
# "top" text doesn't print if input (and thus the output) is a tibble
if (inherits(out, "tbl_df")) {
# but this prints row numbers, so don't apply to non-tbl_dfs like tabyls
out <- as.data.frame(out)
}
out
}
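
Note on the wrapped `warning(c(...))` and `stop(c(...))` calls in `adorn_title()`: splitting a long message into a character vector should leave the message text unchanged, because base R joins the pieces with no separator when it builds the condition message. The snippet below is a minimal base-R sketch for checking that. It is not part of this PR, and the names `msg` and `single` are only illustrative.

```r
# Capture the condition message instead of letting the warning print.
msg <- tryCatch(
  warning(c(
    "adorn_title is meant for two-way tabyls, ",
    "calling it on a one-way tabyl may not yield a meaningful result"
  )),
  warning = function(w) conditionMessage(w)
)

# The original single-string message from before the line wrap.
single <- paste0(
  "adorn_title is meant for two-way tabyls, ",
  "calling it on a one-way tabyl may not yield a meaningful result"
)

# Compare the wrapped and unwrapped forms.
print(identical(msg, single))
```

If the two ever differed, the likely cause would be a missing trailing space at the end of the first piece, so that is the thing to check when wrapping other messages this way.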