Documentation with markdown and remove mentions to master branch #551

Merged
merged 24 commits into from
Aug 13, 2023
Changes from 20 commits
2 changes: 1 addition & 1 deletion .github/CONTRIBUTING.md
@@ -23,7 +23,7 @@ If your proposed contribution addresses multiple issues, it should ideally be br
* Make sure to track progress upstream (i.e., on our version of `janitor` at `sfirke/janitor`) by doing `git remote add upstream https://github.com/sfirke/janitor.git`. Before making changes make sure to pull changes in from upstream by doing either `git fetch upstream` then merge later or `git pull upstream` to fetch and merge in one step
* Make your changes (bonus points for making changes on a new feature branch)
* Push up to your account
* Submit a pull request to the master branch at `sfirke/janitor`
* Submit a pull request to the main branch at `sfirke/janitor`

### Prefer to discuss over email?
Email Sam. His email address is in the `DESCRIPTION` file of this repo.
50 changes: 27 additions & 23 deletions DESCRIPTION
@@ -1,22 +1,24 @@
Package: janitor
Title: Simple Tools for Examining and Cleaning Dirty Data
Version: 2.2.0.9000
Authors@R: c(person("Sam", "Firke", email = "[email protected]", role = c("aut", "cre")),
person("Bill", "Denney", email = "[email protected]", role = "ctb"),
person("Chris", "Haid", email = "[email protected]", role = "ctb"),
person("Ryan", "Knight", email = "[email protected]", role = "ctb"),
person("Malte", "Grosser", email = "[email protected]", role = "ctb"),
person("Jonathan", "Zadra", email = "[email protected]", role = "ctb"))
Description: The main janitor functions can: perfectly format data.frame column
names; provide quick counts of variable combinations (i.e., frequency
tables and crosstabs); and explore duplicate records. Other janitor functions
nicely format the tabulation results. These tabulate-and-report functions
approximate popular features of SPSS and Microsoft Excel. This package
follows the principles of the "tidyverse" and works well with the pipe function
%>%. janitor was built with beginning-to-intermediate R users in mind and is
optimized for user-friendliness.
URL: https://github.com/sfirke/janitor,
https://sfirke.github.io/janitor/
Authors@R: c(
person("Sam", "Firke", , "[email protected]", role = c("aut", "cre")),
person("Bill", "Denney", , "[email protected]", role = "ctb"),
person("Chris", "Haid", , "[email protected]", role = "ctb"),
person("Ryan", "Knight", , "[email protected]", role = "ctb"),
person("Malte", "Grosser", , "[email protected]", role = "ctb"),
person("Jonathan", "Zadra", , "[email protected]", role = "ctb")
)
Description: The main janitor functions can: perfectly format data.frame
column names; provide quick counts of variable combinations (i.e.,
frequency tables and crosstabs); and explore duplicate records. Other
janitor functions nicely format the tabulation results. These
tabulate-and-report functions approximate popular features of SPSS and
Microsoft Excel. This package follows the principles of the
"tidyverse" and works well with the pipe function %>%. janitor was
built with beginning-to-intermediate R users in mind and is optimized
for user-friendliness.
URL: https://github.com/sfirke/janitor, https://sfirke.github.io/janitor/
BugReports: https://github.com/sfirke/janitor/issues
Depends:
R (>= 3.1.2)
@@ -28,14 +30,12 @@ Imports:
magrittr,
purrr,
rlang,
snakecase (>= 0.9.2),
stringi,
stringr,
snakecase (>= 0.9.2),
tidyselect (>= 1.0.0),
tidyr (>= 1.0.0)
tidyr (>= 1.0.0),
tidyselect (>= 1.0.0)
License: MIT + file LICENSE
LazyData: true
RoxygenNote: 7.2.3
Suggests:
dbplyr,
knitr,
@@ -45,6 +45,10 @@ Suggests:
testthat (>= 3.0.0),
tibble,
tidygraph
VignetteBuilder: knitr
Encoding: UTF-8
VignetteBuilder:
knitr
Config/testthat/edition: 3
Encoding: UTF-8
LazyData: true
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.2.3
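
The `Roxygen: list(markdown = TRUE)` field is what lets roxygen2 read markdown in the roxygen comments, which is why the remaining files can use backticks instead of `\code{}`. As a rough illustration of that rewrite, the base R sketch below converts simple `\code{...}` spans to backticks. The helper name `code_to_backticks` is hypothetical (it is not part of janitor or roxygen2), and it deliberately ignores nested braces and other Rd macros.

```r
# Hypothetical helper: rewrite simple \code{...} spans as markdown backticks.
# Assumes the span contains no nested braces.
code_to_backticks <- function(lines) {
  gsub("\\\\code\\{([^{}]*)\\}", "`\\1`", lines, perl = TRUE)
}

old_line <- "#' @param dat a data.frame of class \\code{tabyl}."
new_line <- code_to_backticks(old_line)
print(new_line)
```

A regex like this is only a starting point; the converted comments still need a manual review, as the diffs in this pull request suggest.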
14 changes: 6 additions & 8 deletions R/adorn_ns.R
@@ -1,18 +1,16 @@
#' @title Add underlying Ns to a tabyl displaying percentages.
#' Add underlying Ns to a tabyl displaying percentages.
#'
#' @description
#' This function adds back the underlying Ns to a \code{tabyl} whose percentages were calculated using \code{adorn_percentages()}, to display the Ns and percentages together. You can also call it on a non-tabyl data.frame to which you wish to append Ns.
#' This function adds back the underlying Ns to a `tabyl` whose percentages were calculated using `adorn_percentages()`, to display the Ns and percentages together. You can also call it on a non-tabyl data.frame to which you wish to append Ns.
#'
#' @param dat a data.frame of class \code{tabyl} that has had \code{adorn_percentages} and/or \code{adorn_pct_formatting} called on it. If given a list of data.frames, this function will apply itself to each data.frame in the list (designed for 3-way \code{tabyl} lists).
#' @param dat a data.frame of class `tabyl` that has had `adorn_percentages` and/or `adorn_pct_formatting` called on it. If given a list of data.frames, this function will apply itself to each data.frame in the list (designed for 3-way `tabyl` lists).
#' @param position should the N go in the front, or in the rear, of the percentage?
#' @param ns the Ns to append. The default is the "core" attribute of the input tabyl \code{dat}, where the original Ns of a two-way \code{tabyl} are stored. However, if your Ns are stored somewhere else, or you need to customize them beyond what can be done with `format_func`, you can supply them here.
#' @param format_func a formatting function to run on the Ns. Consider defining with \code{base::format()}.
#' @param ... columns to adorn. This takes a tidyselect specification. By default, all columns are adorned except for the first column and columns not of class \code{numeric}, but this allows you to manually specify which columns should be adorned, for use on a data.frame that does not result from a call to \code{tabyl}.
#' @param ns the Ns to append. The default is the "core" attribute of the input tabyl `dat`, where the original Ns of a two-way `tabyl` are stored. However, if your Ns are stored somewhere else, or you need to customize them beyond what can be done with `format_func`, you can supply them here.
#' @param format_func a formatting function to run on the Ns. Consider defining with [base::format()].
#' @param ... columns to adorn. This takes a tidyselect specification. By default, all columns are adorned except for the first column and columns not of class `numeric`, but this allows you to manually specify which columns should be adorned, for use on a data.frame that does not result from a call to `tabyl`.
#'
#' @return a data.frame with Ns appended
#' @export
#' @examples
#'
#' mtcars %>%
#' tabyl(am, cyl) %>%
#' adorn_percentages("col") %>%
33 changes: 23 additions & 10 deletions R/adorn_pct_formatting.R
@@ -1,21 +1,34 @@
#' @title Format a data.frame of decimals as percentages.
#' Format a `data.frame` of decimals as percentages.
#'
#' @description
#' Numeric columns get multiplied by 100 and formatted as percentages according to user specifications. This function defaults to excluding the first column of the input data.frame, assuming that it contains a descriptive variable, but this can be overridden by specifying the columns to adorn in the \code{...} argument. Non-numeric columns are always excluded.
#' Numeric columns get multiplied by 100 and formatted as
#' percentages according to user specifications. This function defaults to
#' excluding the first column of the input data.frame, assuming that it contains
#' a descriptive variable, but this can be overridden by specifying the columns
#' to adorn in the `...` argument. Non-numeric columns are always excluded.
#'
#' The decimal separator character is the result of \code{getOption("OutDec")}, which is based on the user's locale. If the default behavior is undesirable,
#' change this value ahead of calling the function, either by changing locale or with \code{options(OutDec = ",")}. This aligns the decimal separator character with that used in \code{base::print()}.
#' The decimal separator character is the result of `getOption("OutDec")`, which
#' is based on the user's locale. If the default behavior is undesirable,
#' change this value ahead of calling the function, either by changing locale or
#' with `options(OutDec = ",")`. This aligns the decimal separator character
#' with that used in `base::print()`.
#'
#' @param dat a data.frame with decimal values, typically the result of a call to \code{adorn_percentages} on a \code{tabyl}. If given a list of data.frames, this function will apply itself to each data.frame in the list (designed for 3-way \code{tabyl} lists).
#' @param dat a data.frame with decimal values, typically the result of a call
#' to `adorn_percentages` on a `tabyl`. If given a list of data.frames, this
#' function will apply itself to each data.frame in the list (designed for
#' 3-way `tabyl` lists).
#' @param digits how many digits should be displayed after the decimal point?
#' @param rounding method to use for rounding - either "half to even", the base R default method, or "half up", where 14.5 rounds up to 15.
#' @param affix_sign should the \% sign be affixed to the end?
#' @param ... columns to adorn. This takes a tidyselect specification. By default, all numeric columns (besides the initial column, if numeric) are adorned, but this allows you to manually specify which columns should be adorned, for use on a data.frame that does not result from a call to \code{tabyl}.
#'
#' @param rounding method to use for rounding - either "half to even", the base
#' R default method, or "half up", where 14.5 rounds up to 15.
#' @param affix_sign should the % sign be affixed to the end?
#' @param ... columns to adorn. This takes a tidyselect specification. By
#' default, all numeric columns (besides the initial column, if numeric) are
#' adorned, but this allows you to manually specify which columns should be
#' adorned, for use on a data.frame that does not result from a call to
#' `tabyl`.
#' @return a data.frame with formatted percentages
#' @export
#' @examples
#'
#' mtcars %>%
#' tabyl(am, cyl) %>%
#' adorn_percentages("col") %>%
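
The documentation in this file notes that the decimal separator follows `getOption("OutDec")`. A minimal base R sketch of that option's effect, using `format()` rather than janitor itself, might look like the following. The variable names are illustrative only.

```r
# Temporarily switch the decimal separator, then restore the previous setting.
old_opts <- options(OutDec = ",")
with_comma <- format(12.5)   # formatted using "," as the decimal mark
options(old_opts)

print(with_comma)
```

Saving the return value of `options()` and restoring it afterwards keeps the change from leaking into the rest of the session.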
9 changes: 4 additions & 5 deletions R/adorn_percentages.R
@@ -1,12 +1,11 @@
#' @title Convert a data.frame of counts to percentages.
#' Convert a data.frame of counts to percentages.
#'
#' @description
#' This function defaults to excluding the first column of the input data.frame, assuming that it contains a descriptive variable, but this can be overridden by specifying the columns to adorn in the \code{...} argument.
#' This function defaults to excluding the first column of the input data.frame, assuming that it contains a descriptive variable, but this can be overridden by specifying the columns to adorn in the `...` argument.
#'
#' @param dat a \code{tabyl} or other data.frame with a tabyl-like layout. If given a list of data.frames, this function will apply itself to each data.frame in the list (designed for 3-way \code{tabyl} lists).
#' @param dat a `tabyl` or other data.frame with a tabyl-like layout. If given a list of data.frames, this function will apply itself to each data.frame in the list (designed for 3-way `tabyl` lists).
#' @param denominator the direction to use for calculating percentages. One of "row", "col", or "all".
#' @param na.rm should missing values (including NaN) be omitted from the calculations?
#' @param ... columns to adorn. This takes a tidyselect specification. By default, all numeric columns (besides the initial column, if numeric) are adorned, but this allows you to manually specify which columns should be adorned, for use on a data.frame that does not result from a call to \code{tabyl}.
#' @param ... columns to adorn. This takes a tidyselect specification. By default, all numeric columns (besides the initial column, if numeric) are adorned, but this allows you to manually specify which columns should be adorned, for use on a data.frame that does not result from a call to `tabyl`.
#'
#' @return Returns a data.frame of percentages, expressed as numeric values between 0 and 1.
#' @export
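
To make the three `denominator` choices ("row", "col", "all") concrete, here is a base R sketch using `prop.table()` on a small made-up count matrix. It only illustrates the arithmetic and is not how `adorn_percentages()` is implemented. The counts and dimension names are invented for the example.

```r
# Made-up 2x2 table of counts (rows and columns are arbitrary labels).
counts <- matrix(
  c(3, 12, 8, 2),
  nrow = 2,
  dimnames = list(group = c("a", "b"), level = c("x", "y"))
)

by_row  <- prop.table(counts, margin = 1)  # each row sums to 1
by_col  <- prop.table(counts, margin = 2)  # each column sums to 1
overall <- prop.table(counts)              # all cells together sum to 1

print(by_row)
```

As in the `@return` description, the results are proportions between 0 and 1 rather than values already multiplied by 100.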
12 changes: 6 additions & 6 deletions R/adorn_rounding.R
@@ -1,14 +1,14 @@
#' @title Round the numeric columns in a data.frame.
#' Round the numeric columns in a data.frame.
#'
#' @description
#' Can run on any data.frame with at least one numeric column. This function defaults to excluding the first column of the input data.frame, assuming that it contains a descriptive variable, but this can be overridden by specifying the columns to round in the \code{...} argument.
#' Can run on any data.frame with at least one numeric column. This function defaults to excluding the first column of the input data.frame, assuming that it contains a descriptive variable, but this can be overridden by specifying the columns to round in the `...` argument.
#'
#' If you're formatting percentages, e.g., the result of \code{adorn_percentages()}, use \code{adorn_pct_formatting()} instead. This is a more flexible variant for ad-hoc usage. Compared to \code{adorn_pct_formatting()}, it does not multiply by 100 or pad the numbers with spaces for alignment in the results data.frame. This function retains the class of numeric input columns.
#' If you're formatting percentages, e.g., the result of `adorn_percentages()`, use `adorn_pct_formatting()` instead. This is a more flexible variant for ad-hoc usage. Compared to `adorn_pct_formatting()`, it does not multiply by 100 or pad the numbers with spaces for alignment in the results data.frame. This function retains the class of numeric input columns.
#'
#' @param dat a \code{tabyl} or other data.frame with similar layout. If given a list of data.frames, this function will apply itself to each data.frame in the list (designed for 3-way \code{tabyl} lists).
#' @param dat a `tabyl` or other data.frame with similar layout. If given a list of data.frames, this function will apply itself to each data.frame in the list (designed for 3-way `tabyl` lists).
#' @param digits how many digits should be displayed after the decimal point?
#' @param rounding method to use for rounding - either "half to even", the base R default method, or "half up", where 14.5 rounds up to 15.
#' @param ... columns to adorn. This takes a tidyselect specification. By default, all numeric columns (besides the initial column, if numeric) are adorned, but this allows you to manually specify which columns should be adorned, for use on a data.frame that does not result from a call to \code{tabyl}.
#' @param ... columns to adorn. This takes a tidyselect specification. By default, all numeric columns (besides the initial column, if numeric) are adorned, but this allows you to manually specify which columns should be adorned, for use on a data.frame that does not result from a call to `tabyl`.
#'
#' @return Returns the data.frame with rounded numeric columns.
#' @export
@@ -39,7 +39,7 @@
#'
#' cases %>%
#' adorn_percentages(, , ends_with("ed")) %>%
#' adorn_rounding(, , one_of(c("recovered", "died")))
#' adorn_rounding(, , all_of(c("recovered", "died")))
adorn_rounding <- function(dat, digits = 1, rounding = "half to even", ...) {
# if input is a list, call purrr::map to recursively apply this function to each data.frame
if (is.list(dat) && !is.data.frame(dat)) {
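
The `rounding` parameter documented in this file distinguishes "half to even" (base R's default) from "half up". The base R sketch below shows the difference on 14.5, the value used in the parameter description. `round_half_up_sketch` is a hypothetical name written for this illustration and is not claimed to match janitor's own implementation.

```r
# Hypothetical "half up" rounding: ties move away from zero.
# The small epsilon guards against ties stored slightly below .5 in floating point.
round_half_up_sketch <- function(x, digits = 0) {
  scale <- 10^digits
  sign(x) * floor(abs(x) * scale + 0.5 + sqrt(.Machine$double.eps)) / scale
}

half_even <- round(14.5)                 # base R: ties go to the even neighbour
half_up   <- round_half_up_sketch(14.5)  # ties go up

print(c(half_even = half_even, half_up = half_up))
```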
12 changes: 6 additions & 6 deletions R/adorn_title.R
@@ -1,13 +1,13 @@
#' @title Add column name to the top of a two-way tabyl.
#'
#' @description
#' This function adds the column variable name to the top of a \code{tabyl} for a complete display of information. This makes the tabyl prettier, but renders the data.frame less useful for further manipulation.
#' This function adds the column variable name to the top of a `tabyl` for a complete display of information. This makes the tabyl prettier, but renders the data.frame less useful for further manipulation.
#'
#' @param dat a data.frame of class \code{tabyl} or other data.frame with a tabyl-like layout. If given a list of data.frames, this function will apply itself to each data.frame in the list (designed for 3-way \code{tabyl} lists).
#' @param placement whether the column name should be added to the top of the tabyl in an otherwise-empty row \code{"top"} or appended to the already-present row name variable (\code{"combined"}). The formatting in the \code{"top"} option has the look of base R's \code{table()}; it also wipes out the other column names, making it hard to further use the data.frame besides formatting it for reporting. The \code{"combined"} option is more conservative in this regard.
#' @param row_name (optional) default behavior is to pull the row name from the attributes of the input \code{tabyl} object. If you wish to override that text, or if your input is not a \code{tabyl}, supply a string here.
#' @param col_name (optional) default behavior is to pull the column_name from the attributes of the input \code{tabyl} object. If you wish to override that text, or if your input is not a \code{tabyl}, supply a string here.
#' @return the input tabyl, augmented with the column title. Non-tabyl inputs that are of class \code{tbl_df} are downgraded to basic data.frames so that the title row prints correctly.
#' @param dat a data.frame of class `tabyl` or other data.frame with a tabyl-like layout. If given a list of data.frames, this function will apply itself to each data.frame in the list (designed for 3-way `tabyl` lists).
#' @param placement whether the column name should be added to the top of the tabyl in an otherwise-empty row `"top"` or appended to the already-present row name variable (`"combined"`). The formatting in the `"top"` option has the look of base R's `table()`; it also wipes out the other column names, making it hard to further use the data.frame besides formatting it for reporting. The `"combined"` option is more conservative in this regard.
#' @param row_name (optional) default behavior is to pull the row name from the attributes of the input `tabyl` object. If you wish to override that text, or if your input is not a `tabyl`, supply a string here.
#' @param col_name (optional) default behavior is to pull the column_name from the attributes of the input `tabyl` object. If you wish to override that text, or if your input is not a `tabyl`, supply a string here.
#' @return the input tabyl, augmented with the column title. Non-tabyl inputs that are of class `tbl_df` are downgraded to basic data.frames so that the title row prints correctly.
#'
#' @export
#' @examples