feat: verify required variables in dataset #45

Merged Mar 15, 2024 (19 commits)
Changes from 13 commits
Commits
c6eaf3f
chore: `use_data_raw("variable-description")` to process and create a…
lwjohnst86 Feb 26, 2024
eee3ff5
feat: internal function to verify if the dataset has required variables.
lwjohnst86 Feb 26, 2024
6fb1083
build: suppress CRAN note by assigning "data mask" .data pronoun as a…
lwjohnst86 Feb 26, 2024
f40d0ad
build: add dependencies to DESCRIPTION list
lwjohnst86 Feb 26, 2024
9bb3b9c
tests: add unit test for verify function
lwjohnst86 Feb 26, 2024
01fcbea
build: ignore data-raw folder during build process
lwjohnst86 Feb 26, 2024
8286e57
feat: function to get list of the registers abbreviations
lwjohnst86 Feb 26, 2024
09822e7
fix: use the created function for getting the abbrev
lwjohnst86 Feb 26, 2024
5c482ba
Merge commit 'c57eda10a7fa7ee7301f55c68061e7631d02b219'
lwjohnst86 Mar 8, 2024
b29f1c0
Merge branch 'main' of https://github.com/steno-aarhus/osdc into feat…
lwjohnst86 Mar 15, 2024
e6366e7
Merge branch 'variable-description' of https://github.com/steno-aarhu…
lwjohnst86 Mar 15, 2024
dba6eb9
docs: Apply suggestions from code review
lwjohnst86 Mar 15, 2024
e5c00a1
Merge branch 'main' into feat/check-variables
lwjohnst86 Mar 15, 2024
3253cfa
Merge branch 'main' of https://github.com/steno-aarhus/osdc into feat…
lwjohnst86 Mar 15, 2024
1caf049
Merge branch 'main' of https://github.com/steno-aarhus/osdc into feat…
lwjohnst86 Mar 15, 2024
d24c203
Merge branch 'feat/check-variables' of https://github.com/steno-aarhu…
lwjohnst86 Mar 15, 2024
7e57d5e
test: add tests to check against different data formats
lwjohnst86 Mar 15, 2024
3f4e5f1
fix: DuckDB requires `colnames()`, not `names()`
lwjohnst86 Mar 15, 2024
445e34f
Merge branch 'main' into feat/check-variables
lwjohnst86 Mar 15, 2024
1 change: 1 addition & 0 deletions .Rbuildignore
@@ -2,6 +2,7 @@
^\.Rproj\.user$
^LICENSE\.md$
^\.github$
^data-raw$
^dev$
^CODE_OF_CONDUCT\.md$
^_pkgdown\.yml$
11 changes: 8 additions & 3 deletions DESCRIPTION
@@ -23,20 +23,25 @@ LazyData: true
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.3.1
Imports:
checkmate,
data.table,
dplyr,
fs,
haven,
here,
lifecycle,
lubridate,
purrr
purrr,
rlang
Suggests:
furrr,
DiagrammeR,
knitr,
rmarkdown,
spelling,
testthat (>= 3.0.0)
testthat (>= 3.0.0),
spelling
Depends:
R (>= 2.10)
VignetteBuilder: knitr
Language: en-US
Config/testthat/edition: 3
11 changes: 11 additions & 0 deletions R/get-variables.R
@@ -0,0 +1,11 @@

#' Get a list of the registers' abbreviations.
#'
#' @return A character vector.
#' @export
#'
#' @examples
#' get_register_abbrev()
get_register_abbrev <- function() {
  unique(required_variables$register_abbrev)
}
5 changes: 5 additions & 0 deletions R/osdc-package.R
@@ -5,3 +5,8 @@
#' @importFrom lifecycle deprecated
## usethis namespace: end
NULL

# Allows for using tidyverse functionality without triggering CRAN NOTES,
# since CRAN doesn't know that packages like dplyr use NSE.
# For more details, see https://rlang.r-lib.org/reference/dot-data.html#where-does-data-live
utils::globalVariables(".data")
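For context on the `.data` pronoun declared above, here is a minimal sketch of how `.data$column` separates a data-frame column from a variable in the calling environment. It assumes a made-up table, `example_required`, which stands in for the package's internal `required_variables`. The rlang page linked in the comment also describes `@importFrom rlang .data` as an equivalent way to avoid the check note.

```r
library(dplyr)

# Hypothetical stand-in for the package's internal `required_variables` table.
example_required <- tibble::tibble(
  register_abbrev = c("bef", "bef", "lmdb"),
  variable_name = c("pnr", "koen", "pnr")
)

# `register` lives in the calling environment, while `.data$register_abbrev`
# refers unambiguously to the column of the data frame being filtered.
register <- "bef"
bef_variables <- example_required |>
  filter(.data$register_abbrev == register) |>
  pull("variable_name")

print(bef_variables)
```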
Binary file added R/sysdata.rda
Binary file not shown.
27 changes: 27 additions & 0 deletions R/verify-variables.R
@@ -0,0 +1,27 @@
#' Verify that the dataset has the required variables for the algorithm.
#'
#' @param data The dataset to check.
#' @param register The abbreviation of the register name. See list of
#' abbreviations in [get_register_abbrev()].
#'
#' @return Either TRUE if the verification passes, or a character string if
#' there is an error.
#' @keywords internal
#'
#' @examples
#' library(tibble)
#' library(dplyr)
#' # TODO: Replace with simulated data.
#' example_bef_data <- tibble(pnr = 1, koen = 1, foed_dato = 1)
#' verify_required_variables(example_bef_data, "bef")
verify_required_variables <- function(data, register) {
  checkmate::assert_choice(register, get_register_abbrev())
  expected_variables <- required_variables |>
    dplyr::filter(.data$register_abbrev == register) |>
    dplyr::pull(.data$variable_name)
  actual_variables <- names(data)
  checkmate::check_names(
    x = actual_variables,
    must.include = expected_variables
  )
}
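The documented return value ("either TRUE ... or a character string") follows from how `checkmate::check_names()` reports results. A minimal sketch, assuming the required variables for `"bef"` are the three names used in this PR's examples rather than the real contents of `required_variables`:

```r
library(checkmate)

# Assumed required variables for the "bef" register, taken from the
# examples in this PR rather than from the real `required_variables` data.
expected_variables <- c("pnr", "koen", "foed_dato")

# All required names are present (extra names are allowed): returns TRUE.
complete <- check_names(
  x = c("pnr", "koen", "foed_dato", "something"),
  must.include = expected_variables
)

# A required name is missing: returns a message string instead of
# signalling an error.
incomplete <- check_names(
  x = c("pnr", "koen"),
  must.include = expected_variables
)

print(isTRUE(complete))
print(is.character(incomplete))
```

Because the `check_*` variant returns a string rather than stopping, callers that want a hard failure could pass the same arguments to `checkmate::assert_names()` instead.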
8 changes: 8 additions & 0 deletions data-raw/variable-description.R
@@ -0,0 +1,8 @@
## code to prepare `variable-description` dataset goes here

library(tidyverse)

required_variables <- read_csv(here::here("data-raw/variable_description.csv")) |>
  select(register_abbrev = raw_register_filename, variable_name)

usethis::use_data(required_variables, overwrite = TRUE, internal = TRUE)
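As a rough illustration of what the `select()` step above produces, the sketch below runs the same rename on a few invented rows. The CSV contents, including the `description` column, are hypothetical, since the real `data-raw/variable_description.csv` is not shown in this PR. Base `read.csv()` is used here in place of `read_csv()` so that the example needs only dplyr.

```r
library(dplyr)

# Hypothetical rows; the real CSV file is not part of this diff.
csv_text <- "raw_register_filename,variable_name,description
bef,pnr,Personal identifier
bef,koen,Sex
lmdb,pnr,Personal identifier"

variable_description <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Keep only the two columns the verifier needs, renaming the first one.
required_variables_example <- variable_description |>
  select(register_abbrev = raw_register_filename, variable_name)

print(names(required_variables_example))
print(unique(required_variables_example$register_abbrev))
```

With `internal = TRUE`, `usethis::use_data()` writes the object to `R/sysdata.rda`, which matches the binary file added in this PR.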
26 changes: 26 additions & 0 deletions tests/testthat/test-verify-variables.R
@@ -0,0 +1,26 @@
library(tibble)

test_that("the correct abbreviation for the register is used", {
  bef_complete <- tibble(pnr = 1, koen = 1, foed_dato = 1)

  # When incorrect register abbreviation is given
  expect_error(verify_required_variables(bef_complete, "bef1"))
  # When correct abbreviation is given
  expect_true(verify_required_variables(bef_complete, "bef"))
})

test_that("the required variables are present in the dataset", {
  # Expected
  bef_complete <- tibble(pnr = 1, koen = 1, foed_dato = 1)
  bef_complete_extra <- tibble(pnr = 1, koen = 1, foed_dato = 1, something = 1)
  bef_incomplete <- tibble(pnr = 1, koen = 1)

  # When the dataset has exactly the required variables
  expect_true(verify_required_variables(bef_complete, "bef"))

  # When the dataset has extra variables besides the required ones
  expect_true(verify_required_variables(bef_complete_extra, "bef"))

  # When a required variable is missing, the output is a character string
  # (`expect_character()` comes from checkmate, not testthat)
  checkmate::expect_character(verify_required_variables(bef_incomplete, "bef"))
})
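The commit list mentions tests against different data formats and a DuckDB-specific `colnames()` fix, neither of which appears in the diff shown here. The sketch below covers only in-memory formats and does not touch DuckDB. It uses a hypothetical simplified helper, `verify_names_sketch()`, that takes the expected names directly instead of looking them up in the package's internal table.

```r
library(checkmate)

# Hypothetical simplified verifier, not the package's function.
verify_names_sketch <- function(data, expected_variables) {
  check_names(names(data), must.include = expected_variables)
}

expected <- c("pnr", "koen", "foed_dato")

# The same columns held in three common in-memory formats.
df_base <- data.frame(pnr = 1, koen = 1, foed_dato = 1)
df_tibble <- tibble::tibble(pnr = 1, koen = 1, foed_dato = 1)
df_data_table <- data.table::data.table(pnr = 1, koen = 1, foed_dato = 1)

# `names()` works on all three, so each check should pass.
results <- vapply(
  list(df_base, df_tibble, df_data_table),
  function(data) isTRUE(verify_names_sketch(data, expected)),
  logical(1)
)
print(results)
```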