This package offers a tidy solution for epidemiological data. It houses a range of functions for epidemiologists and public health data wizards for data management and cleaning. For more details on how to use this package, visit the epiCleanr website.
The package is available on CRAN and can be installed in the following way:
install.packages("epiCleanr")
library("epiCleanr")
Or install the development version from GitHub:
# If you haven't installed the 'devtools' package, run:
# install.packages("devtools")
devtools::install_github("truenomad/epiCleanr")
Load the package:
library(epiCleanr)
epiCleanr can be used as a helper package for end-to-end epidemiological data management, offering functionality that ranges from data import and quality assessment to cleaning and exporting files. Below are some of the workflow steps this package streamlines:
Utilise import() to seamlessly read data from a wide array of file formats, from CSV to Excel to JSON, all within one function.
- consistency_check(): Generate plots to identify inconsistencies, such as when the number of cases exceeds the number of tests.
- missing_plot(): Visualize patterns of missing data or reporting rates across different variables and factors.
- create_test(): Establish unit-testing functions to automate data validation, ensuring the robustness of your dataset.
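To make the idea behind these quality checks concrete, here is a minimal base R sketch of the underlying logic. It does not call epiCleanr and is not the package's implementation; the `surveillance` data frame and its column names are hypothetical and exist only for illustration.

```r
# Hypothetical surveillance counts (illustrative data only).
surveillance <- data.frame(
  district = c("A", "B", "C", "D"),
  tests    = c(120, 80, 45, 60),
  cases    = c(100, 95, 45, NA)
)

# Flag rows where both counts are present and cases exceed tests.
surveillance$inconsistent <- !is.na(surveillance$tests) &
  !is.na(surveillance$cases) &
  surveillance$cases > surveillance$tests

# Share of missing values per count column (a rough stand-in for
# one minus the reporting rate).
missing_share <- colMeans(is.na(surveillance[c("tests", "cases")]))

print(surveillance)
print(missing_share)
```

The package functions present this kind of information as plots; the sketch only shows the row-level comparison and the missingness summary that such plots would be built on.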
- clean_admin_names(): Normalize administrative names in your dataset using either user-supplied data or downloaded reference data via get_admin_names().
- cleaning_names_strings(): Use this function to clean and standardize string columns in your data.
- handle_outliers(): Detect and manage outliers using a variety of statistical methods, providing you with options to either remove or impute them.
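As an illustration of what "detect, then remove or impute" can mean in practice, the following base R sketch applies the common 1.5 × IQR rule. It is one possible method chosen for illustration, not necessarily what handle_outliers() does internally, and `weekly_cases` is hypothetical data.

```r
# Hypothetical weekly case counts with one implausible spike.
weekly_cases <- c(12, 15, 14, 13, 16, 15, 14, 120)

# Compute the 1.5 * IQR fences.
q <- quantile(weekly_cases, probs = c(0.25, 0.75), names = FALSE)
iqr <- q[2] - q[1]
lower <- q[1] - 1.5 * iqr
upper <- q[2] + 1.5 * iqr

# Flag values outside the fences.
is_outlier <- weekly_cases < lower | weekly_cases > upper

# Option 1: drop the flagged values.
removed <- weekly_cases[!is_outlier]

# Option 2: replace flagged values with the median of the remaining values.
imputed <- replace(weekly_cases, is_outlier, median(weekly_cases[!is_outlier]))

print(which(is_outlier))
print(imputed)
```

Removal shortens the series, while imputation keeps its length, which matters when the rows must stay aligned with dates or administrative units.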
Finally, use export() to save your cleaned data back into multiple file formats, be it CSV, Excel, or other specialized formats.