diff --git a/episodes/read-cases.Rmd b/episodes/read-cases.Rmd deleted file mode 100644 index b467ceb4..00000000 --- a/episodes/read-cases.Rmd +++ /dev/null @@ -1,106 +0,0 @@ ---- -title: 'Read case data' -teaching: 10 -exercises: 2 ---- - -:::::::::::::::::::::::::::::::::::::: questions - -- how many different data formats can I read? -- is it possible to import data from database and health APIs? -:::::::::::::::::::::::::::::::::::::::::::::::: - -::::::::::::::::::::::::::::::::::::: objectives - -- Explain how read/import outbreak data from different sources into `R` -environment for analysis. -:::::::::::::::::::::::::::::::::::::::::::::::: -::::::::::::::::::::::::::::::::::::: prereq - -## Prerequisites - -This episode requires you to be familiar with: - -**Data science** : Basic programming with R. -::::::::::::::::::::::::::::::::: - -# Introduction - -The initial step in outbreak analysis involves importing the target dataset into the `R` environment from various sources. Outbreak data is typically stored in files of diverse formats, within relational database management systems (RDBMS), and health information system APIs (HIS). The latter two options are particularly well-suited for storing institutional data. This episode will elucidate the process of reading cases from these sources. - -## Reading from files - -Several packages are available for importing outbreak data stored in individual files into `R`. These include [rio](http://gesistsa.github.io/rio/), [readr](https://readr.tidyverse.org/) from the `tidyverse`, [io](https://bitbucket.org/djhshih/io/src/master/), and [ImportExport](https://cran.r-project.org/web/packages/ImportExport/index.html). Together, these packages offer methods to read single or multiple files in a wide range of formats. - -The below example shows how to import a `csv` file into `R` environment using `rio` package. -```{r,warning=FALSE,message=FALSE} -library("rio") -case_data = rio::import("./data/ebola_cases.csv", format = "csv") -``` -Similarly, you can import files of other formats such as "tsv", "xlsx", ...etc. - -::::::::::::::::::::::::::::::::: challenge - -### Reading compressed data -Take 1 minute: - -- Is it possible to read compressed data in `R`? - -::::::::::::::::: hint - -You can check the supported file formats in the `{rio}` package as follows: -```{r, eval=FALSE} -library("rio") -rio::install_formats() -``` - -:::::::::::::::::::::: - -::::::::::::::::: solution - -```{r,eval=FALSE} -library("rio") -rio::import("/path_name/file_name.zip") -``` - -:::::::::::::::::::::::::: - -::::::::::::::::::::::::::::::::::::::::::: - - -## Reading from databases - -The [DBI](https://dbi.r-dbi.org/) package serves as a versatile interface for interacting with database management systems (DBMS) across different back-ends or servers. It offers a uniform method for accessing and retrieving data from various database systems. - - -The following code chunk demonstrates how to create a temporary SQLite database in memory, store the `case_data` dataframe as a table within it, and subsequently read from it: - -```{r,warning=FALSE,message=FALSE} -library(DBI) -library(RSQLite) -# Create a temporary SQLite database in memory -db_con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:") -# Store the 'case_data' dataframe as a table named 'cases' in the SQLite database -DBI::dbWriteTable(db_con, "cases", case_data) -# Read data from the 'cases' table -result <- DBI::dbReadTable(db_con, "cases") -# Close the database connection -DBI::dbDisconnect(db_con) -# View the result -base::print(utils::head(result)) -``` - -This code first establishes a connection to an SQLite database created in memory using `dbConnect` function. Then, it writes the `case_data` dataframe into a table named 'cases' within the database using `dbWriteTable` function. Subsequently, it reads the data from the 'cases' table using `dbReadTable` function. Finally, it closes the database connection with `dbDisconnect` function. - -More examples about SQL databases and R can be found [here](https://datacarpentry.org/R-ecology-lesson/05-r-and-databases.html). - -## Reading from HIS APIs - -Coming soon with {readepi} - -::::::::::::::::::::::::::::::::::::: keypoints -- Use `{rio}, {io}, {readr}` and `{ImportExport}` to read data from individual files. -- Use `{DBI}` to read data from databases. -- Use `{readepi}` to read data form HIS APIs. -:::::::::::::::::::::::::::::::::::::::::::::::: -