1-Tidyverse.Rmd
+27 -12
Original file line number
Diff line number
Diff line change
@@ -15,14 +15,29 @@ library(lubridate)
15
15
examples <- readLines('Examples.R')
16
16
```
17
17
18
+
# Table of Contents
18
19
19
-
# Tibbles
20
+
1. [Tibbles](#tibbles)
21
+
22
+
1.1 [Why tibbles?](#why)
23
+
24
+
1.2 [Working with tibbles](#working)
25
+
26
+
1.3 [Examples and exercises](#eeTibbles)
27
+
28
+
2. [Importing Data](#import)
29
+
30
+
2.1 [Comments and metadata](#skip)
31
+
32
+
2.2 [Examples and exercises](#eeImport)
33
+
34
+
# <a name="tibbles"></a>Tibbles
20
35
21
36
In the tidyverse, the objects commonly returned are tibbles rather than data.frames. A tibble can be created with either the `tibble()` or `data_frame()` function (`data_frame()` is deprecated in current tibble releases, so prefer `tibble()`).
22
37
23
38
What is a tibble?
24
39
25
-
- modern way of loooking at the traditional data.frame
40
+
- modern way of looking at the traditional data.frame
26
41
- printing a tibble gives you more useful information than printing a data.frame
27
42
- tibbles come from the tibble package, which is part of the core tidyverse
28
43
@@ -46,9 +61,9 @@ as_tibble(df)
46
61
```
47
62
48
63
49
-
## Why Tibbles?
64
+
## <a name="why"></a>Why Tibbles?
50
65
51
-
- `tibble()` doesnt change the inputs (i.e. it doesn't convert strings to factors).
66
+
- `tibble()` doesn't change the inputs (i.e. it doesn't convert strings to factors).
52
67
53
68
```{r}
54
69
data.frame(x = letters[1:5]) %>%
@@ -95,7 +110,7 @@ as.data.frame(who) # try printing as a data.frame (output not shown here)
95
110
96
111
Why not use a tibble? There are a few packages that don't get along with tibbles (e.g. the missForest package). In this case, you may need to convert your tibble into a data.frame using `as.data.frame()`.
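
As a minimal sketch of that conversion (the toy tibble `tb` below is made up for illustration and is not part of the course data), `as.data.frame()` drops the tibble classes and returns a plain data.frame:

```r
library(tibble)

# Hypothetical toy tibble, used only for illustration
tb <- tibble(id = 1:3, label = c("a", "b", "c"))

# Convert to a base data.frame for packages that do not accept tibbles
df <- as.data.frame(tb)

class(tb)  # includes "tbl_df"
class(df)  # "data.frame"
```

The column contents are unchanged by the conversion; only the class, and therefore the printing and subsetting behaviour, differs.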
97
112
98
-
## Working with tibbles
113
+
## <a name="working"></a>Working with tibbles
99
114
100
115
Here is a more complicated tibble, consisting of a random start time within +/- 12 hours of now and a random end time within the next 30 days (where "now" is relative to when this code is run).
101
116
@@ -179,13 +194,13 @@ options(tmp)
179
194
180
195
You can also use the `tibble.width = Inf` option to print all columns. There are more options documented at `package?tibble`.
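
A short sketch of setting that option temporarily, assuming the tibble package is installed; the wide tibble `wide` is a made-up example. `options()` returns the previous values, so they can be restored afterwards:

```r
library(tibble)

# Hypothetical wide tibble: 30 integer columns (V1..V30), 3 rows
wide <- as_tibble(as.data.frame(matrix(1:90, nrow = 3)))

old <- options(tibble.width = Inf)  # print every column, however wide
print(wide)
options(old)                        # restore the previous setting
```

Restoring the saved options keeps the change from leaking into the rest of the session.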
181
196
182
-
## Examples and Exercises
197
+
## <a name="eeTibbles"></a>Examples and Exercises
183
198
184
199
For more examples, see line `r which(examples == "########## tibble Examples ##########")` of [Examples.R](https://github.com/ravichas/TidyingData/blob/master/Examples.R).
185
200
186
201
Practice exercises for this section can be found in [Exercises.Rmd](https://github.com/ravichas/TidyingData/blob/master/Exercises.md#tibbleEx).
187
202
188
-
# Importing Data
203
+
# <a name="import"></a>Importing Data
189
204
190
205
RStudio has a nice data import utility under File > Import Dataset. It also generates the code needed to repeat the import, so you can save that code in your script.
191
206
@@ -196,8 +211,8 @@ If you are comfortable with writing the code directly, the following functions w
196
211
- `?read_csv`: import comma separated values data
197
212
- `?read_csv2`: import semicolon separated values data (European version of a csv)
198
213
- `?read_tsv`: import tab delimited data
199
-
- `?read_delim`: import a text file with data (e.g. space delimted)
200
-
- `?read_excel`: import Excel formated data (either xls or xlsx format)
214
+
- `?read_delim`: import a text file with data (e.g. space delimited)
215
+
- `?read_excel`: import Excel formatted data (either xls or xlsx format)
201
216
202
217
If you are familiar with R, you may recognize that the utils package has counterparts that generate a data.frame (e.g. `read.csv()` and `read.delim()`). Why would we want to use the functions from the readr package over the base R functions?
203
218
@@ -215,9 +230,9 @@ require(readr)
215
230
read_csv('Data/WHO-2a.csv')
216
231
```
217
232
218
-
## Comments/Metadata
233
+
## <a name="skip"></a>Comments/Metadata
219
234
220
-
Sometimes, there will be extra metadata at the top of a file, often preceded with '#'. How do we read a dataset that has some metadata (indicated by '#')? What if the extra lines aren't properly marked with '#'?
235
+
Sometimes, there will be extra metadata at the top of a file, often preceded with '#'. How do we read a data set that has some metadata (indicated by '#')? What if the extra lines aren't properly marked with '#'?
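
One possible approach to both questions, sketched with inline text instead of a real file. The strings `marked` and `unmarked` are hypothetical stand-ins for file contents, and the use of `I()` and `show_col_types` assumes readr 2.0 or later. The `comment` argument drops lines that start with the given marker, while `skip` drops a fixed number of leading lines regardless of content:

```r
library(readr)

# Hypothetical contents: two '#'-marked metadata lines, then the data
marked <- "# source: example\n# units: cm\nx,y\n1,2\n3,4\n"
d1 <- read_csv(I(marked), comment = "#", show_col_types = FALSE)

# Hypothetical contents where the two metadata lines are NOT marked with '#'
unmarked <- "source: example\nunits: cm\nx,y\n1,2\n3,4\n"
d2 <- read_csv(I(unmarked), skip = 2, show_col_types = FALSE)

d1
d2
```

With `skip`, the number of metadata lines has to be known in advance, so it is worth inspecting the top of the file (for example with `readLines()`) before importing.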