-
Notifications
You must be signed in to change notification settings - Fork 0
/
index.qmd
493 lines (367 loc) · 15.9 KB
/
index.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
---
format:
html:
css: styles.css
toc: false
title: "R and GitHub for Editors"
title-block-banner: images/urban-cover-photo.png
title-block-banner-color: white
execute:
echo: false
code-overflow: wrap
---
<!-- Front Matter -->
<!-- Product Type -->
<product-type-font-style>Toolkit</product-type-font-style>
<!-- Title -->
::: space-subtitle
<index-title-font-style>R and Git for Editors</index-title-font-style>
:::
<!-- Subtitle -->
::: decreased-space-subtitle
<index-subtitle-font-style>A Guide for Editing Technical Digital Content</index-subtitle-font-style>
:::
<!-- Author -->
::: space-author
<author-font-style>Alex Dallman</author-font-style>
:::
<!-- Author Affiliation -->
::: decreased-space-author-affiliation
<author-affiliation-font-style>Urban Institute</author-affiliation-font-style>
:::
<!-- Date -->
::: decreased-space-date
<date-font-style>Last updated July 2024</date-font-style>
:::
<!--Button -->
::: added-space
::: flex-container
<a href="about.qmd"> <button type="button" class="btn btn-primary">About the Project</button></a>
:::
:::
<!-- End Front Matter -->
<!-- Intro -->
::: added-space
# Using This Resource
Although the standard research product at Urban is still made with word-processing software, research products genereated in other programs, like RStudio and its associated packages, are becoming more ubiquitous because of their clear advantages in chart and table generation, citation management, and reproducibility. This guide is for those navigating text, charts, tables, and graphics in RStudio and GitHub. It cuts through the technicalities to give you a practical approach to easily manuever through and edit files, which will streamline the production of technical documents while avoiding ever-too-common copy-and-paste errors.
On this page, you will find an introduction to the various software and programming languages used at Urban to give you brief, relevant background information. For a direct how-to guide on using any of these programs in the context of editing, see the menu to navigate to the page you are looking for.
:::
## What Are R, LaTeX, Quarto, and GitHub?
In research institutions like the Urban Institute, R, LaTeX, Quarto, and GitHub are powerful tools used together or seperately to produce research products efficiency, with reproducibility and presentation in mind. They are all opensource to some degree and R, LaTeX, and Quarto are free to the public at all tiers. Described below is how each piece of the puzzle fits together. Let's start by looking at @fig-flowchart-1.
<!-- Figure -->
::: {#fig-flowchart-1}
::: left-align
<box-title-text>How R, RStudio, RMarkdown, LaTeX, Quarto, and GitHub Work Together</box-title-text>
```{mermaid r-gitub}
graph TD
A([RStudio]) -->|IDE for R| B[R]
A -->|Create Documents| C{RMarkdown}
A -->|Creates HTML, Word, and PDFs with | D{Quarto}
B -->|Used in| C
B -->|Used in| D
C -->|Used in| D
C -->|Generates PDFs with| F{LaTeX}
A -->|Version Control| E([GitHub])
C -->|Shared via| E
D -->|Shared via| E
F -->|Shared via| E
subgraph Tools
A
B
C
D
F
end
subgraph Platforms
E
end
```
::: no-indent
<source-note-text>**Source:** Author's depiction.</source-note-text>
:::
:::
:::
### R, RMarkdown, and RStudio
#### R
**R** is a programming language used for data analysis and visualization. It was created by statisticians to make the pipeline from raw data to clean visualization and analysis possible.
::: urban-list
- **Analyze data.** R helps researchers process and organize datasets, perform statistical analyses, and draw insights from the data.
- **Visualize results.** R can access different types of software or packages to perform different tasks, and one of those packages, ggplot2, allows researchers to create graphs and charts to visualize their findings without having to migrate the data to Excel and build the chart manually.
:::
Here’s an example of R code that loads in data and displays it:
```{r r-example}
#| echo: true
#| code-overflow: wrap
#This is a comment. In this line of code, I'm loading a dataset called mtcars.
data(mtcars)
#This line of code prints the first few rows of the mtcars dataset.
head(mtcars)
```
#### RMarkdown
**RMarkdown** is a tool that mixes regular text with R code. Think of it as a combination of a word processor and a programming tool that makes it easy to create dynamic and reproducible documents in various formats (PDF, HTML, or Word documents).
Below is an example of RMarkdown. Don't get caught up in exactly what it means, but do recognize the general syntax pattern.
```{markdown example-2}
#| echo: true
#| code-overflow: wrap
---
title: "RMarkdown Example"
author: "Your Name"
date: 7/23/2024
---
# Introduction That Will Print in Heading 1 Style
This is a simple example of an RMarkdown document.
## Head of mtcars That Will Print in Heading 2 Style
data(mtcars)
head(mtcars)
```
#### RStudio
**RStudio** is the downloadable software environment used to write code in R and RMarkdown. In RStudio, you can write in R, run code, and view plots all in one place. Think of it as the structure that allows you to use both R and RMarkdown in user-friendly way. See @fig-Rstudio below to get a sense of what working in RStudio looks like.
<!-- Figure -->
::: {#fig-Rstudio}
::: no-indent
<box-title-text>RStudio on Windows Desktop</box-title-text>
![](images/r-studio.png){fig-align="left"}
<source-note-text>**Source:** Author's screenshot.</source-note-text>
:::
:::
<!-- Box -->
::: space-before-box
::: urban-box-borders
::: {#box-about}
::: no-indent
<box-title-text>How R, RMarkdown, and RStudio Work Together</box-title-text>
:::
<p>When combined, these tools provide a powerful ecosystem for data analysis, visualization, and reporting.</p>
::: urban-ordered-list
1. **R** is the engine that allows you to write and execute code for data analysis.
2. **RStudio** is the environment where you develop and manage your R projects in an organized manner.
3. **RMarkdown** is the format you use to create dynamic documents that include both text and R code, making it easy to share your findings in various formats.
:::
<!-- ::: urban-list
- Here's a bulleted list.
- It's in Urban style.
- This is a secondary bullet.
- This is also.
:::
::: urban-ordered-list
1. ordered list
2. item 2
i) sub-item 1
A. sub-sub-item 1
:::
::: no-indent
<source-note-text>**Source:** This is a source.</source-note-text>
::: -->
<!-- ::: no-indent
::: spacing-source-note
<source-note-text>**Note:** This is a note.</source-note-text>
:::
::: -->
:::
:::
:::
For more information about R, visit Urban's Intro to R resource page [here](https://urbaninstitute.github.io/r-at-urban/intro-to-r.html).
<!-- End Box -->
### LaTeX
LaTeX is a typesetting program using plain text with embedded formatting commands to produce high-quality products, usually in PDF format. It’s useful for scientific and technical writing because it handles complex formatting, like mathematical equations, bibliographies, and citations, more easily than traditional word-processing software.
#### LaTeX Vs. Word-Processing Software
Instead of clicking buttons to format your document like in word-processing software, you write plain text mixed with special commands that tell LaTeX how to format the text. This reduces formatting, citation, and numbering errors common while using word-processing software, although word-processing software has a much smalller learning curve than LaTeX.
#### LaTeX Basics
::: urban-ordered-list
1. Open RStudio and install a LaTeX distribution.
2. Write the Document.
i) Content is typically in plain text in a .tex or .rmd file and includes commands to format text. See the example below:
```{latex latex-example}
#| echo: true
#| code-overflow: wrap
\documentclass{article}
\begin{document}
\title{My LaTeX Document}
\author{John Doe}
\maketitle
\section{Introduction}
This is a LaTeX document. I can write text and include formulas like \( E = mc^2 \).
\end{document}
```
3. Run a LaTeX compiler program that reads the .tex and .rmd files and processes them to produce a formatted PDF. The output is a professional-looking document in which all elements are perfectly placed, consistent, and properly formatted.
i) In this case, it will be the "knitr" compiler in RStudio.
:::
#### Resources
For more information about LaTeX, visit the LaTeX site [here](https://www.latex-project.org/). Currently, [Overleaf](https://www.overleaf.com) is the most popular choice for editing LaTeX documents because it allows you to apply track changes easily in various files.
### Quarto
Quarto is a tool that helps people create professional documents, presentations, websites, and more. It combines text, data, code, and graphics into a single, polished output.
::: urban-ordered-list
1. Writing Content
i) Quarto uses Markdown, and RMarkdown in RStudio.
2. Code and Data
i) Code snippets from languages like R, Python, or Julia can allow users to embed data analysis directly within your document.
3. Rendering
i) Quarto runs the code, captures the outputs (like graphs and tables), and formats everything according to your specifications, producing PDFs, HTML files, Word documents, PowerPoint presentations, and even websites.
:::
#### Quarto vs. LaTeX
Quarto and LaTeX are similar, but there are key differences. See @tbl-example for more information.
Feature| Quarto | LaTeX |
|------|--------|-------|
| **Ease of use**| Easier, uses Markdown | More difficult, uses LaTeX |
| **Integration**| Combines text, code, and visualizations | Focused on text and formatting |
| **Output formats**| PDF, HTML, Word, PowerPoint | Primarily PDF |
| **Code embedding**| Supports embedding R, Python, Julia code | Requires external tools for code integration |
| **Collaboration**| Integrates with GitHub for version control | Can be used with GitHub, but less integrated |
| **Flexibility**| Great for interactive and static documents | Best for static documents |
Table: <span class="table-title">Quarto vs. LaTeX</span>{#tbl-example}
::: left-align
::: decreased-top-margin-table-note
<source-note-text>**Source:** Author's analysis.</source-note-text>
:::
:::
To understand more about Quarto, visit their website [here](https://www.quarto.org).
### GitHub
GitHub is a web-based platform used for collaborative software developmenta and version-control system that tracks changes in files and coordinates work on those files among multiple people.
#### GitHub Basics
Researchers and data scientists use GitHub throughout the lifecycle of a project to house all the various R, RMarkdown, LaTeX, and other associated files. It's like Box but for more code-based work (but any type of file can be uploaded to GitHub).
:::urban-list
- **Repository (Repo).** A repository is like a project's folder on GitHub. It contains all the project files and the history of changes to those files.
- **Commits**. A commit is like a snapshot of a project, representing a set of changes made to the files in a repository.
- **Branches**. Branches allow users to work on different parts of a project separately. A branch can develop a new feature while the main branch contains the stable version of the project.
:::
See @fig-git-flow below.
<!-- Figure -->
::: {#fig-git-flow}
::: left-align
<box-title-text>Example GitHub Repository and Branching Structure</box-title-text>
```{mermaid github-ex}
%%{init: { 'logLevel': 'debug', 'theme': 'default' , 'themeVariables': {
'git0': '#1696d2',
'git1': '#fdbf11',
'git2': '#000000',
'git3': '#d2d2d2',
'git4': '#ec008b',
'git5': '#55b748',
'git6': '#5c5859',
'git7': '#db2b27'
} } }%%
gitGraph
commit
commit
branch develop
checkout develop
commit
commit
checkout main
merge develop
commit
commit
```
::: no-indent
<source-note-text>**Source:** "Gitgraph Diagrams," Mermaid, accessed July 25, 2024, [https://mermaid.js.org/syntax/gitgraph.html](https://mermaid.js.org/syntax/gitgraph.html).</source-note-text>
:::
:::
:::
## Summary
The technologies metioned above are interconnected in the world of research, and having base knowledge of how these programs work can help editors better understand researchers' workflow. To summarize all these concepts, let's go through a real-world example of how these programs relate to one another.
::: urban-ordered-list
1. Write R code in RStudio to analyze your data.
2. Document your analysis using RMarkdown, incorporating text and code.
3. Format the document using LaTeX for any complex equations or detailed formatting.
4. Track changes and collaborate with team members using GitHub, ensuring everyone works on the most recent version.
5. Compile and publish your final report using Quarto, producing a document ready for publication.
:::
On subsequent pages, you can learn more about how to edit files in different file formats in various programs.
<!-- Box -->
<!-- ::: space-before-box
::: urban-box-borders
::: {#box-about}
::: no-indent
<box-title-text>Example Box</box-title-text>
:::
<p>This is just regular text in a box.</p>
::: urban-list
- Here's a bulleted list.
- It's in Urban style.
- This is a secondary bullet.
- This is also.
:::
::: no-indent
<source-note-text>**Source:** This is a source.</source-note-text>
:::
::: no-indent
::: spacing-source-note
<source-note-text>**Note:** This is a note.</source-note-text>
:::
:::
:::
:::
::: -->
<!-- End Box -->
<!-- Pull Quote -->
<!-- ::: pull-quote-borders
::: no-indent
<pull-quote-text>Here's a pull quote. Someone said something very important.<br> ---Author</pull-quote-text>
:::
::: -->
<!-- End Pull Quote -->
<!-- Tables -->
<!-- ### Tables
See @tbl-state-population.
```{r table-1-setup, include=FALSE}
library(tidycensus)
library(tidyverse)
library(dplyr)
library(knitr)
library(kableExtra)
library(scales)
source("config.R")
census_api_key(CENSUS_API_KEY, overwrite = TRUE, install = TRUE)
census_data_example <- get_decennial(
geography = "state",
variables = c(
total_population = "P1_001N",
hispanic_or_latino = "P2_002N",
not_hispanic_or_latino = "P2_003N",
white_alone = "P2_005N",
black_or_african_american_alone = "P2_006N",
american_indian_and_alaska_native_alone = "P2_007N",
asian_alone = "P2_008N",
native_hawaiian_and_other_pacific_islander_alone = "P2_009N",
some_other_race_alone = "P2_010N",
two_or_more_races = "P2_011N"
),
year = 2020
)
print(census_data_example)
census_df_ethnicity <- census_data_example %>%
select(NAME, variable, value) %>%
spread(key = variable, value = value) %>%
rename(
"State" = NAME,
"Total Population" = total_population,
"Hispanic or Latino" = hispanic_or_latino,
"Not Hispanic or Latino" = not_hispanic_or_latino,
"White" = white_alone,
"Black or African American" = black_or_african_american_alone,
"American Indian and Alaska Native" = american_indian_and_alaska_native_alone,
"Asian" = asian_alone,
"Native Hawaiian and Other Pacific Islander" = native_hawaiian_and_other_pacific_islander_alone,
"Other Race/Ethnicity" = some_other_race_alone,
"Two or More Races/Ethnitcites" = two_or_more_races
) %>%
select("State", "Total Population", "Hispanic or Latino", "Not Hispanic or Latino","White", "Black or African American", "American Indian and Alaska Native", "Asian", "Native Hawaiian and Other Pacific Islander", "Other Race/Ethnicity", "Two or More Races/Ethnitcites")
census_df_ethnicity <- census_df_ethnicity %>%
mutate(across(where(is.numeric), comma))
```
:::: {.tbl-cap #tbl-state-population}
::: no-indent
<box-title-text>Example Table</box-title-text>
:::
```{r tbl-state-population}
kable(census_df_ethnicity)
```
::::
::: no-indent
<source-note-text>**Source:** This is a source.</source-note-text>
:::
::: no-indent
::: spacing-source-note
<source-note-text>**Note:** This is a note.</source-note-text>
:::
::: -->