-
Notifications
You must be signed in to change notification settings - Fork 0
/
RR.html
444 lines (441 loc) · 36.5 KB
/
RR.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta http-equiv="Content-Style-Type" content="text/css" />
<meta name="generator" content="pandoc" />
<meta name="author" content="Toni Verbeiren" />
<title>A Shot at Reproducible Data Analysis</title>
<style type="text/css">code{white-space: pre;}</style>
<style type="text/css">
table.sourceCode, tr.sourceCode, td.lineNumbers, td.sourceCode {
margin: 0; padding: 0; vertical-align: baseline; border: none; }
table.sourceCode { width: 100%; line-height: 100%; background-color: #f8f8f8; }
td.lineNumbers { text-align: right; padding-right: 4px; padding-left: 4px; color: #aaaaaa; border-right: 1px solid #aaaaaa; }
td.sourceCode { padding-left: 5px; }
pre, code { background-color: #f8f8f8; }
code > span.kw { color: #204a87; font-weight: bold; }
code > span.dt { color: #204a87; }
code > span.dv { color: #0000cf; }
code > span.bn { color: #0000cf; }
code > span.fl { color: #0000cf; }
code > span.ch { color: #4e9a06; }
code > span.st { color: #4e9a06; }
code > span.co { color: #8f5902; font-style: italic; }
code > span.ot { color: #8f5902; }
code > span.al { color: #ef2929; }
code > span.fu { color: #000000; }
code > span.er { font-weight: bold; }
</style>
<link rel="stylesheet" href="template-html/template.css" />
<script src="http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML" type="text/javascript"></script>
</head>
<body>
<div class="navbar navbar-static-top">
<div class="navbar-inner">
<div class="container">
<span class="doc-title">A Shot at Reproducible Data Analysis</span>
<ul class="nav pull-right doc-info">
<li><p class="navbar-text">Toni Verbeiren</p></li>
<li><p class="navbar-text">13/3/2015</p></li>
</ul>
</div>
</div>
</div>
<div class="container">
<div class="row">
<div id="TOC" class="span3">
<div class="well toc">
<ul>
<li class="nav-header">Table of Contents</li>
</ul>
<ul>
<li><a href="#introduction">Introduction</a></li>
<li><a href="#idea">Idea</a></li>
<li><a href="#some-examples">Some Examples</a></li>
<li><a href="#different-languages">Different languages</a></li>
<li><a href="#sweave">Sweave</a></li>
<li><a href="#what-to-use-it-for">What to use it for?</a></li>
<li><a href="#how-to-use-it">How to use it?</a></li>
<li><a href="#additional-pointers">Additional pointers</a></li>
</ul>
</div>
</div>
<div class="span9">
<h1 id="introduction">Introduction</h1>
<p>In this talk/document/presentation I showcase some of the possibilities that a combination of <em>tools</em> provides:</p>
<ul>
<li><a href="http://daringfireball.net/projects/markdown/">Markdown</a></li>
<li><a href="http://rmarkdown.rstudio.com/">RMarkdown</a></li>
<li><a href="http://yihui.name/knitr/">Knitr</a></li>
<li><a href="http://johnmacfarlane.net/pandoc/">Pandoc</a></li>
<li><a href="http://lab.hakim.se/reveal-js/#/">Reveal.js</a></li>
<li><a href="http://www.latex-project.org/">Latex</a></li>
</ul>
<p>In order to make sure things look good from the first start, you might check out some additional projects and files:</p>
<ul>
<li>Bootstrap template for Pandoc: <a href="https://github.com/tonyblundell/pandoc-bootstrap-template">https://github.com/tonyblundell/pandoc-bootstrap-template</a></li>
<li>Alternative LaTeX templates: <a href="https://github.com/kjhealy/latex-custom-kjh">https://github.com/kjhealy/latex-custom-kjh</a></li>
<li>Alternative Pandoc template: <a href="https://github.com/kjhealy/pandoc-templates">https://github.com/kjhealy/pandoc-templates</a></li>
<li>Non-official KU Leuven templates: <a href="https://github.com/exporl/kuleuven-templates">https://github.com/exporl/kuleuven-templates</a></li>
</ul>
<h1 id="idea">Idea</h1>
<h2 id="workflow">Workflow</h2>
<ol type="1">
<li>Write data generation, data manipulation and discussion in <strong>one text file</strong>.
<ul>
<li>Syntax for text is Markdown.</li>
<li>Code lines start with <code>tab</code> or delimited by <code>```</code></li>
<li>Call this file <code>file.Rmd</code>, even if it includes more than <code>R</code> code.</li>
</ul></li>
<li><p>Call <code>knitr</code> on the <code>.Rmd</code> file in order to <strong>execute</strong> the code blocks and <strong>include</strong> the output of the code in one file. The output is a <code>.md</code> file.</p></li>
<li><p>Call <code>Pandoc</code> on the file, given suitable options (see below). <code>Pandoc</code> is responsible for translating the <code>.md</code> file to <strong>any format</strong> you want.</p></li>
</ol>
<h2 id="rmarkdown-format">RMarkdown format</h2>
<p>The <code>.Rmd</code> source of this report looks like this (50 lines):</p>
<pre class="sourceCode r"><code class="sourceCode r">text <-<span class="st"> </span><span class="kw">readLines</span>(<span class="st">"RR.Rmd"</span>,<span class="dt">encoding=</span><span class="st">"UTF-8"</span>)
<span class="kw">tail</span>(<span class="kw">head</span>(text, <span class="dv">70</span>),<span class="dv">50</span>)</code></pre>
<pre><code> [1] " <https://github.com/tonyblundell/pandoc-bootstrap-template>"
[2] "* Alternative LaTeX templates: "
[3] " <https://github.com/kjhealy/latex-custom-kjh>"
[4] "* Alternative Pandoc template: "
[5] " <https://github.com/kjhealy/pandoc-templates>"
[6] "* Non-official KU Leuven templates:"
[7] " <https://github.com/exporl/kuleuven-templates>"
[8] ""
[10] ""
[11] "# Idea"
[12] ""
[14] ""
[15] "## Workflow"
[16] ""
[17] "1. Write data generation, data manipulation and discussion in **one text file**."
[18] " * Syntax for text is Markdown."
[19] " * Code lines start with `tab` or delimited by `` ``` ``"
[20] " * Call this file `file.Rmd`, even if it includes more than `R` code."
[21] ""
[22] "2. Call `knitr` on the `.Rmd` file in order to **execute** the code blocks and **include** the output of the code in one file. The output is a `.md` file."
[23] ""
[24] "3. Call `Pandoc` on the file, given suitable options (see below). `Pandoc` is responsible for translating the `.md` file to **any format** you want. "
[25] ""
[27] ""
[28] "## RMarkdown format"
[29] ""
[30] "The `.Rmd` source of this report looks like this (50 lines):"
[31] ""
[32] "```{r, results=\"markup\", comment=\"\"}"
[33] "text <- readLines(\"RR.Rmd\",encoding=\"UTF-8\")"
[34] "tail(head(text, 70),50)"
[35] "```"
[36] ""
[38] ""
[39] "## Markdown format"
[40] ""
[41] "The `.md` source of this report looks like this (50 lines):"
[42] ""
[43] "```{r, results=\"markup\", comment=\"\"}"
[44] "text <- readLines(\"RR.md\",encoding=\"UTF-8\")"
[45] "tail(head(text, 70),50)"
[46] "```"
[47] ""
[48] "Conversion is done using `knitr`."
[49] "" </code></pre>
<h2 id="markdown-format">Markdown format</h2>
<p>The <code>.md</code> source of this report looks like this (50 lines):</p>
<pre class="sourceCode r"><code class="sourceCode r">text <-<span class="st"> </span><span class="kw">readLines</span>(<span class="st">"RR.md"</span>,<span class="dt">encoding=</span><span class="st">"UTF-8"</span>)
<span class="kw">tail</span>(<span class="kw">head</span>(text, <span class="dv">70</span>),<span class="dv">50</span>)</code></pre>
<pre><code> [1] " <https://github.com/tonyblundell/pandoc-bootstrap-template>"
[2] "* Alternative LaTeX templates: "
[3] " <https://github.com/kjhealy/latex-custom-kjh>"
[4] "* Alternative Pandoc template: "
[5] " <https://github.com/kjhealy/pandoc-templates>"
[6] "* Non-official KU Leuven templates:"
[7] " <https://github.com/exporl/kuleuven-templates>"
[8] ""
[10] ""
[11] "# Idea"
[12] ""
[14] ""
[15] "## Workflow"
[16] ""
[17] "1. Write data generation, data manipulation and discussion in **one text file**."
[18] " * Syntax for text is Markdown."
[19] " * Code lines start with `tab` or delimited by `` ``` ``"
[20] " * Call this file `file.Rmd`, even if it includes more than `R` code."
[21] ""
[22] "2. Call `knitr` on the `.Rmd` file in order to **execute** the code blocks and **include** the output of the code in one file. The output is a `.md` file."
[23] ""
[24] "3. Call `Pandoc` on the file, given suitable options (see below). `Pandoc` is responsible for translating the `.md` file to **any format** you want. "
[25] ""
[27] ""
[28] "## RMarkdown format"
[29] ""
[30] "The `.Rmd` source of this report looks like this (50 lines):"
[31] ""
[32] ""
[33] "```r"
[34] "text <- readLines(\"RR.Rmd\",encoding=\"UTF-8\")"
[35] "tail(head(text, 70),50)"
[36] "```"
[37] ""
[38] "```"
[39] " [1] \" <https://github.com/tonyblundell/pandoc-bootstrap-template>\" "
[40] " [2] \"* Alternative LaTeX templates: \" "
[41] " [3] \" <https://github.com/kjhealy/latex-custom-kjh>\" "
[42] " [4] \"* Alternative Pandoc template: \" "
[43] " [5] \" <https://github.com/kjhealy/pandoc-templates>\" "
[44] " [6] \"* Non-official KU Leuven templates:\" "
[45] " [7] \" <https://github.com/exporl/kuleuven-templates>\" "
[46] " [8] \"\" "
[48] "[10] \"\" "
[49] "[11] \"# Idea\" "
[50] "[12] \"\" "</code></pre>
<p>Conversion is done using <code>knitr</code>.</p>
<h2 id="pandoc">Pandoc</h2>
<p>A simple and a more involved example of running <code>Pandoc</code>:</p>
<pre class="shell"><code>pandoc file.md -o file.docx
pandoc file.md -o file.html \
-t html5 \
--template template.html \
--css template.css \
--highlight-style=tango --mathjax \
--toc --toc-depth 2</code></pre>
<p>Dust off your <code>Makefile</code> skills!</p>
<h1 id="some-examples">Some Examples</h1>
<h2 id="simple-example">Simple example</h2>
<p>The first example is in <code>R</code>. Let's say I want to plot a function</p>
<p><span class="math">\[ f(x) = \frac{\log(x^2+x+1)}{2x} \]</span></p>
<p>We first define <span class="math">\(x\)</span> and the function value <span class="math">\(y\)</span> (in doing so we have used some inline equations as well):</p>
<pre class="sourceCode r"><code class="sourceCode r">x <-<span class="st"> </span><span class="kw">seq</span>(<span class="dt">from=</span>-<span class="dv">5</span>,<span class="dt">to=</span><span class="dv">10</span>,<span class="dt">by=</span>.<span class="dv">01</span>)
y <-<span class="st"> </span>(<span class="kw">log</span>(x*x +<span class="st"> </span>x +<span class="st"> </span><span class="dv">1</span>))/(<span class="dv">2</span>*x)</code></pre>
<p>Then we can plot the function. We use the <code>ggplot2</code> package.</p>
<pre class="sourceCode r"><code class="sourceCode r"><span class="kw">library</span>(ggplot2)
<span class="kw">qplot</span>(x,y,<span class="dt">geom=</span><span class="st">"line"</span>)</code></pre>
<figure>
<img src="figure/chunk-1.png" alt="Plot of the very special function defined above." /><figcaption>Plot of the very special function defined above.</figcaption>
</figure>
<p>See the figure for the result.</p>
<h2 id="working-with-data">Working with data</h2>
<p>Let us take a look at a dataset that comes with <code>R</code>, <code>mtcars</code>:</p>
<pre class="sourceCode r"><code class="sourceCode r"><span class="kw">summary</span>(mtcars)</code></pre>
<pre><code>## mpg cyl disp hp
## Min. :10.40 Min. :4.000 Min. : 71.1 Min. : 52.0
## 1st Qu.:15.43 1st Qu.:4.000 1st Qu.:120.8 1st Qu.: 96.5
## Median :19.20 Median :6.000 Median :196.3 Median :123.0
## Mean :20.09 Mean :6.188 Mean :230.7 Mean :146.7
## 3rd Qu.:22.80 3rd Qu.:8.000 3rd Qu.:326.0 3rd Qu.:180.0
## Max. :33.90 Max. :8.000 Max. :472.0 Max. :335.0
## drat wt qsec vs
## Min. :2.760 Min. :1.513 Min. :14.50 Min. :0.0000
## 1st Qu.:3.080 1st Qu.:2.581 1st Qu.:16.89 1st Qu.:0.0000
## Median :3.695 Median :3.325 Median :17.71 Median :0.0000
## Mean :3.597 Mean :3.217 Mean :17.85 Mean :0.4375
## 3rd Qu.:3.920 3rd Qu.:3.610 3rd Qu.:18.90 3rd Qu.:1.0000
## Max. :4.930 Max. :5.424 Max. :22.90 Max. :1.0000
## am gear carb
## Min. :0.0000 Min. :3.000 Min. :1.000
## 1st Qu.:0.0000 1st Qu.:3.000 1st Qu.:2.000
## Median :0.0000 Median :4.000 Median :2.000
## Mean :0.4062 Mean :3.688 Mean :2.812
## 3rd Qu.:1.0000 3rd Qu.:4.000 3rd Qu.:4.000
## Max. :1.0000 Max. :5.000 Max. :8.000</code></pre>
<p>Now the fun starts. Let's fit a model relates how many Miles/Gallon are consumed, given a weight.</p>
<pre class="sourceCode r"><code class="sourceCode r">model <-<span class="st"> </span><span class="kw">lm</span>(mpg ~<span class="st"> </span>wt, <span class="dt">data=</span>mtcars)
<span class="kw">summary</span>(model)</code></pre>
<pre><code>##
## Call:
## lm(formula = mpg ~ wt, data = mtcars)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.5432 -2.3647 -0.1252 1.4096 6.8727
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 37.2851 1.8776 19.858 < 2e-16 ***
## wt -5.3445 0.5591 -9.559 1.29e-10 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.046 on 30 degrees of freedom
## Multiple R-squared: 0.7528, Adjusted R-squared: 0.7446
## F-statistic: 91.38 on 1 and 30 DF, p-value: 1.294e-10</code></pre>
<p>This is verbatim output, we can use some <code>R</code> package magic to get proper tables as output as well using the <code>pander</code> package:</p>
<pre class="sourceCode r"><code class="sourceCode r"><span class="kw">library</span>(pander)
<span class="kw">pander</span>(model)</code></pre>
<table>
<caption>Fitting linear model: mpg ~ wt</caption>
<col style="width: 25%" /><col style="width: 15%" /><col style="width: 18%" /><col style="width: 13%" /><col style="width: 13%" /><thead>
<tr class="header">
<th style="text-align: center;"> </th>
<th style="text-align: center;">Estimate</th>
<th style="text-align: center;">Std. Error</th>
<th style="text-align: center;">t value</th>
<th style="text-align: center;">Pr(>|t|)</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td style="text-align: center;"><strong>wt</strong></td>
<td style="text-align: center;">-5.344</td>
<td style="text-align: center;">0.5591</td>
<td style="text-align: center;">-9.559</td>
<td style="text-align: center;">1.294e-10</td>
</tr>
<tr class="even">
<td style="text-align: center;"><strong>(Intercept)</strong></td>
<td style="text-align: center;">37.29</td>
<td style="text-align: center;">1.878</td>
<td style="text-align: center;">19.86</td>
<td style="text-align: center;">8.242e-19</td>
</tr>
</tbody>
</table>
<p>We can also plot this information using the code below.</p>
<pre class="sourceCode r"><code class="sourceCode r"><span class="kw">qplot</span>(<span class="dt">x=</span>wt, <span class="dt">y=</span>mpg, <span class="dt">data=</span>mtcars, <span class="dt">xlab=</span><span class="st">"Weight (lb/1000)"</span>, <span class="dt">ylab=</span><span class="st">"Miles per Gallon"</span>,
<span class="dt">geom=</span><span class="kw">c</span>(<span class="st">"point"</span>,<span class="st">"smooth"</span>), <span class="dt">method=</span><span class="st">"lm"</span>)</code></pre>
<figure>
<img src="figure/unnamed-chunk-7-1.png" alt="A scatterplot of the fuel consumption versus the weight of the car, along with the results of a linear regression. See the text for more information." /><figcaption>A scatterplot of the fuel consumption versus the weight of the car, along with the results of a linear regression. See the text for more information.</figcaption>
</figure>
<h2 id="scraping-the-web">Scraping the web</h2>
<p>This script parses the Wikipedia page with Belgian Beers in order to get the data out. It then does some cleaning up and converts the data to different formats. The result can be stored in a file, but just display the first 10 rows.</p>
<pre class="sourceCode r"><code class="sourceCode r"><span class="kw">library</span>(XML)
rawBeers <-<span class="st"> </span><span class="kw">readHTMLTable</span>(<span class="dt">doc=</span><span class="st">"http://nl.wikipedia.org/wiki/Lijst_van_Belgische_bieren"</span>)
beers <-<span class="st"> </span><span class="ot">NULL</span>
<span class="co"># The first table is not relevant, the rest is:</span>
for (i in <span class="kw">seq</span>(<span class="dv">2</span>,<span class="dv">28</span>)) {
beers <-<span class="st"> </span><span class="kw">rbind</span>(beers,rawBeers[[i]])
}
<span class="co"># Remove the percentage sign and convert to numbers:</span>
beers$Percentagealcohol <-<span class="st"> </span><span class="kw">gsub</span>(<span class="st">"%"</span>,<span class="st">""</span>,beers$Percentagealcohol)
beers$Percentagealcohol <-<span class="st"> </span><span class="kw">gsub</span>(<span class="st">","</span>,<span class="st">"."</span>,beers$Percentagealcohol)
beers$Percentagealcohol <-<span class="kw">as.numeric</span>(beers$Percentagealcohol)</code></pre>
<pre><code>## Warning: NAs introduced by coercion</code></pre>
<pre class="sourceCode r"><code class="sourceCode r"><span class="co"># A few entries do not have a percentage entry</span>
nas <-<span class="st"> </span><span class="kw">length</span>(beers[<span class="kw">is.na</span>(beers$Percentagealcohol),])</code></pre>
<p>The number of entries without percentage entry is: 4.</p>
<p>We use <code>pander</code> again for displaying the top-10 of beers with the highest amount of alcohol:</p>
<pre class="sourceCode r"><code class="sourceCode r"><span class="kw">pander</span>(
<span class="kw">head</span>(
beers[<span class="kw">order</span>(beers$Percentagealcohol,<span class="dt">decreasing=</span><span class="ot">TRUE</span>),
<span class="kw">c</span>(<span class="st">"Merk"</span>,<span class="st">"Percentagealcohol"</span>)],
<span class="dv">10</span>)
)</code></pre>
<table>
<col style="width: 13%" /><col style="width: 37%" /><col style="width: 26%" /><thead>
<tr class="header">
<th style="text-align: center;"> </th>
<th style="text-align: center;">Merk</th>
<th style="text-align: center;">Percentagealcohol</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td style="text-align: center;"><strong>196</strong></td>
<td style="text-align: center;">Black Damnation V (Double Black)</td>
<td style="text-align: center;">26</td>
</tr>
<tr class="even">
<td style="text-align: center;"><strong>412</strong></td>
<td style="text-align: center;">Cuvée d'Erpigny</td>
<td style="text-align: center;">15</td>
</tr>
<tr class="odd">
<td style="text-align: center;"><strong>191</strong></td>
<td style="text-align: center;">Black Albert</td>
<td style="text-align: center;">13</td>
</tr>
<tr class="even">
<td style="text-align: center;"><strong>192</strong></td>
<td style="text-align: center;">Black Damnation I</td>
<td style="text-align: center;">13</td>
</tr>
<tr class="odd">
<td style="text-align: center;"><strong>194</strong></td>
<td style="text-align: center;">Black Damnation III (Black Mes)</td>
<td style="text-align: center;">13</td>
</tr>
<tr class="even">
<td style="text-align: center;"><strong>195</strong></td>
<td style="text-align: center;">Black Damnation IV (Coffée Club)</td>
<td style="text-align: center;">13</td>
</tr>
<tr class="odd">
<td style="text-align: center;"><strong>313</strong></td>
<td style="text-align: center;">Bush de Noël Premium</td>
<td style="text-align: center;">13</td>
</tr>
<tr class="even">
<td style="text-align: center;"><strong>314</strong></td>
<td style="text-align: center;">Bush de Nuits</td>
<td style="text-align: center;">13</td>
</tr>
<tr class="odd">
<td style="text-align: center;"><strong>315</strong></td>
<td style="text-align: center;">Bush Prestige</td>
<td style="text-align: center;">13</td>
</tr>
<tr class="even">
<td style="text-align: center;"><strong>411</strong></td>
<td style="text-align: center;">Cuvée Delphine</td>
<td style="text-align: center;">13</td>
</tr>
</tbody>
</table>
<h1 id="different-languages">Different languages</h1>
<h2 id="python">Python</h2>
<pre class="sourceCode python"><code class="sourceCode python"><span class="ch">import</span> pprint
pprint.pprint(<span class="dt">zip</span>((<span class="st">'Byte'</span>, <span class="st">'KByte'</span>, <span class="st">'MByte'</span>, <span class="st">'GByte'</span>, <span class="st">'TByte'</span>),
(<span class="dv">1</span> << <span class="dv">10</span>*i <span class="kw">for</span> i in <span class="dt">xrange</span>(<span class="dv">5</span>))))</code></pre>
<pre><code>## [('Byte', 1),
## ('KByte', 1024),
## ('MByte', 1048576),
## ('GByte', 1073741824),
## ('TByte', 1099511627776)]</code></pre>
<h2 id="scala">Scala</h2>
<pre class="sourceCode scala"><code class="sourceCode scala"><span class="kw">val</span> collection = <span class="kw">for</span> {i <- <span class="dv">1</span> to <span class="dv">10</span>} <span class="kw">yield</span> {i}
<span class="kw">val</span> mapped = collection <span class="fu">map</span> (x => x*x)
<span class="kw">val</span> reduced = mapped <span class="fu">reduce</span> (_ + _)
<span class="fu">println</span>(reduced)</code></pre>
<pre><code>## 385</code></pre>
<h1 id="sweave">Sweave</h1>
<p><code>knitr</code> can handle sweave documents as well.</p>
<pre class="sourceCode R"><code class="sourceCode r"><span class="kw">library</span>(knitr)
<span class="kw">Sweave2knitr</span>(<span class="st">'dummy.Rnw'</span>)
<span class="kw">knit</span>(<span class="st">'dummy-knitr.Rnw'</span>)</code></pre>
<p>Or, just write in RMarkdown:</p>
<pre class="script"><code>Rscript -e 'library(knitr); knit("rmarkdown-version.Rmd")
pandoc rmarkdown-version.md -o rmarkdown-version.pdf --toc</code></pre>
<p>Text (and code) can be translated using <code>Pandoc</code></p>
<figure>
<img src="figure/SweaveVsRmd.png" alt="Side-by-side view of the same text/code in RMarkdown and Sweave" /><figcaption>Side-by-side view of the same text/code in RMarkdown and Sweave</figcaption>
</figure>
<h1 id="what-to-use-it-for">What to use it for?</h1>
<p>I use it for:</p>
<ul>
<li>Creating presentations (<code>reveal.js</code>)</li>
<li>Writing reports (including code)</li>
<li>Writing papers (just text)</li>
<li>Making coffee</li>
</ul>
<h1 id="how-to-use-it">How to use it?</h1>
<h2 id="rstudio">RStudio</h2>
<figure>
<img src="figure/RStudio-Screenshot.png" alt="Screenshot of (part of) RStudio" /><figcaption>Screenshot of (part of) RStudio</figcaption>
</figure>
<h2 id="your-favourite-editor-here"><em>your favourite editor here</em></h2>
<figure>
<img src="figure/Sublime-Screenshot.png" alt="Screenshot of Sublime Editor with Markdown mode" /><figcaption>Screenshot of Sublime Editor with Markdown mode</figcaption>
</figure>
<h1 id="additional-pointers">Additional pointers</h1>
<ul>
<li>Markdown to <code>Reveal.js</code>: <a href="http://tverbeiren.github.io/BigDataBe-Spark/#/">http://tverbeiren.github.io/BigDataBe-Spark/#/</a></li>
<li>Markdown and <code>Pandoc</code> for writing a paper: <a href="http://homes.esat.kuleuven.be/~bioiuser/blog/?p=243">http://homes.esat.kuleuven.be/~bioiuser/blog/?p=243</a></li>
<li>Markdown and <code>Pandoc</code> for lecture notes: <a href="https://bitbucket.org/tverbeiren/i0u19a">https://bitbucket.org/tverbeiren/i0u19a</a></li>
<li>You can find everything I showed here at: <a href="http://github.io/tverbeiren/ReproducibleDataAnalysis/">http://github.io/tverbeiren/ReproducibleDataAnalysis/</a></li>
</ul>
</div>
</div>
</div>
</body>
</html>