%
%
% UCSD Doctoral Dissertation Template
% -----------------------------------
% http:\\ucsd-thesis.googlecode.com
%
%
% ----------------------------------------------------------------------
% WARNING:
%
% This template has not been endorsed by OGS or any other official entity.
% The official formatting guide can be obtained from OGS.
% It can be found on the web here:
% http://ogs.ucsd.edu/AcademicAffairs/Documents/Dissertations_Theses_Formatting_Manual.pdf
%
% No guarantee is made that this LaTeX class conforms to the official UCSD guidelines.
% Make sure that you check the final document against the Formatting Manual.
%
% That being said, this class has been used successfully for publication of
% doctoral theses.
%
% The ucsd.cls class files are only valid for doctoral dissertations.
%
%
% ----------------------------------------------------------------------
% GETTING STARTED:
%
% Lots of information can be found on the project wiki:
% http://code.google.com/p/ucsd-thesis/wiki/GettingStarted
%
%
% To make a pdf from this template use the command:
% pdflatex template
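%
% A fuller build sequence, sketched here under the assumption that this file
% is saved as main.tex and that the bibliography, glossary, and index set up
% below are all in use (adjust the base name to match your file):
%    pdflatex main
%    bibtex main
%    makeglossaries main
%    makeindex main
%    pdflatex main
%    pdflatex main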
%
%
% To get started please read the comments in this template file
% and make changes as appropriate.
%
%
% ----------------------------------------------------------------------
%
% A thesis using this template and class file was last successfully
% submitted on 2009/03/19 (at least as far as I know).
%
% If you successfully submit a thesis with this package please let us
% know.
%
% ----------------------------------------------------------------------
% If you desire more control, please see the attached files:
%
% * ucsd.cls -- Class file
% * uct10.clo -- Configuration files for font sizes 10pt, 11pt, 12pt
% uct11.clo
% uct12.clo
%
% ----------------------------------------------------------------------
% Setup the documentclass
% default options: 11pt, oneside, final
%
% fonts: 10pt, 11pt, 12pt -- are valid for UCSD dissertations.
% sides: oneside, twoside -- note that two-sided theses are not accepted
% by OGS.
% mode: draft, final -- draft mode switches to single spacing,
% removes hyperlinks, and places a black box
% at every overfull hbox (check these before
% submission).
% chapterheads -- Include this if you want your chapters to read:
% Chapter 1
% Title of Chapter
%
% instead of
%
% 1 Title of Chapter
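% Build mode toggle: leave \RELEASE defined for the final document, which
% includes the published chapters; comment out the next line for a draft
% build, which uses draft mode and placeholder sections instead.
\def\RELEASE{}
\ifdefined\RELEASE
\documentclass[12pt,chapterheads]{ucsd}
\else
\documentclass[12pt,chapterheads,draft]{ucsd}
\fi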
% Include all packages you need here.
% Some standard options are suggested below.
%
% See the project wiki for information on how to use
% these packages. Other useful packages are also listed there.
%
% http://code.google.com/p/ucsd-thesis/wiki/GettingStarted
%% AMS PACKAGES - Chances are you will want some or all
% of these if writing a dissertation that includes equations.
% \usepackage{amsmath, amscd, amssymb, amsthm}
%% GRAPHICX - This is the standard package for
% including graphics for latex/pdflatex.
\usepackage{graphicx}
\usepackage{booktabs}
\usepackage{tabularx}
\usepackage{array}
\newcolumntype{P}[1]{>{\centering\arraybackslash}p{#1}}
\newcolumntype{M}{>{\centering\arraybackslash} m{2cm} }
\usepackage{hyphenat} % allow for hyphenated words to be further hyphenated
% https://tex.stackexchange.com/a/2715
\usepackage{rotating}
\usepackage{multirow}
\usepackage{textcomp}
\usepackage{amssymb}
\usepackage{url}
\urlstyle{same}
%% SUBFIGURE - Use this to place multiple images in a
% single figure. Subfigure will handle placement and
% proper captioning (e.g. Figure 1.2(a))
% \usepackage{subfigure}
%% LATIN MODERN FONTS (replacements for Computer Modern)
% \usepackage{lmodern}
% \usepackage[T1]{fontenc}
%% INDEX
% The following two lines create an index:
\usepackage{makeidx}
\makeindex
% The \printindex line near the
% bibliography displays the index. Use the command
% \index{keyword}
% within the text to create an entry in the index for keyword.
%% HYPERLINKS
% To create a PDF with hyperlinks, you need to include the hyperref package.
% THIS HAS TO BE THE LAST PACKAGE INCLUDED!
% Note that the options plainpages=false and pdfpagelabels exist
% to fix indexing associated with having both (ii) and (2) as pages.
% Also, all links must be black according to OGS.
% See: http://www.tex.ac.uk/cgi-bin/texfaq2html?label=hyperdupdest
% Note: This may not work correctly with all DVI viewers (i.e. Yap breaks).
% NOTE: hyperref will NOT work in draft mode, as noted above.
\usepackage[colorlinks=true, pdfstartview=FitV,
linkcolor=black, citecolor=black,
urlcolor=black, plainpages=false,
pdfpagelabels]{hyperref}
% \hypersetup{ pdfauthor = {Your Name Here},
% pdftitle = {The Title of The Dissertation},
% pdfkeywords = {Keywords for Searching},
% pdfcreator = {pdfLaTeX with hyperref package},
% pdfproducer = {pdfLaTeX} }
\usepackage{verbatim}
\usepackage[toc,nopostdot,style=alttree]{glossaries}
\glssetwidest{PERMANOVAX}% widest name
\makeglossaries
\include{abbreviations}
\begin{document}
%% FRONT MATTER
%
% All of the front matter.
% This includes the title, degree, dedication, vita, abstract, etc..
\include{frontmatter}
%% DISSERTATION
\chapter{The hidden world within ourselves}\label{chapter_review}
Recent years have seen an increased interest in the study of microbial
communities, partly driven by the affordability of the enabling technologies.
This new focus provides a previously ignored axis of understanding to a vast
number of research fields. In addition, it has uncovered a number of
associations and demonstrated mechanisms in which microbial communities are key
explanatory variables.
In fields where microbial communities were already taken into consideration,
these advancements allowed experiments to be conducted at an unprecedented
scale \cite{RN35, RN4061, RN4267}. As a consequence, the underlying software
and methods to analyze these studies needed to accommodate volumes of
information previously unseen. Notably, most of the software used in this
context has been developed to interpret and analyze interdisciplinary
experiments that span a broad range of seemingly disjoint fields (for example,
biofuel development \cite{RN4268}, the built environment \cite{RN4270, RN4083},
and forensics \cite{RN4269, RN4271}). As such, we motivate the rest of the
work in this thesis by presenting a comprehensive survey focused on the human
microbiome. The following section appeared in the journal
\textsl{Annual Review of Pharmacology and Toxicology, 2017}.
\ifdefined\RELEASE
\include{chapter_review}
\else
\section{Annual Reviews Paper}
\fi
\chapter{Exploratory microbiome data analysis}\label{exploratory_chapter}
As we learnt in Chapter~\ref{chapter_review}, there is a wealth of information
that can be acquired through the study of microbial communities. The
acquisition of this knowledge, however, is often only possible through the
use of specialized software for analysis and visualization. \gls{qiime},
introduced previously, is a successful\footnote{At the time of writing,
the original paper has been cited over 8,000 times according to Google Scholar,
and the Python package has been downloaded over 34,000 times; data from
\url{https://pypi.org} and \url{https://anaconda.org}.} microbiome analysis
pipeline that integrates a collection of bioinformatics tools and makes these
programs available through a unified user-interface. Until its fifth release,
\gls{qiime} relied on \gls{king} to represent three-dimensional
$\beta$-diversity plots. \gls{king} was originally developed as a molecular
viewer and was \textit{hacked} to act as a scatter-plot viewer. As datasets grew
larger and more common, it became clear that we needed to develop our own
solution; thus, we created Emperor.
Emperor is an interactive $\beta$-diversity viewer, tailored to fit the
\gls{qiime} ecosystem. $\beta$-diversity plots, commonly referred to as
ordinations or dimensionality reductions, are often the starting point when
analyzing a microbiome dataset. Their ability to overview the relative
differences of an unlimited number of samples with an unlimited number of
covariates makes this representation invaluable to diagnose and troubleshoot
any problems that may have occurred during sample collection, preparation, or
processing. For example, batch effects are often easy to notice in this
representation \cite{Gibbons165910} but more challenging to see in
feature-level analyses, especially if those specific features are not affected
by the batch effects.
As use-cases arose, these were integrated as part of the software, something
previously not possible with \gls{king}. A notable feature allowed users to
animate longitudinal sampling schemes, of key importance for the work in
Chapter~\ref{chapter_ibds} and Chapter~\ref{chapter_fmts}. The following two
sections introduce Emperor, further discussing some of its statistical
capabilities, and the use of animated ordinations as a way to interact with
microbiome data.
Chapter~\ref{section_emperor} was published in the journal \textsl{GigaScience,
2013} and Chapter~\ref{section_animations} was published in the journal
\textsl{Cell Host \& Microbe, 2017}. As the lead contributor of these two
sections, I co-wrote the text, generated the main figures, wrote the software
for the Python package, wrote documentation, and still maintain the project in
the context of \gls{qiime} and provide support through the \gls{qiime} Forum.
\ifdefined\RELEASE
\include{chapter_EMPeror}
\include{chapter_animations}
\else
\section{EMPeror paper}\label{section_emperor}
\section{EMPeror Animations paper}\label{section_animations}
\fi
\chapter{Inflammatory bowel disease in dogs and humans}\label{chapter_dogs}
\gls{ibd} is a family of conditions divided into two major categories, \gls{uc}
and \gls{cd}, each of which can be further divided into other subtypes
specifying location, age of diagnosis, severity, and behaviour of the disease
\cite{RN4265}. In all cases \gls{ibd} is associated with diarrhea,
inflammation, and flaring episodes of exacerbated discomfort. In general,
diagnostic methods rely on questionnaires, direct examinations of affected
intestinal tissue, or quantification of calprotectin \cite{Sipponen2008}.
In this chapter, we introduce work motivated by our recent efforts at
characterizing the microbiome in new-onset and treatment-naive Crohn's disease
subjects \cite{RN154}. Specifically, we focus on comparing the microbiomes of
\gls{ibd}-affected humans and dogs. As a result, we further expand the
repertoire of features known to be associated with \gls{ibd} in dogs
\cite{RN153} and, through a meta-analysis, contextualize them with human fecal
samples \cite{RN154}.
As we did before \cite{RN154}, we create a dysbiosis network\footnote{A
dysbiosis network is a graph constructed using a correlation matrix as a
weighted adjacency matrix.} to statistically infer the bacterial genera
associated with the disease. This representation allows us to directly compare
microbial groups across datasets, and explore their relationship to each other.
We find that the human and dog dysbiosis networks are overlapping, both in
structure and in the microbes associated with \gls{ibd} and non-\gls{ibd}
communities. However, we uncover a taxon that shifts from being
non-\gls{ibd}-associated in humans to being \gls{ibd}-associated in dogs.
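
The dysbiosis network defined above can be sketched as follows. The notation
is introduced here for illustration only: the definition does not fix the
choice of correlation coefficient or any thresholding of weak edges, and the
zero diagonal (no self-loops) is an assumption. Let $r_{ij}$ denote the
correlation between the abundances of genera $i$ and $j$ across samples. The
network is then the weighted graph on the set of genera whose adjacency matrix
$A$ has entries
\begin{equation}
A_{ij} =
\left\{
\begin{array}{ll}
r_{ij} & \mbox{if } i \neq j, \\
0 & \mbox{if } i = j,
\end{array}
\right.
\end{equation}
so that a positive weight links genera that tend to increase together, and a
negative weight links genera that vary in opposition.
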
Chapter~\ref{dogs} appeared in the journal \textsl{Nature Microbiology, 2016}.
As the lead contributor of this project, I co-wrote the text, generated the
main figures, processed, analyzed, and interpreted the data, and deposited it
into a public repository.
\ifdefined\RELEASE
\include{chapter_dogs}
\else
\section{IBD dogs paper}\label{dogs}
\fi
\chapter{Dynamic features of inflammatory bowel disease}\label{chapter_ibds}
In recent years, clinical research in \gls{ibd} has benefited from multi-'omic
(genomic \cite{RN4217}, metabolomic \cite{RN4264}, and metagenomic
\cite{RN4263}) characterizations of the condition. Each of these approaches
describes a new component of relationships that, one by one, seem to uncover
the processes regulating the inflammation and well-being of the affected hosts.
Nonetheless, longitudinal descriptions (and, as a consequence, representations)
of \gls{ibd} remained largely unexplored.
This chapter introduces two pioneering longitudinal studies of \gls{ibd}: one
where the sampling is sparse (every three months) and another where the
sampling is dense (as often as every day). Both cohorts present increased rates
of microbial variation. Specifically, this increase appears to differ
among the subtypes of \gls{ibd}. In order to visualize the increased
variation, we rely on the techniques introduced in
Chapter~\ref{exploratory_chapter}, and on the definition of a reference plane
that acts as a proxy for healthy variation.
Motivated by these findings, we use a classification model to
determine the benefits of increased longitudinal sampling. For both
(independently collected and sequenced) datasets, we observe that including
more than one sample per subject increases the performance of a classifier.
This improvement is only possible when we transform the original representation of
the data and create per-subject consensus features based on multiple
timepoints. Briefly, these features aggregate multiple samples together and
exploit the increased instability as an informative classification feature.
Although we exercise and test this method with human-gut data, the approach is
not restricted in any way to this environment. This approach could potentially
benefit other applications, especially where longitudinal descriptors might be
predictive of a state of interest.
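
As a sketch of how such consensus features could be built (the notation and the
specific summary statistics below are assumptions made for illustration, not
necessarily those used in the published analyses), let $x_{s,t,k}$ denote the
abundance of feature $k$ in the $t$-th of the $n_s$ samples collected from
subject $s$. A per-subject mean and variance can then be computed for every
feature:
\begin{equation}
\bar{x}_{s,k} = \frac{1}{n_s} \sum_{t=1}^{n_s} x_{s,t,k},
\qquad
v_{s,k} = \frac{1}{n_s} \sum_{t=1}^{n_s}
\left( x_{s,t,k} - \bar{x}_{s,k} \right)^2 .
\end{equation}
Under this sketch, a classifier would be trained on one row per subject that
concatenates the means $\bar{x}_{s,k}$ with the variances $v_{s,k}$, so that
temporal instability enters the model as a feature in its own right. With a
single sample per subject ($n_s = 1$) every $v_{s,k}$ is zero, and the
representation carries no information beyond the original sample.
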
% TODO: add publication
Chapter~\ref{section_plane} appeared in \textsl{Nature Microbiology, 2017}. As
the lead analyst of this project, I developed
a collection of new algorithms to represent the data, produced a series of
visualizations used in the paper, co-wrote the text, and interpreted the
results. Chapter~\ref{section_ibd} appeared in the journal \textsl{Gut, 2017}.
As the lead contributor of this project, I co-wrote the text, generated the
main figures, wrote the software used for analysis, interpreted the results,
and deposited the data into a public repository.
\ifdefined\RELEASE
\include{chapter_plane}
\include{chapter_ibd}
\else
\section{Janet's IBD paper}\label{section_plane}
\section{Balfour's IBD paper}\label{section_ibd}
\fi
\chapter{Restoring a lost ecosystem}\label{chapter_fmts}
\glsresetall
\gls{cdi} is a hospital-acquired infection resulting from antibiotic
administration for an unrelated condition. Paradoxically, the treatment for
\gls{cdi} is to prescribe more antibiotics. This treatment generally
translates to depleted microbial diversity. As a consequence, resources that
would otherwise be consumed by other communities become available for
\textit{Clostridium difficile} to thrive on. More recently, \gls{cdi} has been
treated through the administration of \glspl{fmt}, with a success rate above
90\% \cite{RN4129}. A \gls{fmt} consists of reintroducing bacteria into the
\gls{gi} tract of an affected patient. Frequently, after a day, the symptoms of
\gls{cdi} disappear and the recipients recover. In this chapter, we examine two
cohorts and focus on the short- and long-term changes in the gut microbiota
after a \gls{fmt}.
In the first cohort, relying on an animated $\beta$-diversity plot (as
introduced in Chapter~\ref{section_animations}), we visualize the immediate
changes in the gut microbiome after a \gls{fmt}. In addition, we also visualize
the stability of the microbiome after the transplant and note that the only
point of instability occurs immediately after the \gls{fmt}.
The second cohort is composed of patients suffering from \gls{cdi} and in some
cases with underlying \gls{ibd}. We find that, while both groups recover from
the \gls{cdi}, the micro-ecological effects of \gls{fmt} appear to be dampened
in subjects with underlying \gls{ibd}. Overall, individual phylogenetic
diversity is lower, and the number of taxa differentially present before and
after the transplant is also smaller (when compared to the non-\gls{ibd}
counterparts).
Chapter~\ref{section_moviefmt} appeared in the journal \textsl{Microbiome,
2015}. As the lead analyst in this project, I contributed to the text,
analyzed and interpreted the data, wrote software used for the analysis, and
generated the main visuals. Chapter~\ref{section_fmt} appeared in the journal
\textsl{Microbiome, 2017}. As the co-lead contributor to this project, I
co-wrote the text, generated the figures, analyzed and interpreted the data,
and deposited it into a public repository.
\ifdefined\RELEASE
\include{chapter_moviefmt}
\include{chapter_fmt}
\else
\section{FMT paper 1}\label{section_moviefmt}
\section{FMT paper 2}\label{section_fmt}
\fi
\chapter{Future work}
\include{chapter_conclusions}
\appendix
\include{appendix}
%% END MATTER
\printindex %% Displays the index
% \nocite{} %% Put any references that you want to include in the bib
% but haven't cited in the braces.
% \bibliographystyle{alpha} %% This is just my personal favorite style.
% There are many others.
\bibliography{references}
\bibliographystyle{plain}
\end{document}