From 85078caa44542d0a42d2915f72159b27c6b5b256 Mon Sep 17 00:00:00 2001
From: cansavvy
Date: Tue, 23 Apr 2024 08:21:03 -0400
Subject: [PATCH] Fix spelling and urls

---
 09a-WGS-and-WXS.Rmd  |  2 +-
 11a-ATAC-Seq.Rmd     |  2 +-
 11c-ChIP-Seq.Rmd     |  2 +-
 11d-CUT-and-RUN.Rmd  |  4 ++--
 13-microbiome.Rmd    | 20 ++++++++++----------
 14-tool-glossary.Rmd |  2 +-
 6 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/09a-WGS-and-WXS.Rmd b/09a-WGS-and-WXS.Rmd
index 5f202a0b..dbcfc116 100644
--- a/09a-WGS-and-WXS.Rmd
+++ b/09a-WGS-and-WXS.Rmd
@@ -57,7 +57,7 @@ For WXS or other targeted sequencing specifically (so not relevant to WGS data),
 - [Hybridization based enrichment](https://www.paragongenomics.com/target-enrichment/). This includes a variety of widely used methods that we will broadly categorize in two groups: Array-based and In-solution:
  - [Array-based capture](https://en.wikipedia.org/wiki/Exome_sequencing#:~:text=Target%2Denrichment%20strategies-,Array%2Dbased%20capture,-In%2Dsolution%20capture) uses microarrays that have probes designed to bind to known coding sequences. Fragments that do not bind to these probes are washed away, leaving the sample with known coding sequences bound and ready for PCR amplification [@Hodges2007; @Turner2009].
- - [In-solution capture](https://en.wikipedia.org/wiki/Exome_sequencing#In-solution_capture) has become more popular in recent years because it [requires less sample DNA than array-base capture](https://sequencing.roche.com/global/en/article-listing/what-is-ngs-target-enrichment-and-why-is-it-important.html). To enrich for coding sequences, in-solution capture has a pool of custom probes that are designed to bind to the coding regions in the sample. Attached to these probes are beads which can be physically separated from DNA that is not bound to the probes (this should be the non-coding sequences) [@Mamanova2010].
+ - [In-solution capture](https://en.wikipedia.org/wiki/Exome_sequencing#In-solution_capture) has become more popular in recent years because it [requires less sample DNA than array-base capture](https://www.illumina.com/techniques/sequencing/dna-sequencing/targeted-resequencing/target-enrichment.html). To enrich for coding sequences, in-solution capture has a pool of custom probes that are designed to bind to the coding regions in the sample. Attached to these probes are beads which can be physically separated from DNA that is not bound to the probes (this should be the non-coding sequences) [@Mamanova2010].
 - [PCR/Amplicon based enrichment](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9318977/) requires even less sample than the other two strategies and so is ideal for when the amount of sample is limited or the DNA has been otherwise processed harshly (e.g. with paraffin embedding). Because the other two enrichment methods are done after PCR amplification has been done to the whole genomic DNA sample, its thought that this method of selective PCR amplification for enrichment can result in more uniformly amplified DNA in the resulting sample. However this is less suitable the more gene targets you have (like if you truly need to sequence all of the exome) since amplicons need to be designed for each target. Overall it is much more affordable of a method. There are several variations of this method that are [discussed thoroughly by @Singh2022](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9318977/).
 ## DNA Sequencing Pipeline Overview
diff --git a/11a-ATAC-Seq.Rmd b/11a-ATAC-Seq.Rmd
index 0e0df953..294bf2ba 100644
--- a/11a-ATAC-Seq.Rmd
+++ b/11a-ATAC-Seq.Rmd
@@ -255,7 +255,7 @@ This section has been written by AI and needs verification by experts. This is m
 ## More resources about ATAC-seq data
 - [ATAC-seq overview from Galaxy](https://training.galaxyproject.org/training-material/topics/epigenetics/tutorials/atac-seq/slides.html#1) - these slides explain the overarching concepts of ATAC-seq.
-- [ATAC seq guidelines from Harvard](https://informatics.fas.harvard.edu/atac-seq-guidelines.html) - this workflow runs through step by step how to analysis ATAC-seq data and what different parameters mean.
+- [ATAC seq guidelines from ENCODE](https://www.encodeproject.org/atac-seq/) - this step by step overview covers ATAC-seq workflow and considerations.
 - [ATAC-seq review](https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-1929-3) - this paper gives a great overview of ATAC-seq data and step by step what needs to be considered.
 - [Identifying and mitigating bias in chromatin](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4473780/)
 - [CHIP Snakemake pipeline for analyzing ChIP-seq and chromatin accessibility data](https://f1000research.com/articles/10-517)
diff --git a/11c-ChIP-Seq.Rmd b/11c-ChIP-Seq.Rmd
index 281d48e1..f5c65c4d 100644
--- a/11c-ChIP-Seq.Rmd
+++ b/11c-ChIP-Seq.Rmd
@@ -139,7 +139,7 @@ Annotation
 - [EnrichedHeatmap](https://bioconductor.org/packages/release/bioc/html/EnrichedHeatmap.html)is an R package for making heatmaps that visualize the enrichment of genomic signals on specific target regions.
 - [SeqMonk](https://www.bioinformatics.babraham.ac.uk/projects/seqmonk/) is a software package designed for the visualization and analysis of large-scale genomic data. It includes a heatmap function that can generate heatmaps from ChIP-seq data.
 - [ngs.plot](https://github.com/shenlab-sinai/ngsplot) is a tool that can generate different types of plots, including heatmaps, from NGS data. It includes a ChIP-seq specific mode that can be used to generate heatmaps from ChIP-seq data.
-- [ChAsE: ChAsE (ChIP-seq Analysis Engine)](http://chase.cs.univie.ac.at/overview) is a web-based platform for ChIP-seq analysis that includes a heatmap function that can generate heatmaps from ChIP-seq data.
+- [ChAsE: ChAsE (ChIP-seq Analysis Engine)](https://github.com/hyounesy/ChAsE/) is a web-based platform for ChIP-seq analysis that includes a heatmap function that can generate heatmaps from ChIP-seq data.
 These tools allow users to generate heatmaps of ChIP-seq data, which can be used to identify enriched regions of binding and to visualize patterns of binding across genomic regions.
diff --git a/11d-CUT-and-RUN.Rmd b/11d-CUT-and-RUN.Rmd
index fd446c07..ae7bea23 100644
--- a/11d-CUT-and-RUN.Rmd
+++ b/11d-CUT-and-RUN.Rmd
@@ -48,7 +48,7 @@ ottrpal::include_slide("https://docs.google.com/presentation/d/1YwxXy2rnUgbx_7B7
 ### CUT&RUN
-**Cleavage Under Targets and Release Using Nuclease**, **CUT&RUN** for short, is an antibody-targeted chromatin profiling method to measure the histone modification enrichment or transcription factor binding. This is a more advanced technology for epigenomic landscape profiling compared to the tradditional ChIP-seq technology and known for its easy implementation and low cost. The procedure is carried out in situ where micrococcal nuclease tethered to protein A binds to an antibody of choice and cuts immediately adjacent DNA, releasing DNA-bound to the antibody target. Therefore, CUT&RUN produces precise transcription factor or histone modification profiles while avoiding crosslinking and solubilization issues. Extremely low backgrounds make profiling possible with typically one-tenth of the sequencing depth required for ChIP-seq and permit profiling using low cell numbers (i.e., a few hundred cells) without losing quality.
+**Cleavage Under Targets and Release Using Nuclease**, **CUT&RUN** for short, is an antibody-targeted chromatin profiling method to measure the histone modification enrichment or transcription factor binding. This is a more advanced technology for epigenomic landscape profiling compared to the traditional ChIP-seq technology and known for its easy implementation and low cost. The procedure is carried out in situ where micrococcal nuclease tethered to protein A binds to an antibody of choice and cuts immediately adjacent DNA, releasing DNA-bound to the antibody target. Therefore, CUT&RUN produces precise transcription factor or histone modification profiles while avoiding crosslinking and solubilization issues. Extremely low backgrounds make profiling possible with typically one-tenth of the sequencing depth required for ChIP-seq and permit profiling using low cell numbers (i.e., a few hundred cells) without losing quality.
@@ -81,7 +81,7 @@ CUT&RUN has been automated using a Beckman Biomek FX liquid-handling robot so th
 ### CUT&Tag
-**Cleavage Under Targets and Tagmentation**, **CUT&Tag** for short, is an enzyme tethering approach to profiling chromatin proteins, including histone marks and RNA Pol II. CUT&Tag generates sequence-ready libraries without the need for end polishing and adaptor ligation. It uses a proteinA-Tn5 fusion to tether Tn5 transposase near the site of an antibody to a chromatin protein of interest. A secondary antibody, such as guinea pig anti-rabbit antibody, is used to increase the efficiency of tethering the pA-Tn5 to the target primary antibody. The pA-Tn5 complex is pre-loaded with sequencing adapters that insert into adjacent DNA upon activation with magnesium. CUT&Tag has a very low background and can be performed in a single tube in as little as a day, though primary antibodies are typically incubated overnight. It can also be used with the ICELL8 nano dispensation system to profile single cells.
+**Cleavage Under Targets and Tagmentation**, **CUT&Tag** for short, is an enzyme tethering approach to profiling chromatin proteins, including histone marks and RNA Pol II. CUT&Tag generates sequence-ready libraries without the need for end polishing and adapter ligation. It uses a proteinA-Tn5 fusion to tether Tn5 transposase near the site of an antibody to a chromatin protein of interest. A secondary antibody, such as guinea pig anti-rabbit antibody, is used to increase the efficiency of tethering the pA-Tn5 to the target primary antibody. The pA-Tn5 complex is pre-loaded with sequencing adapters that insert into adjacent DNA upon activation with magnesium. CUT&Tag has a very low background and can be performed in a single tube in as little as a day, though primary antibodies are typically incubated overnight. It can also be used with the ICELL8 nano dispensation system to profile single cells.
 A streamlined CUT&Tag protocol was introduced by the [Henikoff Lab](https://research.fredhutch.org/henikoff/en.html) that suppresses DNA accessibility artifacts to ensure high-fidelity mapping of the antibody-targeted protein and improves the signal-to-noise ratio over current chromatin profiling methods. Streamlined CUT&Tag can be performed in a single PCR tube, from cells to amplified libraries, providing low-cost genome-wide chromatin maps. By simplifying library preparation, CUT&Tag-direct requires less than a day at the bench, from live cells to sequencing-ready barcoded libraries. As a result of low background levels, barcoded and pooled CUT&Tag libraries can be sequenced for as little as $25 per sample. This enables routine genome-wide profiling of chromatin proteins and modifications and requires no special skills or equipment.
diff --git a/13-microbiome.Rmd b/13-microbiome.Rmd
index 1b00ac59..cc42353f 100644
--- a/13-microbiome.Rmd
+++ b/13-microbiome.Rmd
@@ -16,10 +16,10 @@ ottrpal::include_slide("https://docs.google.com/presentation/d/1YwxXy2rnUgbx_7B7
 ```
 ## A Brief Introduction to Microbiomes
 Microbes are everywhere. We have found these tiny organisms in the deepest regions of the ocean and in the upper atmosphere. We have found them in:
-+ water that has been solid ice for millennia in the Antarctic
-+ boiling water in the geysers of Yellowstone National Park.
-+ the driest natural environments on Earth, including the Atacama Desert in Chile, where desiccation resistant microbes hide in the soil sometimes waiting ten years for the drop of rain that will jump start their metabolism long enough for them to reproduce before they return to dormancy.
-+ perpetually damp environments, like the intestinal tract of the human body where they are constantly the subject of inspection by our diligent immune cells, and where they impact our health in positive and negative ways that we are only beginning to understand.
++ water that has been solid ice for millennia in the Antarctic
++ boiling water in the geysers of Yellowstone National Park.
++ the driest natural environments on Earth, including the Atacama Desert in Chile, where desiccation resistant microbes hide in the soil sometimes waiting ten years for the drop of rain that will jump start their metabolism long enough for them to reproduce before they return to dormancy.
++ perpetually damp environments, like the intestinal tract of the human body where they are constantly the subject of inspection by our diligent immune cells, and where they impact our health in positive and negative ways that we are only beginning to understand.
 + our nuclear reactors, prompting questions about whether we could harness them as tiny machines to help us remediate environmental disasters of the past, present, and future.
 If we looked hard enough, I think we’d find them on the surface of the moon and Mars, though they are probably microbes who stowed away on our spacecraft and are now patiently waiting for a drop of water that may or may not ever show up. If we ever colonize those worlds, microbes will be an indispensable ally in creating an environment that could sustain us.
@@ -27,17 +27,17 @@ Microbes are everywhere. We have found these tiny organisms in the deepest regio
 ```{r, fig.alt = "Learning Objectives", out.width = "100%", echo = FALSE}
 ottrpal::include_slide("https://docs.google.com/presentation/d/1YwxXy2rnUgbx_7B7ENH9wpDX-j6JpJz6lGVzOkjo0qY/edit#slide=id.g26ebab787e9_0_0")
 ```
-This figure is adapted from [@Tignat-Perrier2022] under Creative Commons license.
+This figure is adapted from [@Tignat-Perrier2022] under Creative Commons license.
 Microbes almost never live alone in the real world (i.e., outside of a laboratory). Rather they exist in communities of different species who are interacting with each other and their environment. Some of these communities will have many different types of organisms, and some will have only a few. Because of the large number of species and individuals involved, no two communities will ever be exactly alike, and quantifying differences between microbial communities is an important area of research at the moment. The types of interactions between organisms are also highly varied. These can include mutualistic relationships, where both organisms benefit from the interaction; parasitic relationships, where one organism exclusively benefits to the detriment of the other; and the full gradient in between.
-Microbiome science is everywhere. There are tens of articles published daily in the scientific literature, and many popular science articles and books present these findings to the world of non-scientists. Understanding the promises and limitations of the methods of microbiome science can help avoid misconceptions about microbiome research, and it’s important for practitioners of microbiome science to understand and convey the promise and limitations of our field. Misconceptions abound, frequently arising from the same sources as high-quality popular science microbiome reporting.
+Microbiome science is everywhere. There are tens of articles published daily in the scientific literature, and many popular science articles and books present these findings to the world of non-scientists. Understanding the promises and limitations of the methods of microbiome science can help avoid misconceptions about microbiome research, and it’s important for practitioners of microbiome science to understand and convey the promise and limitations of our field. Misconceptions abound, frequently arising from the same sources as high-quality popular science microbiome reporting.
 For example, on 5 Feb 2015 an article appeared in the New York Times noting (almost offhand) that Yersinia pestis, the organism responsible for Bubonic plague, had been found in multiple locations throughout the New York City subway system as part of its normal built environment microbiome. This was rapidly followed up on 6 Feb 2015 with an article noting that there was probably not Bubonic plague on the subway system after all, but rather that the approaches used by the research team are limited in their taxonomic resolution, and that likely a harmless close relative of Y. pestis was observed: “What the researchers probably found, [a spokesman for the university where the study originated] said, was bacteria from an unknown species or from organisms that happened to share some gene sequences with the plague bacterium…”. As microbiome services and products are increasingly marketed directly to the public, consumers of microbiome research findings, products, and services need to know how to critically evaluate these offerings and their associated claims. As practitioners in the field, we can help by ensuring that the methods we apply are appropriate and reliable, and that we make our work accessible.
-## Goals of Amplicon analysis
+## Goals of Amplicon analysis
 The technologies that are enabling work in microbiome science are the same that are driving the data revolution in biology. Primarily this work is driven by high-throughput DNA sequencing, which is applied for profiling microbial community composition:
@@ -48,12 +48,12 @@ The technologies that are enabling work in microbiome science are the same that
 Other “omics” technologies are now playing an increasing role in microbiome research, such as:
 + mass-spectrometry-based metabolomics, which provides profiles of small molecule metabolites in an environment.
 + metaproteomics which provides more detailed descriptions of functional activities of microbes (and their hosts, if applicable).
-
+
 As a result, bioinformatics software tools are essential to microbiome research. For many microbiome researchers, bioinformatics is an intimidating and challenging aspect of their projects.
-## Microbiome Analysis with QIIME 2
-QIIME 2 is an all in one bioinformatics microbiome analysis plaform. This platform allows for users to go from sequenced microbiome data to publication ready visualizations. The original QIIME, now referred to as QIIME 1, was published in 2010 [@Caporaso2010] and has been cited tens of thousands of times in the primary literature. QIIME 2, which was published in July of 2019 [@Bolyen2019], succeeded QIIME 1 on 1 January 2018. QIIME 2 is better than QIIME 1 in all ways, and QIIME 1 is no longer actively supported. If you have previously used QIIME 1, you should invest time in learning and switching to QIIME 2. If you’re new to QIIME, start with QIIME 2. (When I refer to QIIME in this book, without specifying whether I’m referring to QIIME 1 or QIIME 2, I’m referring to the platform generally.)
+## Microbiome Analysis with QIIME 2
+QIIME 2 is an all in one bioinformatics microbiome analysis platform. This platform allows for users to go from sequenced microbiome data to publication ready visualizations. The original QIIME, now referred to as QIIME 1, was published in 2010 [@Caporaso2010] and has been cited tens of thousands of times in the primary literature. QIIME 2, which was published in July of 2019 [@Bolyen2019], succeeded QIIME 1 on 1 January 2018. QIIME 2 is better than QIIME 1 in all ways, and QIIME 1 is no longer actively supported. If you have previously used QIIME 1, you should invest time in learning and switching to QIIME 2. If you’re new to QIIME, start with QIIME 2. (When I refer to QIIME in this book, without specifying whether I’m referring to QIIME 1 or QIIME 2, I’m referring to the platform generally.)
 QIIME 2 has large and growing user and developer communities, and these communities make QIIME 2 possible. The epicenter of the community is the QIIME 2 Forum. The forum is primarily known as a place where users can get technical support with QIIME 2 for no charge. Developers of QIIME 2 moderate the forum, and typically respond to technical support questions within a couple of business days. The forum is also a great place to discuss general topics in microbiome bioinformatics, or microbiome research methods generally. There are many active discussions on these topics on the forum. Keeping up with the discussions on the forum is a great way to learn about current topics in microbiome research methods. There’s also a free job board on the forum - you can use the forum to find jobs, or post your own job ads there to find employees who are well-versed in QIIME 2 and other bioinformatics tools. If you’re not already a member of the QIIME 2 Forum, you should consider joining. It’s a great way for you to get help, and as you develop your QIIME 2 skills helping others on the forum is a great way to reenforce your learning and to get involved in the community.
diff --git a/14-tool-glossary.Rmd b/14-tool-glossary.Rmd
index d4c66495..b2a58c19 100644
--- a/14-tool-glossary.Rmd
+++ b/14-tool-glossary.Rmd
@@ -71,7 +71,7 @@ Get started at www.cancermodels.org to browse and query models by cancer type
 ## CTAT
-The Trinity Cancer Transcriptome Analysis Toolkit (CTAT, https://github.com/NCIP/Trinity_CTAT/wiki) provides a diverse collection of tools to gain insights into the biology of cancer through the lens of the transcriptome. Using RNA-seq as input, CTAT modules enable detection of mutations, fusion transcripts, copy number aberrations, cancer-specific splicing aberrations, and oncogenic viruses including insertions into the human genome. CTAT uses both read mapping and de novo assembly methods to analyze RNA-seq, leveraging tumor bulk and single cell transcriptomes. CTAT modules provide interactive visualizations as outputs, are easily installed for local execution or run via cloud computing (eg. Terra), have detailed user guides and tutorials, and are well-supported through user forums.
+The Trinity Cancer Transcriptome Analysis Toolkit, CTAT https://github.com/NCIP/Trinity_CTAT/wiki provides a diverse collection of tools to gain insights into the biology of cancer through the lens of the transcriptome. Using RNA-seq as input, CTAT modules enable detection of mutations, fusion transcripts, copy number aberrations, cancer-specific splicing aberrations, and oncogenic viruses including insertions into the human genome. CTAT uses both read mapping and de novo assembly methods to analyze RNA-seq, leveraging tumor bulk and single cell transcriptomes. CTAT modules provide interactive visualizations as outputs, are easily installed for local execution or run via cloud computing (eg. Terra), have detailed user guides and tutorials, and are well-supported through user forums.
 ## DeepPhe