Commit

deploy: 2b30f56
enryH committed May 30, 2024
1 parent e45eed1 commit d87f153
Showing 5 changed files with 110 additions and 3 deletions.
Binary file modified .doctrees/environment.pickle
Binary file not shown.
Binary file modified .doctrees/index.doctree
Binary file not shown.
2 changes: 1 addition & 1 deletion _sources/index.rst
@@ -3,7 +3,7 @@ Claster
Modeling nascent RNA transcription from chromatin landscape and structure


.. include:: ../Readme.md
.. include:: Readme.md
:parser: myst_parser.sphinx_
:start-line: 3
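
The two include targets in this hunk, `../Readme.md` and `Readme.md`, point at different files, because docutils resolves a relative include path against the directory of the document that contains the directive. A minimal sketch of that resolution follows. It assumes a hypothetical layout in which `index.rst` sits in a `docs/` folder; the real source layout is not shown in this diff.

```python
import posixpath


def resolve_include(document: str, target: str) -> str:
    """Resolve a relative include target against the including document's directory."""
    base = posixpath.dirname(document)
    return posixpath.normpath(posixpath.join(base, target))


# "docs/index.rst" is an assumed location, used only for illustration.
print(resolve_include("docs/index.rst", "../Readme.md"))
print(resolve_include("docs/index.rst", "Readme.md"))
```

With that layout, the first target resolves one level above the documentation folder and the second resolves next to `index.rst`, so the change only works if a `Readme.md` is available at the matching location when the docs are built.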

109 changes: 108 additions & 1 deletion index.html
@@ -356,7 +356,9 @@
</button>
`);
</script>

<label class="sidebar-toggle secondary-toggle btn btn-sm" for="__secondary" title="Toggle secondary sidebar" data-bs-placement="bottom" data-bs-toggle="tooltip">
<span class="fa-solid fa-list"></span>
</label>
</div></div>

</div>
@@ -372,6 +374,22 @@ <h1>Claster</h1>
<div id="print-main-content">
<div id="jb-print-toc">

<div>
<h2> Contents </h2>
</div>
<nav aria-label="Page">
<ul class="visible nav section-nav flex-column">
<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#abstract">Abstract</a></li>
<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#in-this-repository">In this repository</a><ul class="nav section-nav flex-column">
<li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link" href="#configurations"><code class="docutils literal notranslate"><span class="pre">configurations</span></code></a></li>
<li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link" href="#images"><code class="docutils literal notranslate"><span class="pre">images</span></code></a></li>
<li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link" href="#inputs"><code class="docutils literal notranslate"><span class="pre">inputs</span></code></a></li>
<li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link" href="#scripts"><code class="docutils literal notranslate"><span class="pre">scripts</span></code></a></li>
<li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link" href="#targets"><code class="docutils literal notranslate"><span class="pre">targets</span></code></a></li>
</ul>
</li>
</ul>
</nav>
</div>
</div>
</div>
@@ -384,6 +402,72 @@ <h1>Claster</h1>
<section id="claster">
<h1>Claster<a class="headerlink" href="#claster" title="Link to this heading">#</a></h1>
<p>Modeling nascent RNA transcription from chromatin landscape and structure</p>
<section id="abstract">
<h2>Abstract<a class="headerlink" href="#abstract" title="Link to this heading">#</a></h2>
<p><em>Different cell types and their associated functionalities can emerge from a single genomic sequence when certain regions are expressed while others remain silenced. The study of gene regulation and its potential malfunctioning in different cellular contexts is hence pivotal to understanding both development and disease. We present the Chromatin Landscape and Structure to Expression Regressor (CLASTER), an epigenetic-based deep neural network that can integrate different data modalities describing the chromatin landscape and its 3D structure in their raw format. CLASTER effectively translates them into nascent transcription levels measured by EU-seq at a kilobasepair resolution. Our predictions reached a Pearson correlation with targets above r=0.86 at both bin and gene levels, without relying on DNA sequence or explicitly extracted chromatin features. The model mostly used the information found within 10 kbp of the predicted locus to perform the predictions, even when a wide genomic region of 1 Mbp was available. Explicit modeling of long-range interactions using multi-headed attention and high-resolution chromatin contact maps had little impact on model performance, despite the model correctly identifying elements in these inputs that influence nascent transcription. The trained model then served as a platform to predict the transcriptional impact of simulated epigenetic silencing perturbations. Our results point towards a rather local, integrative and combinatorial paradigm of gene regulation, where changes in the chromatin environment surrounding a gene shape its context-specific transcription. We conclude that the predominant locality and limitations of current machine learning approaches might emerge as a genuine signature of genomic organization, having broad implications for future modeling approaches.</em></p>
<p><img alt="Claster image" src="https://raw.githubusercontent.com/RasmussenLab/CLASTER/master/images/Claster_image.png" /></p>
<p><strong>CLASTER overview.</strong> CLASTER integrates the chromatin landscape (accessibility, promoter and enhancer activity, and chromatin silencing) and structure (Micro-C) to predict nascent transcription levels measured by EU-seq.</p>
</section>
<section id="in-this-repository">
<h2>In this repository<a class="headerlink" href="#in-this-repository" title="Link to this heading">#</a></h2>
<p>This repository contains the files and scripts required to reproduce the results of the paper and a short tutorial. The repository consists of the following folders:</p>
<section id="configurations">
<h3><code class="docutils literal notranslate"><span class="pre">configurations</span></code><a class="headerlink" href="#configurations" title="Link to this heading">#</a></h3>
<ul class="simple">
<li><p>Configuration files (.yaml) required to build different flavours of CLASTER.</p></li>
</ul>
</section>
<section id="images">
<h3><code class="docutils literal notranslate"><span class="pre">images</span></code><a class="headerlink" href="#images" title="Link to this heading">#</a></h3>
<ul class="simple">
<li><p>Overview of CLASTER’s architecture.</p></li>
</ul>
</section>
<section id="inputs">
<h3><code class="docutils literal notranslate"><span class="pre">inputs</span></code><a class="headerlink" href="#inputs" title="Link to this heading">#</a></h3>
<p>The folder contains the test set inputs for both data modalities, i.e. samples covering regions of 1 Mbp centered at the TSS of protein-coding genes on chr4 (in mice). They are used in the tutorial to show how to train and validate CLASTER.</p>
</section>
<section id="scripts">
<h3><code class="docutils literal notranslate"><span class="pre">scripts</span></code><a class="headerlink" href="#scripts" title="Link to this heading">#</a></h3>
<ul class="simple">
<li><p><a class="reference external" href="https://github.com/RasmussenLab/CLASTER/blob/master/scripts/0_Tutorial.ipynb"><code class="docutils literal notranslate"><span class="pre">0_Tutorial.ipynb</span></code></a>: The notebook provides a rapid overview of the most important steps in CLASTER’s pipeline, including training and validating the network using the EIR framework.</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">1_Data_obtention.ipynb</span></code>: This notebook guides the user through the data acquisition process, including:</p>
<ul>
<li><p>Data download from publicly available repositories:</p>
<ul>
<li><p>Inputs: Chromatin landscape (ATAC-seq, H3K4me3, H3K27ac and H3K27me3 in mESCs) and structure (Micro-C maps in mESCs)</p></li>
<li><p>Outputs: Nascent transcription profiles (EU-seq).</p></li>
<li><p>Reference genome and gene annotations.</p></li>
<li><p>Enhancer-Like Signatures (ELS).</p></li>
</ul>
</li>
<li><p>Data filtering and preprocessing:</p>
<ul>
<li><p>Obtain NumPy arrays for the inputs.</p></li>
<li><p>Obtain CSV files for the targets.</p></li>
</ul>
</li>
</ul>
</li>
<li><p><code class="docutils literal notranslate"><span class="pre">2_Run_CLASTER.ipynb</span></code>: This notebook creates the configuration files required to train and test CLASTER using the EIR framework.</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">2b_Run_HyenaDNA_and_Enformer.ipynb</span></code>: This notebook contains our adaptations of the code that builds:</p>
<ul>
<li><p>Hyena-DNA (https://github.com/HazyResearch/hyena-dna) in its public Colab version.</p></li>
<li><p>Enformer (https://github.com/lucidrains/enformer-pytorch) in its Python implementation.
Both models were used to benchmark CLASTER. The notebook includes:</p></li>
<li><p>Extraction of sequence embeddings from both models’ backbones after loading the pretrained weights.</p></li>
<li><p>The addition of a model head on top of the embeddings to match our regression outputs.</p></li>
<li><p>Code to fine-tune Hyena-DNA’s backbone and the added head together.</p></li>
</ul>
</li>
<li><p><code class="docutils literal notranslate"><span class="pre">3_Data_analysis.ipynb</span></code>: The notebook contains the functions used to perform the data analysis and create the figures included in the manuscript.</p></li>
</ul>
</section>
<section id="targets">
<h3><code class="docutils literal notranslate"><span class="pre">targets</span></code><a class="headerlink" href="#targets" title="Link to this heading">#</a></h3>
<p>The folder contains the target EU-seq profiles matching the input (test) samples.</p>
</section>
</section>
<div class="toctree-wrapper compound">
<p aria-level="2" class="caption" role="heading"><span class="caption-text">Tutorial:</span></p>
<ul>
@@ -459,6 +543,29 @@ <h1>Claster<a class="headerlink" href="#claster" title="Link to this heading">#<



<div class="bd-sidebar-secondary bd-toc"><div class="sidebar-secondary-items sidebar-secondary__inner">


<div class="sidebar-secondary-item">
<div class="page-toc tocsection onthispage">
<i class="fa-solid fa-list"></i> Contents
</div>
<nav class="bd-toc-nav page-toc">
<ul class="visible nav section-nav flex-column">
<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#abstract">Abstract</a></li>
<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#in-this-repository">In this repository</a><ul class="nav section-nav flex-column">
<li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link" href="#configurations"><code class="docutils literal notranslate"><span class="pre">configurations</span></code></a></li>
<li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link" href="#images"><code class="docutils literal notranslate"><span class="pre">images</span></code></a></li>
<li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link" href="#inputs"><code class="docutils literal notranslate"><span class="pre">inputs</span></code></a></li>
<li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link" href="#scripts"><code class="docutils literal notranslate"><span class="pre">scripts</span></code></a></li>
<li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link" href="#targets"><code class="docutils literal notranslate"><span class="pre">targets</span></code></a></li>
</ul>
</li>
</ul>
</nav></div>

</div></div>


</div>
<footer class="bd-footer-content">
2 changes: 1 addition & 1 deletion searchindex.js

Large diffs are not rendered by default.
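
The abstract added to index.html reports a Pearson correlation above r=0.86 at both bin and gene levels. The sketch below illustrates one way the two levels could be computed. The per-gene averaging, the function names and the toy numbers are all assumptions made for illustration; the diff does not say how the gene-level values were derived.

```python
import math
from collections import defaultdict


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def per_gene_mean(values, gene_ids):
    """Average bin values per gene (assumed aggregation, not from the source)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for value, gene in zip(values, gene_ids):
        sums[gene] += value
        counts[gene] += 1
    return [sums[g] / counts[g] for g in sorted(sums)]


# Toy data: six bins belonging to three hypothetical genes.
predicted = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
observed = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
genes = ["a", "a", "b", "b", "c", "c"]

bin_r = pearson(predicted, observed)
gene_r = pearson(per_gene_mean(predicted, genes), per_gene_mean(observed, genes))
print(bin_r, gene_r)
```

Bin-level correlation compares every bin directly, while the gene-level variant first collapses bins into one value per gene, so the two numbers can differ on real data even though they coincide for this perfectly linear toy example.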
