Commit

render html docs
Maxim Moinat committed Oct 6, 2021
1 parent 05615a8 commit 5ca9190
Showing 2 changed files with 24 additions and 21 deletions.
39 changes: 21 additions & 18 deletions docs/RabbitInAHat.html
@@ -392,13 +392,13 @@ <h2>Process Overview</h2>
<li>Save Rabbit-In-a-Hat work and export to a MS Word document.</li>
</ol>
</div>
<div id="installation-and-support" class="section level2">
<h2>Installation and support</h2>
<p>Rabbit-In-a-Hat comes with WhiteRabbit; refer to steps 1 and 2 of <a href="WhiteRabbit.html#installation">WhiteRabbit’s installation section</a>.</p>
</div>
<div id="installation-and-support" class="section level1">
<h1>Installation and support</h1>
<p>Rabbit-In-a-Hat comes with WhiteRabbit; refer to steps 1 and 2 of <a href="WhiteRabbit.html#installation">WhiteRabbit’s installation section</a>.</p>
</div>
<div id="getting-started" class="section level1">
<h1>Getting Started</h1>
<div id="using-the-application-functions" class="section level1">
<h1>Using the application functions</h1>
<div id="creating-a-new-document" class="section level2">
<h2>Creating a New Document</h2>
<p>To create a new document, navigate to <em>File –&gt; Open Scan Report</em>. Use the “Open” window to browse for the scan document created by WhiteRabbit. When a scan report is opened, the tables scanned will appear in orange boxes on the “Source” side of the Tables.</p>
@@ -429,7 +429,7 @@ <h2>Loading in a Custom CDM</h2>
<div id="stem-table" class="section level2">
<h2>Stem table</h2>
<p>In some cases a source domain maps to multiple OMOP CDM target domains, for example lab values that map to both the measurement and observation domains. Using the stem table removes some of the overhead of repeating the mapping for every target and also eases implementation (see below).</p>
<p>The idea of the stem table is that it contains all the types of columns that you need regardless of the CDM table the data ultimately ends up in. There is a pre-specified map from stem to all CDM clinical event tables, linking every stem field to one or multiple fields in the CDM. When implementing the ETL, the vocabulary decides where a particular row mapped to stem table ultimately goes. The <a href="https://github.com/OHDSI/CommonDataModel/wiki/Data-Model-Conventions#content-of-each-table">OMOP CDM Data Model Conventions</a> mentions:</p>
<p>The idea of the stem table is that it contains all the types of columns that you need regardless of the CDM table the data ultimately ends up in. There is a pre-specified map from stem to all CDM clinical event tables, linking every stem field to one or multiple fields in the CDM. When implementing the ETL, the vocabulary decides where a particular row mapped to stem table ultimately goes. The <a href="https://ohdsi.github.io/CommonDataModel/dataModelConventions.html">OMOP CDM Data Model Conventions</a> mentions:</p>
<blockquote>
<p>Write the data record into the table(s) corresponding to the domain of the Standard CONCEPT_ID(s).</p>
</blockquote>
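<p>The routing rule quoted above can be sketched in a few lines of Python. This is a minimal illustration, not part of Rabbit-In-a-Hat: the function name <code>route_stem_row</code>, the dictionary layout of a stem row, the simplified domain-to-table map and the fallback to <em>observation</em> for unknown domains are all assumptions made for the example.</p>
<pre class="python"><code># Simplified, assumed mapping from a standard concept's domain to a CDM event table.
DOMAIN_TO_TABLE = {
    "Condition": "condition_occurrence",
    "Drug": "drug_exposure",
    "Procedure": "procedure_occurrence",
    "Measurement": "measurement",
    "Observation": "observation",
    "Device": "device_exposure",
}


def route_stem_row(stem_row, concept_domains):
    """Return the name of the CDM table that one stem row should be written to.

    stem_row: dict with at least a "concept_id" key (hypothetical layout).
    concept_domains: dict mapping concept_id to domain_id, as it would be
    looked up in the vocabulary's CONCEPT table.
    """
    domain = concept_domains.get(stem_row["concept_id"])
    # Falling back to observation for unknown domains is an assumption of this sketch.
    return DOMAIN_TO_TABLE.get(domain, "observation")
</code></pre>
<p>In a real ETL the same decision is usually made in SQL by joining the stem table to the vocabulary; the dictionary lookup here only stands in for that join.</p>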
@@ -444,10 +444,9 @@ <h2>Concept id hints (<em>v0.9.0</em>)</h2>
<p><img src="images/riah_concept_id_hints.png" /></p>
<p>The concept id hints are stored statically in <a href="https://github.com/OHDSI/WhiteRabbit/blob/master/rabbitinahat/src/main/resources/org/ohdsi/rabbitInAHat/dataModel/CDMConceptIDHints_v5.0_MAR-18.csv">a csv file</a> and are not automatically updated. The <a href="https://github.com/OHDSI/WhiteRabbit/blob/master/rabbitinahat/src/main/resources/org/ohdsi/rabbitInAHat/dataModel/concept_id_hint_select.sql">code used to create the aforementioned csv file</a> is also included in the repo.</p>
</div>
</div>
<div id="table-to-table-mappings" class="section level1">
<h1>Table to Table Mappings</h1>
<p>It is assumed that the owners of the source data can provide detail on what each data table contains. Rabbit-In-a-Hat will describe the columns within the table but will not provide the context a data owner should provide. For the CDM tables, if more information is needed, navigate to the <a href="https://github.com/ohdsi/commondatamodel/wiki">OMOP CDM wiki</a> and review the current OMOP specification.</p>
<div id="table-to-table-mappings" class="section level2">
<h2>Table to Table Mappings</h2>
<p>It is assumed that the owners of the source data can provide detail on what each data table contains. Rabbit-In-a-Hat will describe the columns within the table but will not provide the context a data owner should provide. For the CDM tables, if more information is needed, navigate to the <a href="https://ohdsi.github.io/CommonDataModel/index.html">OMOP CDM documentation</a> and review the current OMOP specification.</p>
<p>To connect a source table to a CDM table, simply hover over the source table until an arrow head appears.</p>
<p><img src="images/rabbitinahat-drugclaims.png" /></p>
<p>Use your mouse to grab the arrow head and drag it to the corresponding CDM table. In the example below, the <em>drug_claims</em> data will provide information for the <em>drug_exposure</em> table.</p>
@@ -456,26 +455,28 @@ <h1>Table to Table Mappings</h1>
<p><img src="images/rabbitinahat-arrow.png" /></p>
<p>Continue this process until all tables that are needed to build a CDM are mapped to their corresponding CDM tables. One source table can map to multiple CDM tables and one CDM table can receive multiple mappings. There may be tables in the source data that should not be mapped into the CDM and there may be tables in the CDM that cannot be populated from the source data.</p>
</div>
<div id="field-to-field-mappings" class="section level1">
<h1>Field to Field Mappings</h1>
<div id="field-to-field-mappings" class="section level2">
<h2>Field to Field Mappings</h2>
<p>Double clicking an arrow that connects a source table and a CDM table opens a <em>Fields</em> pane below the selected arrow. The <em>Fields</em> pane lists all fields of the source table and the CDM table and is where you make the specific column mappings between the two. Hovering over a source field will generate an arrow head that can then be selected and dragged to its corresponding CDM field. For example, in the <em>drug_claims</em> to <em>drug_exposure</em> table mapping example, the source data owners know that <em>patient_id</em> is the patient identifier and corresponds to the <em>CDM.person_id</em>. Also, just as before, the arrow can be selected and <em>Logic</em> and <em>Comments</em> can be added.</p>
<p><img src="images/rabbitinahat-fields.png" /></p>
<p>If you select the source table orange box, Rabbit-In-a-Hat will expose values the source data has for that table. This is meant to help in understanding the source data and what logic may be required to handle the data in the ETL. In the example below <em>ndcnum</em> is selected and raw NDC codes are displayed starting with the most frequent (note that in the WhiteRabbit scan a “Min cell count” could have been selected and values with a count smaller than that will not show).</p>
<p><img src="images/rabbitinahat-fieldex.png" /></p>
<p>Continue this process until all source columns necessary in all mapped tables have been mapped to the corresponding CDM column. Not all columns must be mapped into a CDM column and not all CDM columns require a mapping. One source column may supply information to multiple CDM columns and one CDM column can receive information from multiple columns.</p>
</div>
<div id="generating-an-etl-document" class="section level1">
<h1>Generating an ETL Document</h1>
<div id="output-generation" class="section level2">
<h2>Output generation</h2>
<div id="generating-an-etl-document" class="section level3">
<h3>Generating an ETL Document</h3>
<p>To generate an ETL MS Word document use <em>File –&gt; Generate ETL document –&gt; Generate ETL Word document</em> and select a location to save. The ETL document can also be exported to markdown or html. In this case, a file per target table is created and you will be prompted to select a folder. Regardless of the format, the generated document will contain all mappings and notes from Rabbit-In-a-Hat.</p>
<p>Once the information is in the document, if an update is needed you must either update the information in Rabbit-In-a-Hat and regenerate the document or update the document. If you make changes in the document, Rabbit-In-a-Hat will not read those changes and update the information in the tool. However, it is common to generate the document with the core mapping information and fill in more detail within the document.</p>
<p>Once the document is completed, it should be shared with the individuals who plan to implement the code to execute the ETL. The markdown and html formats enable easy publishing as a web page on e.g. GitHub. A good example is the <a href="https://ohdsi.github.io/ETL-Synthea/">Synthea ETL documentation</a>.</p>
</div>
<div id="generating-a-testing-framework" class="section level1">
<h1>Generating a Testing Framework</h1>
<div id="generating-a-testing-framework" class="section level3">
<h3>Generating a Testing Framework</h3>
<p>To make sure the ETL process is working as specified, it is highly recommended to create <a href="https://en.wikipedia.org/wiki/Unit_testing">unit tests</a> that evaluate the behavior of the ETL process. To efficiently create a set of unit tests, Rabbit-In-a-Hat can <a href="riah_test_framework.html">generate a testing framework</a>.</p>
</div>
<div id="generating-a-sql-skeleton-v0.9.0" class="section level1">
<h1>Generating a SQL Skeleton (<em>v0.9.0</em>)</h1>
<div id="generating-a-sql-skeleton-v0.9.0" class="section level3">
<h3>Generating a SQL Skeleton (<em>v0.9.0</em>)</h3>
<p>The step after documenting your ETL process is to implement it in an ETL framework of your choice. As many implementations involve SQL, Rabbit-In-a-Hat provides a convenience function to export your design to an SQL skeleton. This contains all field to field mappings, with logic/descriptions as comments, as non-functional pseudo-code. This saves you copying names into your SQL code, but still requires you to implement the actual logic. The general format of the skeleton is:</p>
<pre class="sql"><code>INSERT INTO &lt;target_table&gt; (
&lt;target_fields&gt;
@@ -485,6 +486,8 @@ <h1>Generating a SQL Skeleton (<em>v0.9.0</em>)</h1>
FROM &lt;source_table&gt;
;</code></pre>
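<p>To make the general format above concrete, the following Python sketch builds a statement of the same overall shape from a set of field mappings. It is not the code Rabbit-In-a-Hat uses for its export, and the exact layout of the real skeleton may differ; the function name <code>render_sql_skeleton</code> and its arguments are hypothetical. The <em>drug_claims</em> to <em>drug_exposure</em> mapping of <em>patient_id</em> to <em>person_id</em> is taken from the example earlier on this page.</p>
<pre class="python"><code>def render_sql_skeleton(source_table, target_table, field_map):
    """Build an INSERT ... SELECT statement from target-to-source field pairs.

    field_map: dict of target field name to source field name (hypothetical input).
    """
    target_fields = ",\n".join("    " + target for target in field_map)
    select_fields = ",\n".join(
        "    {0}.{1} AS {2}".format(source_table, source, target)
        for target, source in field_map.items()
    )
    return "INSERT INTO {0} (\n{1}\n)\nSELECT\n{2}\nFROM {3}\n;".format(
        target_table, target_fields, select_fields, source_table
    )


# One mapped field: drug_claims.patient_id feeds drug_exposure.person_id.
print(render_sql_skeleton("drug_claims", "drug_exposure", {"person_id": "patient_id"}))
</code></pre>
<p>As with the exported skeleton, the result only wires field names together; any transformation logic still has to be written by hand.</p>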
</div>
</div>
</div>



6 changes: 3 additions & 3 deletions docs/WhiteRabbit.html
@@ -406,7 +406,7 @@ <h2>Installation</h2>
See <a href="#running-from-the-command-line">Running from the command line</a> for details on how to run from the command line instead.</li>
<li>Go to <a href="#using-the-application-functions">Using the Application Functions</a> for detailed instructions on how to make a scan of your data.</li>
</ol>
<p>Note: on releases earlier than version 0.8.0, open the respective WhiteRabbit.jar or RabbitInAHat.jar files instead.</p>
<p>Note: on releases earlier than version 0.8.0, open the respective WhiteRabbit.jar or RabbitInAHat.jar files instead. Note: WhiteRabbit and Rabbit-In-a-Hat only work from a path that contains only ASCII characters.</p>
<div id="memory" class="section level3">
<h3>Memory</h3>
<p>WhiteRabbit may fail to start when the memory allocated by the JVM is too big or too small. By default this is set to 1200m. To increase the memory (in this example to 2400m), either set the environment variable <code>EXTRA_JVM_ARGUMENTS=-Xmx2400m</code> before starting or edit the line <code>%JAVACMD% %JAVA_OPTS% -Xmx2400m...</code> in <code>bin/WhiteRabbit.bat</code>. To lower the memory, set one of these to e.g. <code>-Xmx600m</code>. If you have a 32-bit Java VM installed and problems persist, consider installing 64-bit Java.</p>
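<p>If WhiteRabbit is started from a script, the environment variable can be set for that one process only. The following is a minimal Python sketch under that assumption; the helper name <code>jvm_env</code> is hypothetical, and only the variable name <code>EXTRA_JVM_ARGUMENTS</code> and the launcher path <code>bin/WhiteRabbit.bat</code> come from the text above.</p>
<pre class="python"><code>import os


def jvm_env(max_heap_mb, base_env=None):
    """Return a copy of the environment with EXTRA_JVM_ARGUMENTS set to the given maximum heap size."""
    env = dict(os.environ if base_env is None else base_env)
    env["EXTRA_JVM_ARGUMENTS"] = "-Xmx{0}m".format(max_heap_mb)
    return env


# Example on Windows (not run here):
# import subprocess
# subprocess.run(["bin/WhiteRabbit.bat"], env=jvm_env(2400))
</code></pre>
<p>Because the function copies the environment instead of modifying it, other programs started from the same session keep their own settings.</p>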
@@ -420,7 +420,7 @@ <h2>Support</h2>
</div>
</div>
<div id="using-the-application-functions" class="section level1">
<h1>Using the Application Functions</h1>
<h1>Using the application functions</h1>
<div id="specifying-the-location-of-source-data" class="section level2">
<h2>Specifying the Location of Source Data</h2>
<p><img src="images/whiterabbitscreenshot_v0.10.1.PNG" /></p>
@@ -482,7 +482,7 @@ <h4>SQL Server</h4>
<div id="postgresql" class="section level4">
<h4>PostgreSQL</h4>
<ul>
<li><em><strong>Server location:</strong></em> this field contains the host name and database name (<host>/<database>)</li>
<li><em><strong>Server location:</strong></em> this field contains the host name and database name (<code>&lt;host&gt;/&lt;database&gt;</code>). You can also specify the port (ex: <code>&lt;host&gt;:&lt;port&gt;/&lt;database&gt;</code>), which defaults to 5432.</li>
<li><em><strong>User name:</strong></em> name of the user used to log into the server</li>
<li><em><strong>Password:</strong></em> password for the supplied user name</li>
<li><em><strong>Database name:</strong></em> this field contains the schema containing the tables</li>