diff --git a/.gitignore b/.gitignore index 31e10b3..eb473dc 100644 --- a/.gitignore +++ b/.gitignore @@ -13,6 +13,7 @@ __pycache__/ # Tests and coverage /data/ +ehrapy_data/ /node_modules/ # docs diff --git a/docs/api.md b/docs/api.md index 4eb165f..27d8289 100644 --- a/docs/api.md +++ b/docs/api.md @@ -4,7 +4,6 @@ ```{eval-rst} .. module:: ehrdata -.. currentmodule:: ehrdata .. autosummary:: :toctree: generated @@ -22,11 +21,9 @@ :toctree: generated io.omop.load - io.omop.extract_tables io.omop.extract_person io.omop.extract_observation_period io.omop.extract_measurement - io.omop.time_interval_table io.omop.extract_observation io.omop.extract_procedure_occurrence io.omop.extract_specimen diff --git a/docs/conf.py b/docs/conf.py index dd1206e..6f3a600 100644 --- a/docs/conf.py +++ b/docs/conf.py @@ -124,8 +124,9 @@ pygments_style = "default" +# If building the documentation fails because of a missing link that is outside your control, +# you can add an exception to this list: nitpick_ignore = [ - # If building the documentation fails because of a missing link that is outside your control, - # you can add an exception to this list. - # ("py:class", "igraph.Graph"), + # https://github.com/duckdb/duckdb-web/issues/3806 + ("py:class", "duckdb.duckdb.DuckDBPyConnection"), ] diff --git a/docs/notebooks/omop_tables_tutorial.ipynb b/docs/notebooks/omop_tables_tutorial.ipynb index 8c5c3d3..3d7a2a3 100644 --- a/docs/notebooks/omop_tables_tutorial.ipynb +++ b/docs/notebooks/omop_tables_tutorial.ipynb @@ -42,12 +42,12 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "### Introduction to Vocabularies\n", + "## Introduction to Vocabularies\n", "\n", "\n", "We explain these along the first three tables of the OMOP CDM.\n", "\n", - "#### 0.1 Concept\n", + "### 0.1 Concept\n", "\n", "Purpose: Clinical events in OMOP are expressed as concepts, the fundamental building block of data records. For this, OMOP gathers concepts from many existing vocabularies, such as WHO's [ICD10](https://www.icd-code.de/) and [SNOMED](https://www.snomed.org/). There are many concepts in the OMOP CDM; the concepts that are actually used for a specific dataset are listed in this table of the database.\n", "\n", @@ -185,7 +185,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "#### 0.2 Concept Relationship\n", + "### 0.2 Concept Relationship\n", "Any two concepts can have a relationship between each other. The most common two relationships are \"Maps to\" and \"Maps from\", where a non-standard concept from the source database is mapped to a standard concept in the CDM." ] }, @@ -278,7 +278,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "#### 0.3 Concept Ancestry\n", + "### 0.3 Concept Ancestry\n", "(is built automatically from the concept relationship table if there are is a relationships. Not sure if should include..?)" ] }, @@ -286,7 +286,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "#### Internal Reference Tables\n", + "### Internal Reference Tables\n", "There are tables DOMAIN, VOCABULARY, CONCEPT_CLASS, RELATIONSHIP; these tables duplicate the fields already in CONCEPT and CONCEPT_RELATIONSHIP, and can provide more information with an additional *_NAME field.\n", "\n", "We here omit them, as they can at any stage be created from the latter two tables." @@ -296,7 +296,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "#### 1. Person\n", + "### 1. Person\n", "\n", "- Purpose: Contains demographic information about each patient.\n", "- Key Fields: person_id, gender_concept_id, year_of_birth, race_concept_id, ethnicity_concept_id\n", @@ -405,7 +405,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "#### 2. Observation Period\n", + "### 2. Observation Period\n", "Purpose: Defines periods of time during which the patient’s data is considered reliable and available.\n", "\n", "OMOP CDM: \"This table contains records which define spans of time during which two conditions are expected to hold: (i) Clinical Events that happened to the Person are recorded in the Event tables, and (ii) absence of records indicate such Events did not occur during this span of time.\"\n", @@ -483,7 +483,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "#### 3. Visit Occurrence\n", + "### 3. Visit Occurrence\n", "\n", "\"This table contains Events where Persons engage with the healthcare system for a duration of time. They are often also called “Encounters”. Visits are defined by a configuration of circumstances under which they occur, such as (i) whether the patient comes to a healthcare institution, the other way around, or the interaction is remote, (ii) whether and what kind of trained medical staff is delivering the service during the Visit, and (iii) whether the Visit is transient or for a longer period involving a stay in bed.\"\n", "\n", @@ -623,7 +623,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "#### 4. Visit Detail (OPTIONAL)\n", + "### 4. Visit Detail (OPTIONAL)\n", "- Purpose: More details on visit, such as movement between units in an inpatient stay. There can be 0 or more entries in visit_detail per entry in visit_occurrence.\n", "- Key Fields: visit_detail_id, person_id, visit_detail_concept_id, visit_detail_start_date, visit_detail_end_date\n" ] @@ -905,7 +905,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "#### 5. Drug Exposure\n", + "### 5. Drug Exposure\n", "\n", "\"The purpose of records in this table is to indicate an exposure to a certain drug as best as possible. In this context a drug is defined as an active ingredient. Drug Exposures are defined by Concepts from the Drug domain, which form a complex hierarchy. As a result, one DRUG_SOURCE_CONCEPT_ID may map to multiple standard concept ids if it is a combination product.\"\n", "\n", @@ -1059,7 +1059,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "#### 6. Procedure Occurrence\n", + "### 6. Procedure Occurrence\n", "\n", "\"This table contains records of activities or processes ordered by, or carried out by, a healthcare provider on the patient with a diagnostic or therapeutic purpose.\"\n", "\n", @@ -1184,7 +1184,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "#### 7. Device Exposure\n", + "### 7. Device Exposure\n", "\"The Device domain captures information about a person’s exposure to a foreign physical object or instrument which is used for diagnostic or therapeutic purposes through a mechanism beyond chemical action. Devices include implantable objects (e.g. pacemakers, stents, artificial joints), medical equipment and supplies (e.g. bandages, crutches, syringes), other instruments used in medical procedures (e.g. sutures, defibrillators) and material used in clinical care (e.g. adhesives, body material, dental material, surgical material).\"\n", "\n", "- Key Fields: device_exposure_id, person_id, device_concept_id, device_exposure_start_date, device_concept_type_id" @@ -1313,7 +1313,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "#### 7. Measurement\n", + "### 7. Measurement\n", "\n", "- Purpose: Captures clinical measurements or laboratory test results.\n", "- Key Fields: measurement_id, person_id, measurement_concept_id, measurement_date, value_as_number" @@ -1465,7 +1465,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "#### 8. Observation\n", + "### 8. Observation\n", "\n", "\"Observations differ from Measurements in that they do not require a standardized test or some other activity to generate clinical fact. Typical observations are medical history, family history, the stated need for certain treatment, social circumstances, lifestyle choices, healthcare utilization patterns, etc. If the generation clinical facts requires a standardized testing such as lab testing or imaging and leads to a standardized result, the data item is recorded in the MEASUREMENT table.\"\n", "\n", @@ -1605,7 +1605,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "#### 9. Death\n", + "### 9. Death\n", "- Purpose: Captures information related to patient death.\n", "- Key Fields: person_id, death_date, death_type_concept_id, cause_concept_id" ] @@ -1697,7 +1697,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "#### 10. Note\n", + "### 10. Note\n", "- Purpose: Contains unstructured clinical notes.\n", "- Key Fields: note_id, person_id, note_date, note_text" ] @@ -1768,7 +1768,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "#### 13. Note_NLP\n", + "### 13. Note_NLP\n", "- Purpose: Encodes all output of NLP on clinical notes. Each row represents a single extracted term from a note.\n", "- Key Fields: note_nlp_id, note_id, lexical_variant, note_nlp_concept_id" ] @@ -1839,7 +1839,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "#### 14. Specimen\n", + "### 14. Specimen\n", "The specimen domain contains the records identifying biological samples from a person.\n", "\n", "- Purpose:\n", diff --git a/docs/notebooks/tutorial_ehrdata_omop.ipynb b/docs/notebooks/tutorial_ehrdata_omop.ipynb index 3599855..022fcc1 100644 --- a/docs/notebooks/tutorial_ehrdata_omop.ipynb +++ b/docs/notebooks/tutorial_ehrdata_omop.ipynb @@ -626,7 +626,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "##### Interlude - Irregularly sampled time series data\n", + "#### Interlude - Irregularly sampled time series data\n", "Electronic health records can be regarded as (that is, form a model of a person via) irregular sampling irregularly sampled time series.\n", "\n", "Following notation and explanation from [Horn et al.](https://proceedings.mlr.press/v119/horn20a.html), a time series of a patient can be described as a set of tuples (t, z, m), where t denotes the time, z the observed value, and m a modality description of the measurement.\n", @@ -1114,7 +1114,7 @@ ], "metadata": { "kernelspec": { - "display_name": "ehrapy_venv_july", + "display_name": "Python 3", "language": "python", "name": "python3" }, @@ -1128,7 +1128,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.9.19" + "version": "3.11.7" } }, "nbformat": 4, diff --git a/pyproject.toml b/pyproject.toml index 9b75188..4a5e484 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -36,6 +36,7 @@ optional-dependencies.dev = [ ] optional-dependencies.doc = [ "docutils>=0.8,!=0.18.*,!=0.19.*", + "ehrapy[lamin]", "ipykernel", "ipython", "myst-nb>=1.1", @@ -57,6 +58,7 @@ optional-dependencies.lamin = [ "bionty", "lamindb", "omop", + "rich", ] optional-dependencies.test = [ "coverage", @@ -75,7 +77,7 @@ installer = "uv" features = [ "dev" ] [tool.hatch.envs.docs] -extra-features = [ "doc" ] +features = [ "doc" ] scripts.build = "sphinx-build -M html docs docs/_build {args}" scripts.open = "python -m webbrowser -t docs/_build/html/index.html" scripts.clean = "git clean -fdX -- {args:docs}"