docs: ✨ Initial draft of functions to extract osdc population #71

Merged
merged 58 commits into from
May 2, 2024
4b228bd
docs: :sparkles: init overview of functions to create osdc population
signekb Mar 17, 2024
0f6856d
docs: alternative visualisation of functions used for extracting diab…
signekb Mar 17, 2024
abdb477
docs: flow chart expanding on exclusion based on pregnancy window
signekb Mar 17, 2024
1593294
Merge branch 'main' into docs/general-functionality-flow
signekb Mar 17, 2024
7d24d9c
docs: simplify functionality flow of d population to only include fun…
signekb Mar 22, 2024
2969292
docs: :sparkles: initialise functionality flow post
signekb Mar 22, 2024
d1dea76
docs: :art: update overview section w. hyperlinks
signekb Apr 3, 2024
0237ba1
docs: :lipstick: simplify and add colours to inclusion and exclusion …
signekb Apr 3, 2024
287a3d9
docs: update extracting the diabetes population section
signekb Apr 3, 2024
2de0d12
docs: init hba1c tests section
signekb Apr 3, 2024
ffa2dc2
docs: init diagnosis section
signekb Apr 3, 2024
b2a67e6
docs: init podiatrist services section
signekb Apr 3, 2024
128ce5c
docs: init gld purchases section
signekb Apr 3, 2024
2649919
docs: init excl events during pregnancy windows section
signekb Apr 3, 2024
c241be2
docs: init gld drugts for weight loss section
signekb Apr 3, 2024
ed742b7
docs: init metformin purchases for women below 40 section
signekb Apr 3, 2024
700efaf
docs: init extract diagnosis classification date section
signekb Apr 3, 2024
6fcd10a
style: remove old comment and white spaces
signekb Apr 3, 2024
7443a70
Merge branch 'main' into docs/functionality-flow-diabetes-population
signekb Apr 3, 2024
46cb5ef
docs: :fire: remove old/irrelevant figures
signekb Apr 3, 2024
ca809c2
fix: remove all mentions of "before index date" since these filters a…
signekb Apr 12, 2024
2d02224
fix: typo
signekb Apr 12, 2024
23268cf
docs: add that gld purchases are only from 1997 and onwards
signekb Apr 12, 2024
445a118
docs: add TODO note to remember to update whether function will be re…
signekb Apr 12, 2024
56cc88f
docs: remove comment on adding arguments to functions
signekb Apr 12, 2024
505ac73
Merge branch 'main' into docs/functionality-flow-diabetes-population
signekb Apr 12, 2024
9b62a7d
style: :lipstick: format text
signekb Apr 17, 2024
4d222cd
fix: minor text edits and change "classification date" to "inclusion …
signekb Apr 17, 2024
0807c62
docs: add the notion of "raw" vs "stable" inclusion dates to text and…
signekb Apr 17, 2024
2c7ab62
style: format text
signekb Apr 17, 2024
5df493d
docs: :zap: add english translations of register names
signekb Apr 25, 2024
5000d85
Merged origin/main into docs/functionality-flow-diabetes-population
lwjohnst86 Apr 26, 2024
f52b394
docs: apply suggestions from review
lwjohnst86 Apr 26, 2024
9603f76
docs: remove redundant information and link instead to the design doc
lwjohnst86 Apr 26, 2024
221a637
Merge branch 'docs/functionality-flow-diabetes-population' of https:/…
lwjohnst86 Apr 26, 2024
2d6181d
docs: simplify the headers
lwjohnst86 Apr 26, 2024
ac9bf8a
docs: use `()` to indicate object is function, small renaming of func…
lwjohnst86 Apr 26, 2024
024e758
chore: :wrench: set tabsize to 2 for R and qmd files
lwjohnst86 Apr 27, 2024
6fcbaf1
build: :bug: the newest version of PlantUML has an error with buildin…
lwjohnst86 Apr 27, 2024
084485c
docs: :memo: update puml file to use flowchart instead and to include…
lwjohnst86 Apr 27, 2024
aaccf54
docs: :memo: include some guidelines for writing functions
lwjohnst86 Apr 27, 2024
d650c22
docs: :memo: revise function flow doc to match flowchart
lwjohnst86 Apr 27, 2024
49a6610
chore: :truck: rename to function-flow
lwjohnst86 Apr 27, 2024
67029ca
chore: :memo: regenerate puml images
lwjohnst86 Apr 27, 2024
ded4c3d
docs: forgot to change this name
lwjohnst86 Apr 27, 2024
109bbfb
Merged origin/main into docs/functionality-flow-diabetes-population
lwjohnst86 Apr 27, 2024
fee94d4
docs: apply suggestions from review
lwjohnst86 May 2, 2024
30a7881
docs: :art: add introduction section
signekb May 2, 2024
76e2e7d
docs: :art: collect initial and final diagnosis date in one function
signekb May 2, 2024
4614171
Update vignettes/function-flow.Rmd
signekb May 2, 2024
3896ee3
docs: :art: format Rmd
signekb May 2, 2024
baf89c7
docs: add todo item on adding which register get_potential_pcos relie…
signekb May 2, 2024
dc5d771
fix: add "_status" to the name of the "classify_diabetes" function
signekb May 2, 2024
3b7fcd0
docs: :memo: add`_join` helper functions to naming
signekb May 2, 2024
902ef41
docs: update the function flow with the join of lpr2 and 3 data sources
signekb May 2, 2024
6f619dd
docs: remove backticks from package name
signekb May 2, 2024
394f60e
Apply suggestions from code review
signekb May 2, 2024
52535e9
docs: update png based on review feedback
signekb May 2, 2024
5 changes: 3 additions & 2 deletions .vscode/settings.json
Expand Up @@ -23,8 +23,9 @@
"r.session.levelOfObjectDetail": "Detailed",
"r.session.data.rowLimit": 1000,
"r.plot.useHttpgd": true,
- "[r]": {
- "editor.defaultFormatter": "REditorSupport.r"
+ "[r,quarto]": {
+ "editor.defaultFormatter": "REditorSupport.r",
+ "editor.tabSize": 2,
},
"cSpell.language": "en-GB",
"cSpell.words": [
4 changes: 2 additions & 2 deletions justfile
Expand Up @@ -3,8 +3,8 @@

# Generate PNG images from all PlantUML files
generate-puml-all:
- docker run --rm -v $(pwd):/puml -w /puml ghcr.io/plantuml/plantuml:latest -tpng "**/*.puml"
+ docker run --rm -v $(pwd):/puml -w /puml ghcr.io/plantuml/plantuml:1.2024.3 -tpng "**/*.puml"

# Generate PNG image from specific PlantUML file
generate-puml name:
- docker run --rm -v $(pwd):/puml -w /puml ghcr.io/plantuml/plantuml:latest -tpng "**/{{name}}.puml"
+ docker run --rm -v $(pwd):/puml -w /puml ghcr.io/plantuml/plantuml:1.2024.3 -tpng "**/{{name}}.puml"
178 changes: 178 additions & 0 deletions vignettes/function-flow.Rmd
@@ -0,0 +1,178 @@
---
title: "Function flow"
output: rmarkdown::html_vignette
bibliography: references.bib
csl: vancouver.csl
vignette: >
%\VignetteIndexEntry{Function flow}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---

## Introduction

This vignette describes the function conventions and function flow of
the osdc package. The function conventions section covers how we name
functions and how we structure their input and output. The function
flow section describes the functions within the package, both internal
and user-facing, which data sources they rely on, and how they are
connected to each other. First, the functions for classifying diabetes
status are presented, followed by the functions for classifying the
diabetes type.

## Function conventions

The conventions below are *ideals* only, to be used as guidelines to
help with development and understanding of the code. They are not hard
rules.

### Naming

- First word is an action verb, later words are objects or conditions.
- Exclusion criteria are prefixed with `exclude_`.
- Inclusion criteria are prefixed with `include_`.
- Helpers that get or extract a condition (e.g., "pregnancy" or "date
of visit") are prefixed with `get_`.
- Helpers that drop or keep a specific condition are prefixed with
`drop_` or `keep_` (e.g., "first visit date to maternal care for
pregnancy after 40 weeks"). These types of helpers are likely
contained in the `get_` functions.
- Helpers that join registers or output of other functions are
prefixed with `join_`.

### Input and output

- Few arguments, with one or two core required arguments.
- `include_` functions take a register as the first argument.
- One input register database at a time.
- `exclude_` functions can take a register as the first argument or
take the output from an `include_` function.
- Second argument can be the output data from another function.
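
As a rough sketch of these conventions (not the package's actual
implementation), an `include_` function can take a single register as
its first argument, while an `exclude_` function takes the output of an
`include_` function first and the output of another function second. The
function names `include_example()` and `exclude_example()`, the columns
`pnr`, `date`, and `value`, and the cut-off of 48 are all hypothetical.

```r
# Hypothetical register with one row per event.
register <- data.frame(
  pnr = c("a", "a", "b", "c"),
  date = as.Date(c("2020-01-01", "2020-06-01", "2020-03-01", "2020-04-01")),
  value = c(50, 40, 60, 55)
)

# `include_` function: one register as the first (and only required) argument.
include_example <- function(register) {
  register[register$value > 48, , drop = FALSE]
}

# `exclude_` function: output of an `include_` function first,
# output of another function second.
exclude_example <- function(included, to_exclude) {
  included[!included$pnr %in% to_exclude$pnr, , drop = FALSE]
}

# Hypothetical helper output listing the individuals to exclude.
to_exclude <- data.frame(pnr = "c")

included <- include_example(register)
kept <- exclude_example(included, to_exclude)
```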

## Function flow

The OSDC algorithm, and thereby the osdc package, contains one main
function, `classify_diabetes_status()`, which identifies individuals
with diabetes (either type 1 or type 2) using the Danish registers
described in the `vignette("design")`. All data sources are used as
input for this function. The specific inclusion and exclusion details
are also described in the `vignette("design")`.

This results in the function flow for classifying diabetes status seen
below. All functions take a `data.frame` type object as input and
output an object of the same type as the input. For instance, if the
input is a `data.table` object, the output will also be a `data.table`.
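
One way to check this property for a single function is to compare the
class of the output with the class of the input. The sketch below does
this for a plain `data.frame` only; the helper `keep_positive()` and its
columns are hypothetical, and other classes such as `data.table` would
need to be checked separately.

```r
# Hypothetical filtering helper. Row subsetting a data.frame with `[`
# returns a data.frame, so no explicit conversion is needed here.
keep_positive <- function(data) {
  data[data$value > 0, , drop = FALSE]
}

input <- data.frame(pnr = c("a", "b", "c"), value = c(1, -1, 2))
output <- keep_positive(input)

# The output should have the same class as the input.
same_class <- identical(class(output), class(input))
```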

![Flow of functions, as well as their required input registers, for
classifying diabetes status using the `osdc` package. Light blue and
orange boxes represent filtering functions (inclusion and exclusion
events, respectively). Uncoloured boxes are helper functions that get or
extract a condition or join data or function
outputs.](images/function-flow.png)

## Inclusion events

### HbA1c tests above 48 mmol/mol

The function `include_hba1c()` uses `lab_forsker` as the input data to
extract all events of tests above 48 mmol/mol.
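
A minimal sketch of such a filter is shown here. It is not the package's
implementation: the column names `pnr`, `date`, and `value` are
hypothetical, "above 48" is read as strictly greater than 48, and the
step of selecting only HbA1c analyses from the laboratory data is left
out.

```r
# Keep test results strictly above the threshold (in mmol/mol).
# Missing values are treated as not above the threshold.
include_hba1c <- function(lab_forsker, threshold = 48) {
  above <- !is.na(lab_forsker$value) & lab_forsker$value > threshold
  lab_forsker[above, c("pnr", "date"), drop = FALSE]
}

# Hypothetical laboratory data.
lab_forsker <- data.frame(
  pnr = c("a", "a", "b", "c"),
  date = as.Date(c("2015-02-01", "2016-02-01", "2015-05-01", "2015-07-01")),
  value = c(47, 52, 48, NA)
)

hba1c_events <- include_hba1c(lab_forsker)
```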

<!-- TODO: Add details on how this filtering should be done -->

### Hospital diagnosis of diabetes

The function `include_diabetes_diagnoses()` uses the hospital contacts
from LPR2 and 3 to include all dates of diabetes diagnoses. Diabetes
diagnoses from both ICD 8 and ICD 10 are included.

This function contains two helper functions:

- `keep_diabetes_icd10()`
- `keep_diabetes_icd8()`
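
The two helpers could be composed roughly as sketched below. The code
patterns are placeholders only: E10 to E14 is the diabetes mellitus
block in ICD-10 and 250 is the diabetes mellitus category in ICD-8, but
which codes osdc actually includes is still an open TODO. The helper
`keep_codes()`, the columns `pnr`, `date`, and `code`, and the
assumption that codes are stored without a prefix are all hypothetical.

```r
# Hypothetical generic helper: keep rows whose code matches a pattern.
keep_codes <- function(diagnoses, pattern) {
  diagnoses[grepl(pattern, diagnoses$code), , drop = FALSE]
}

# Placeholder patterns, not necessarily the codes used by osdc.
keep_diabetes_icd10 <- function(diagnoses) keep_codes(diagnoses, "^E1[0-4]")
keep_diabetes_icd8 <- function(diagnoses) keep_codes(diagnoses, "^250")

# Combine the rows kept by both helpers.
include_diabetes_diagnoses <- function(diagnoses) {
  rbind(keep_diabetes_icd10(diagnoses), keep_diabetes_icd8(diagnoses))
}

# Hypothetical diagnosis data from the joined hospital contacts.
diagnoses <- data.frame(
  pnr = c("a", "b", "c", "d"),
  date = as.Date(c("1990-01-01", "2005-01-01", "2010-01-01", "2012-01-01")),
  code = c("25000", "E119", "I219", "E109")
)

diabetes_diagnoses <- include_diabetes_diagnoses(diagnoses)
```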

<!-- TODO: Add details on how this filtering should be done, e.g., diagnosis codes -->

<!-- TODO: Which specific ICD 8 and 10 codes are included? -->

### Diabetes-specific podiatrist services

The function `include_podiatrist_services()` uses `sysi` or `sssy` as
input to extract the dates of all diabetes-specific podiatrist services.

<!-- TODO: Add details on how this filtering should be done -->

### GLD purchases

The function `include_gld_purchases()` uses `lmdb` to extract the dates
of all GLD purchases (from 1997 onwards).
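
As an illustration only, the filter could look like the sketch below. It
assumes that GLD purchases are identified by ATC codes starting with
`A10` (the ATC group for drugs used in diabetes) and that `lmdb` has the
hypothetical columns `pnr`, `date`, and `atc`. The exact definition used
by osdc is not given here.

```r
# Keep purchases of drugs in ATC group A10 from the start date onwards.
include_gld_purchases <- function(lmdb, from = as.Date("1997-01-01")) {
  is_gld <- grepl("^A10", lmdb$atc)
  lmdb[is_gld & lmdb$date >= from, c("pnr", "date", "atc"), drop = FALSE]
}

# Hypothetical prescription data.
lmdb <- data.frame(
  pnr = c("a", "b", "c", "d"),
  date = as.Date(c("1996-06-01", "1998-01-01", "2000-01-01", "2001-01-01")),
  atc = c("A10BA02", "A10BA02", "C10AA01", "A10AE04")
)

gld_purchases <- include_gld_purchases(lmdb)
```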

<!-- TODO: Add details on how this filtering should be done -->

<!-- TODO: Add this + link to resource "For details about this, see [link]." -->

## Exclusion events

### HbA1c tests and GLD purchases during pregnancy

The function `exclude_pregnancy()` uses diagnoses from LPR2 or LPR3 as
input and is used to exclude both HbA1c tests and GLD purchases during
pregnancy.

Internally, this relies on the function `get_pregnancy_dates()` that
contains the following three helper functions:

- `calculate_pregnancy_index_date_for_mc_visits_wo_end_date()` (this
might be removed with the inclusion of the birth register)
Comment on lines +127 to +128
I don't really know what this does.

- `get_pregnancy_end_dates()`: Keep maternal care visits with an end
date and drop visits between 40 weeks before end date and 12 weeks
after end date.
- `get_maternal_care_visit_dates_without_end_date()`: Uses the output
from `get_pregnancy_end_dates()` which identifies maternal care
visits *with* end dates to derive maternal care visits *without* end
dates.
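
The exclusion step itself can be sketched as follows, assuming that the
same window (40 weeks before to 12 weeks after a pregnancy end date) is
applied to HbA1c tests and GLD purchases. The argument names and the
columns `pnr`, `date`, and `end_date` are hypothetical, and deriving the
end dates themselves is not shown.

```r
# Drop events that fall within a pregnancy window around an end date
# for the same individual.
exclude_pregnancy <- function(events, pregnancy_dates,
                              weeks_before = 40, weeks_after = 12) {
  in_window <- vapply(seq_len(nrow(events)), function(i) {
    ends <- pregnancy_dates$end_date[pregnancy_dates$pnr == events$pnr[i]]
    any(events$date[i] >= ends - 7 * weeks_before &
      events$date[i] <= ends + 7 * weeks_after)
  }, logical(1))
  events[!in_window, , drop = FALSE]
}

# Hypothetical events (e.g. HbA1c tests) and pregnancy end dates.
events <- data.frame(
  pnr = c("a", "a", "b"),
  date = as.Date(c("2010-01-01", "2012-01-01", "2010-01-01"))
)
pregnancy_dates <- data.frame(
  pnr = "a",
  end_date = as.Date("2010-03-01")
)

events_outside_pregnancy <- exclude_pregnancy(events, pregnancy_dates)
```

Individuals without any pregnancy end date keep all of their events,
since no window applies to them.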

<!-- TODO: What is done with the mc visits without end dates then? -->

<!-- TODO: Add details on how this filtering should be done -->

### Glucose-lowering brand drugs for weight loss

The function `exclude_purchases_of_weight_loss_drugs()` uses REGISTER as
input and excludes BRANDS.

<!-- TODO: Add details on how this filtering should be done -->

<!-- TODO: Add data source and which brands are excluded -->

### Metformin purchases for women below age 40

The function `exclude_potential_pcos()` excludes all purchases of
metformin by women below age 40 (i.e., \<= 39 years old) at the date of
purchase. It relies on REGISTER as input.

This function contains two helper functions:

- `keep_women()`
- `drop_age_40_below()`
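
A rough sketch of the exclusion logic is given below, written as a
single function rather than with the two helpers. It assumes that
metformin is identified by the ATC code `A10BA02` and that sex and birth
date come from a population table with the hypothetical columns `pnr`,
`sex` (coded `"F"` for women), and `birth_date`. The helper `age_at()`
is also hypothetical.

```r
# Age in whole years at `date` for someone born on `birth_date`.
age_at <- function(birth_date, date) {
  b <- as.POSIXlt(birth_date)
  d <- as.POSIXlt(date)
  had_birthday <- d$mon > b$mon | (d$mon == b$mon & d$mday >= b$mday)
  d$year - b$year - as.integer(!had_birthday)
}

# Drop metformin purchases made by women below age 40 at purchase.
# Purchases without a match in `bef` are dropped by the inner join.
exclude_potential_pcos <- function(gld_purchases, bef) {
  joined <- merge(gld_purchases, bef, by = "pnr")
  potential_pcos <- joined$atc == "A10BA02" &
    joined$sex == "F" &
    age_at(joined$birth_date, joined$date) < 40
  joined[!potential_pcos, names(gld_purchases), drop = FALSE]
}

# Hypothetical purchases and population data.
gld_purchases <- data.frame(
  pnr = c("a", "b", "c", "a"),
  date = as.Date(c("2010-06-01", "2010-06-01", "2010-06-01", "2011-01-01")),
  atc = c("A10BA02", "A10BA02", "A10BA02", "A10AE04")
)
bef <- data.frame(
  pnr = c("a", "b", "c"),
  sex = c("F", "F", "M"),
  birth_date = as.Date(c("1980-01-01", "1960-01-01", "1985-01-01"))
)

purchases_kept <- exclude_potential_pcos(gld_purchases, bef)
```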

<!-- TODO: Add details on how this filtering should be done -->

<!-- TODO: Add which register this uses -->

## Get diagnosis date

The function `get_diagnosis_date()` combines the outputs from the
inclusion and exclusion functions to get the final diagnosis date.
Initially, it drops the first inclusion and exclusion events from the
function outputs with the helper `drop_first_event()`, so that only
those with two or more events are kept. This is then used to assign an
initial diagnosis according to OSDC. Then, all the outputs are joined
together with `join_diagnosis_dates()`.

Finally, the dates outside of the data coverage period are dropped with
`drop_diagnosis_dates_outside_coverage()` to end with a final diagnosis
date. For details on this censoring based on periods with insufficient
data coverage, see the `vignette("algorithm-logic")`.
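
The first step can be sketched as below. This only illustrates dropping
each individual's first event and taking the earliest remaining date.
The joining and coverage steps are not shown, and the helper
`get_second_event_date()` and the columns `pnr` and `date` are
hypothetical.

```r
# Drop each individual's earliest event, so that only individuals with
# two or more events remain.
drop_first_event <- function(events) {
  events <- events[order(events$pnr, events$date), , drop = FALSE]
  events[duplicated(events$pnr), , drop = FALSE]
}

# Hypothetical helper: earliest remaining event per individual.
get_second_event_date <- function(events) {
  remaining <- drop_first_event(events)
  remaining[!duplicated(remaining$pnr), c("pnr", "date"), drop = FALSE]
}

# Hypothetical combined inclusion events.
events <- data.frame(
  pnr = c("a", "a", "a", "b", "c", "c"),
  date = as.Date(c(
    "2012-01-01", "2010-01-01", "2011-01-01",
    "2010-05-01", "2013-01-01", "2013-06-01"
  ))
)

second_event_dates <- get_second_event_date(events)
```
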
Binary file added vignettes/images/function-flow.png
86 changes: 86 additions & 0 deletions vignettes/images/function-flow.puml
@@ -0,0 +1,86 @@
@startuml function-flow
!theme cerulean-outline
<style>
action {
FontColor black
}
database {
FontColor black
}
.inclusion {
BackgroundColor lightblue
}
.exclusion {
BackgroundColor orange
}
</style>

hide <<inclusion>> stereotype
hide <<exclusion>> stereotype

card classify_diabetes_status() as cd {
together {
database sssy
database sysi
database lpr_diag
database lpr_adm
database lmdb
database lab_forsker
database kontakter
database diagnoser
database bef
}

action "get_pregnancy_dates()" as pregnancy
action "get_potential_pcos()" as pcos
action "get_diagnosis_date()" as diagnosis_date
action "join_lpr2()" as lpr2
action "join_lpr3()" as lpr3

together {
action "exclude_pregnancy()" as ex_pregnancy <<exclusion>>
action "exclude_purchases_of_weight_loss_drugs()" as ex_wld <<exclusion>>
action "exclude_potential_pcos()" as ex_pcos <<exclusion>>
}

together {
action "include_hba1c()" as in_hba1c <<inclusion>>
action "include_diabetes_diagnosis()" as in_diagnosis <<inclusion>>
action "include_podiatrist_services()" as in_podiatrist <<inclusion>>
action "include_purchases_gld()" as in_gld <<inclusion>>
}

lpr_diag --> lpr2
lpr_adm --> lpr2
kontakter --> lpr3
diagnoser --> lpr3

lab_forsker --> in_hba1c
in_hba1c --> ex_pregnancy

lpr2 --> pregnancy
lpr3 --> pregnancy
pregnancy -> ex_pregnancy

lpr2 --> in_diagnosis
lpr3 --> in_diagnosis

sssy --> in_podiatrist
sysi --> in_podiatrist

lmdb --> in_gld
in_gld --> ex_pregnancy
in_gld --> ex_wld

bef --> pcos
in_gld --> ex_pcos
pcos --> ex_pcos

ex_wld --> diagnosis_date
ex_pregnancy --> diagnosis_date
ex_pcos --> diagnosis_date
in_podiatrist --> diagnosis_date
in_diagnosis --> diagnosis_date

}
@enduml