333 c02 link delayed discharges episodes #639

Merged Jun 2, 2023 (50 commits)
Changes from 36 commits
Commits
9adec47
initial rough work on delay discharge
lizihao-anu Apr 18, 2023
8e5ba80
Update documentation
lizihao-anu Apr 18, 2023
c021189
some conversion from SPSS
lizihao-anu Apr 19, 2023
a3b6d05
Style code
lizihao-anu Apr 19, 2023
d18f061
a function of adding delay discharge to episode data
lizihao-anu Apr 26, 2023
f3ba46f
Style code
lizihao-anu Apr 26, 2023
5d00df4
Update R/add_dd.R
lizihao-anu Apr 26, 2023
75017d9
Update R/add_dd.R
lizihao-anu Apr 26, 2023
9575755
Update R/add_dd.R
lizihao-anu Apr 26, 2023
65cc28f
Merge branch 'main-R' into 333-c02-link-delayed-discharges-episodes
lizihao-anu Apr 26, 2023
352d4a8
add_dd functions
lizihao-anu May 2, 2023
7eac522
Style code
lizihao-anu May 2, 2023
cc5caf0
remove duplicated rows when many to many inner join
lizihao-anu May 3, 2023
d17a0d4
Style code
lizihao-anu May 3, 2023
c8cb968
fix missing %>%
lizihao-anu May 3, 2023
040f2e3
Update documentation
lizihao-anu May 3, 2023
2f0d251
Style code
lizihao-anu May 3, 2023
625470e
Merge branch 'main-R' into 333-c02-link-delayed-discharges-episodes
lizihao-anu May 3, 2023
badcea7
assign 1APE cij_end_date to keydate2_dd
lizihao-anu May 3, 2023
bff4092
Merge branch '333-c02-link-delayed-discharges-episodes' of github.com…
lizihao-anu May 3, 2023
bd0fab5
Style code
lizihao-anu May 3, 2023
174bdfa
Merge branch 'main-R' into 333-c02-link-delayed-discharges-episodes
lizihao-anu May 3, 2023
3787e97
corporate add_dd to run_episode_file
lizihao-anu May 9, 2023
05fda77
Style code
lizihao-anu May 9, 2023
7748218
Merge branch 'main-R' into 333-c02-link-delayed-discharges-episodes
lizihao-anu May 16, 2023
56a700a
[check-spelling] Update metadata
Moohan May 17, 2023
acf960e
Update R/add_dd.R
lizihao-anu May 23, 2023
e0991b9
Merge branch 'main-R' into 333-c02-link-delayed-discharges-episodes
lizihao-anu May 23, 2023
4b5f70f
select the correct lines for delayed discharge
lizihao-anu May 23, 2023
eb39b23
Merge branch '333-c02-link-delayed-discharges-episodes' of github.com…
lizihao-anu May 23, 2023
da8449b
Style code
lizihao-anu May 23, 2023
42809fa
add_dd lca
lizihao-anu May 23, 2023
623f68f
add_dd fix
lizihao-anu May 23, 2023
92bd011
Style code
lizihao-anu May 23, 2023
2f1f0f8
Merge branch 'main-R' into 333-c02-link-delayed-discharges-episodes
Moohan May 24, 2023
f435172
Merge branch 'main-R' into 333-c02-link-delayed-discharges-episodes
Jennit07 May 29, 2023
4e7c07c
Update R/add_dd.R
lizihao-anu May 30, 2023
d70b893
remove unnecessary clarity x$ y$
lizihao-anu May 30, 2023
cd3f684
Merge branch 'main-R' into 333-c02-link-delayed-discharges-episodes
Jennit07 Jun 1, 2023
c43243f
Add `.data$` where needed
Moohan Jun 2, 2023
5a97d23
Add quotes in the rename
Moohan Jun 2, 2023
52854f2
Merge branch 'main-R' into 333-c02-link-delayed-discharges-episodes
Moohan Jun 2, 2023
a23dd7a
Lint - Make integers explicit
Moohan Jun 2, 2023
c172f13
Lint - add `.data$` where relevant
Moohan Jun 2, 2023
2d5f827
Use `case_match` instead of `case_when`
Moohan Jun 2, 2023
0bf55da
Rename `add_dd()` to `link_delayed_discharge_eps()`
Moohan Jun 2, 2023
f189ea3
Rename `add_dd.R` to `link_delayed_discharge_eps.R`
Moohan Jun 2, 2023
0e9f0ac
Update the documentation for `last_date_month`
Moohan Jun 2, 2023
0ec7ec1
Add tests for `last_date_month`
Moohan Jun 2, 2023
c11b3e1
Merge branch 'main-R' into 333-c02-link-delayed-discharges-episodes
Moohan Jun 2, 2023
5 changes: 5 additions & 0 deletions .github/actions/spelling/expect.txt
@@ -1,6 +1,7 @@
Accom
admloc
admtype
ADPE
adtf
arrivalmode
arth
@@ -34,6 +35,7 @@ createslf
dataframe
datamart
datazone
datediff
dateformat
dateop
datetime
@@ -82,6 +84,7 @@ hbtreatname
hci
HCP
HHG
hhg
hjust
hms
homecare
@@ -172,6 +175,7 @@ smr
SMRA
smrtype
SPARRA
sparra
spd
SPSS
spss
@@ -200,6 +204,7 @@ vline
xintercept
xlsx
yearstay
YYYYQX
zihao
zsav
zstd
2 changes: 2 additions & 0 deletions NAMESPACE
@@ -1,6 +1,7 @@
# Generated by roxygen2: do not edit by hand

export("%>%")
export(add_dd)
export(add_nsu_cohort)
export(add_ppa_flag)
export(check_year_format)
@@ -73,6 +74,7 @@ export(get_year_dir)
export(is_date_in_fyyear)
export(is_missing)
export(la_code_lookup)
export(last_date_month)
export(latest_cost_year)
export(latest_update)
export(midpoint_fy)
314 changes: 314 additions & 0 deletions R/add_dd.R
@@ -0,0 +1,314 @@
#' Add delayed discharges to the working file
#'
#' @param data The input data frame of episodes
#' @param year The year being processed
#'
#' @return A data frame containing the episode data with the linked
#' delayed discharge records appended as extra rows
#' @export
#'
#' @family episode file
add_dd <- function(data, year) {
year_param <- year

data <- data %>%
dplyr::mutate(
# helper columns for the join; cij_end_date itself is left unchanged and
# the helper columns are dropped again before the final bind_rows()
cij_start_date_lower = .data$cij_start_date - lubridate::days(1),
cij_end_date_upper = cij_end_date + lubridate::days(1),
cij_end_month = last_date_month(cij_end_date),
is_dummy_cij_start = is.na(cij_start_date) & !is.na(cij_end_date),
dummy_cij_start = dplyr::if_else(
is_dummy_cij_start,
lubridate::as_date("1900-01-01"),
cij_start_date_lower
),
is_dummy_cij_end = !is.na(cij_start_date) & is.na(cij_end_date),
dummy_cij_end = dplyr::if_else(
is_dummy_cij_end,
lubridate::today(),
cij_end_month
)
)

## handling DD ----
# no flag for last reported
dd_data <-
read_file(get_source_extract_path(year_param, "DD")) %>%
dplyr::rename(
record_keydate1 = keydate1_dateformat,
record_keydate2 = keydate2_dateformat
) %>%
dplyr::mutate(
# dummy_keydate2 is only used for the join; record_keydate2 and
# amended_dates are left unchanged
is_dummy_keydate2 = is.na(record_keydate2),
dummy_keydate2 = dplyr::if_else(is_dummy_keydate2,
lubridate::today(),
record_keydate2
),
dummy_id = dplyr::row_number()
)

by_dd <- dplyr::join_by(
chi,
x$record_keydate1 >= y$dummy_cij_start,
x$dummy_keydate2 <= y$dummy_cij_end
)
data <- dd_data %>%
dplyr::inner_join(data,
by = by_dd,
suffix = c("_dd", "")
) %>%
dplyr::arrange(cij_start_date, cij_end_date, cij_marker, postcode) %>%
# remove duplicate rows; some duplicate mismatches remain at this point
# and are removed further down
dplyr::distinct(
chi,
cij_start_date,
cij_end_date,
cij_marker,
record_keydate1_dd,
record_keydate2_dd,
.keep_all = TRUE
) %>%
# determine DD quality
dplyr::mutate(
dd_type = dplyr::if_else(
is.na(cij_marker),
"no-cij",
dplyr::case_when(
# "1" "Accurate Match - (1)"
# "1P" "Accurate Match (allowing +-1 day) - (1P)"
# "1A" "Accurate Match (has an assumed end date) - (1A)"
# "1AP" "Accurate Match (allowing +-1 day and has an assumed end date) - (1AP)"
# "2" "Starts in CIJ - (2)"
# "2D" "Starts in CIJ (ends one day after) - (2D)"
# "2DP" "Starts in CIJ (allowing +-1 day and ends one day after) - (2DP)"
# "2A" "Starts in CIJ (Accurate Match after correcting assumed end date) - (2A)"
# "2AP" "Starts in CIJ (Accurate Match (allowing +-1 day) after correcting assumed end date) - (2AP)"
# "3" "Ends in CIJ - (3)"
# "3D" "Ends in CIJ (starts one day before) - (3D)"
# "3DP" "Ends in CIJ (allowing +-1 day and starts one day before) - (3DP)"
# "4" "Matches unended MH record - (4)"
# "4P" "Matches unended MH record (allowing -1 day) - (4P)"
# "-" "No Match (We don't keep these)".

# Conditions on record_keydate2_dd implicitly require is_dummy_keydate2
# to be FALSE: a comparison with a missing date is NA, which case_when()
# treats as not matched.
# The DD files keep records with a missing keydate2 only for
# recid 04B (mental health) and drop them for every other recid,
# so dummy_keydate2 should only matter for the "4" types.

# "1" "Accurate Match - (1)"
record_keydate1_dd >= cij_start_date &
record_keydate2_dd <= cij_end_date &
!amended_dates ~ "1",

# "1P" "Accurate Match (allowing +-1 day) - (1P)"
record_keydate1_dd >= cij_start_date_lower &
record_keydate2_dd <= cij_end_date_upper &
!amended_dates ~ "1P",

# "1A" "Accurate Match (has an assumed end date) - (1A)"
record_keydate1_dd >= cij_start_date &
record_keydate2_dd <= cij_end_date &
amended_dates ~ "1A",

# "1AP" "Accurate Match (allowing +-1 day and has an assumed end date) - (1AP)"
record_keydate1_dd >= cij_start_date_lower &
record_keydate2_dd <= cij_end_date_upper &
amended_dates ~ "1AP",

# "1APE" the CIJ ends during the month but the delay has an end date of the end of the month
record_keydate1_dd >= cij_start_date_lower &
record_keydate2_dd == cij_end_month &
amended_dates ~ "1APE",

# "2" "Starts in CIJ - (2)"
record_keydate1_dd >= cij_start_date &
record_keydate1_dd <= cij_end_date &
record_keydate2_dd > cij_end_date &
!amended_dates ~ "2",

# "2D" "Starts in CIJ (ends one day after) - (2D)"
record_keydate1_dd >= cij_start_date &
record_keydate1_dd <= cij_end_date &
record_keydate2_dd > cij_end_date_upper &
!amended_dates ~ "2D",

# "2DP" "Starts in CIJ (allowing +-1 day and ends one day after) - (2DP)"
record_keydate1_dd >= cij_start_date_lower &
record_keydate1_dd <= cij_end_date_upper &
record_keydate2_dd > cij_end_date_upper &
!amended_dates ~ "2DP",

# "2A" "Starts in CIJ (Accurate Match after correcting assumed end date) - (2A)"
record_keydate1_dd >= cij_start_date &
record_keydate1_dd <= cij_end_date &
record_keydate2_dd > cij_end_date &
amended_dates ~ "2A",

# "2AP" "Starts in CIJ (Accurate Match (allowing +-1 day) after correcting assumed end date) - (2AP)"
record_keydate1_dd >= cij_start_date_lower &
record_keydate1_dd <= cij_end_date_upper &
record_keydate2_dd > cij_end_date_upper &
# record_keydate2_dd == cij_end_month &
amended_dates ~ "2AP",

# "3" "Ends in CIJ - (3)"
record_keydate1_dd <= cij_start_date &
record_keydate2_dd >= cij_start_date &
record_keydate2_dd <= cij_end_date &
!amended_dates ~ "3",

# "3D" "Ends in CIJ (starts one day before) - (3D)"
record_keydate1_dd <= cij_start_date_lower &
record_keydate2_dd >= cij_start_date &
record_keydate2_dd <= cij_end_date &
!amended_dates ~ "3D",

# "3DP" "Ends in CIJ (allowing +-1 day and starts one day before) - (3DP)"
record_keydate1_dd <= cij_start_date_lower &
record_keydate2_dd >= cij_start_date_lower &
record_keydate2_dd <= cij_end_date_upper &
!amended_dates ~ "3DP",

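# "3ADPE" as "3DP", but with an assumed end date that may run up to
# the end of the month in which the CIJ ends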
record_keydate1_dd <= cij_start_date_lower &
record_keydate2_dd >= cij_start_date_lower &
record_keydate2_dd <= cij_end_month &
amended_dates ~ "3ADPE",

# "4" "Matches unended MH record - (4)"
recid == "04B" &
record_keydate1_dd >= cij_start_date &
is_dummy_cij_end ~ "4",

# "4P" "Matches unended MH record (allowing -1 day) - (4P)"
recid == "04B" &
record_keydate1_dd >= cij_start_date_lower &
is_dummy_cij_end ~ "4P",

# "-" "No Match (We don't keep these)"
.default = "-"
)
),
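dd_type = factor(
dd_type,
levels = c(
"1",
"1P",
"1A",
"1AP",
"2",
"2D",
"2DP",
"2A",
"2AP",
"3",
"3D",
"3DP",
"1APE",
"3ADPE",
"4",
"4P",
# keep "no-cij" as a level: values missing from `levels` become NA
# and would be dropped by the filter on dd_type below
"no-cij",
"-"
)
),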

# For "1APE" and "3ADPE", use the CIJ end date as record_keydate2_dd
record_keydate2_dd = dplyr::if_else(
dd_type == "1APE" | dd_type == "3ADPE",
cij_end_date,
record_keydate2_dd
),
datediff_end = abs(cij_end_date - record_keydate2_dd),
datediff_start = cij_start_date - record_keydate1_dd
) %>%
dplyr::filter(dd_type != "-") %>%
dplyr::mutate(smrtype_dd = dplyr::case_when(
dd_type %in% c(
"1",
"1P",
"1A",
"1AP",
"1APE",
"2",
"2D",
"2DP",
"2A",
"2AP",
"3",
"3D",
"3DP",
"3ADPE",
"4",
"4P"
) ~ "DD-CIJ",
dd_type %in% c("no-cij") ~ "DD-No CIJ"
)) %>%
# remove duplicate rows produced by the many-to-many inner join,
# keeping the record that is closest to the CIJ record
dplyr::arrange(
chi,
original_admission_date,
record_keydate1_dd,
record_keydate2_dd,
dummy_id,
dd_type,
datediff_end, -datediff_start
) %>%
dplyr::distinct(postcode,
record_keydate1_dd,
record_keydate2_dd,
.keep_all = TRUE
) %>%
# tidy up and rename columns to match the format of episode files
dplyr::select(
"year" = "year_dd",
"recid" = "recid_dd",
"record_keydate1" = "record_keydate1_dd",
"record_keydate2" = "record_keydate2_dd",
"smrtype" = "smrtype_dd",
"chi",
"gender",
"dob",
"age",
"gpprac",
"postcode" = "postcode_dd",
"lca" = "dd_responsible_lca",
"hbtreatcode" = "hbtreatcode_dd",
"original_admission_date",
"amended_dates",
"delay_end_reason",
"primary_delay_reason",
"secondary_delay_reason",
"cij_marker",
"cij_start_date",
"cij_end_date",
"cij_pattype_code",
"cij_ipdc",
"cij_admtype",
"cij_adm_spec",
"cij_dis_spec",
"location",
"spec" = "spec_dd",
"dd_type"
) %>%
# combine DD with the episode data, dropping the helper columns
# that were only added for the join
dplyr::bind_rows(
data %>%
dplyr::select(
-c(
"cij_start_date_lower",
"cij_end_date_upper",
"cij_end_month",
"is_dummy_cij_start",
"dummy_cij_start",
"is_dummy_cij_end",
"dummy_cij_end"
)
)
)

return(data)
}
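
The linking step in `add_dd()` is an inequality join followed by a de-duplication that keeps the closest match. A minimal sketch of that pattern on made-up data, assuming dplyr 1.1.0 or later for `join_by()`. The tables `episodes` and `delays` and the columns `delay_start` and `delay_end` are hypothetical stand-ins, not the real extract columns, and the sketch leaves out the one-day tolerance and the dummy dates:

```r
library(dplyr)

# Hypothetical episode data: one CIJ per row, two overlapping CIJs for "A".
episodes <- tibble(
  chi = c("A", "A", "B"),
  cij_marker = c(1L, 2L, 1L),
  cij_start_date = as.Date(c("2023-01-01", "2023-01-03", "2023-03-01")),
  cij_end_date = as.Date(c("2023-01-20", "2023-01-31", "2023-03-15"))
)

# Hypothetical delayed discharge records.
delays <- tibble(
  chi = c("A", "B", "B"),
  delay_start = as.Date(c("2023-01-05", "2023-03-02", "2023-04-01")),
  delay_end = as.Date(c("2023-01-18", "2023-03-10", "2023-04-05"))
)

# Keep every (delay, CIJ) pair where the delay lies inside the CIJ.
# In join_by(), the left side of each comparison refers to `delays`
# and the right side to `episodes`.
linked <- delays %>%
  inner_join(
    episodes,
    by = join_by(
      chi,
      delay_start >= cij_start_date,
      delay_end <= cij_end_date
    )
  )

# A delay can match several CIJs. Sort so the CIJ whose end date is
# closest to the delay end comes first, then keep the first row per delay.
best <- linked %>%
  mutate(datediff_end = abs(as.numeric(cij_end_date - delay_end))) %>%
  arrange(chi, delay_start, delay_end, datediff_end) %>%
  distinct(chi, delay_start, delay_end, .keep_all = TRUE)

print(best)
```

Here the delay for "A" matches both of its CIJs, and the sort-then-`distinct()` step keeps the one ending two days after the delay rather than thirteen. The third delay has no containing CIJ, so the inner join drops it.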
16 changes: 16 additions & 0 deletions R/last_date_month.R
@@ -0,0 +1,16 @@
#' Return the last date of the month of the given date
#'
#' @description Return the last date of the month that contains the given date
#'
#' @param x a date, or vector of dates, of class `Date`
#'
#' @return a vector of dates, each the last day of the month of the
#' corresponding input date
#' @export
#'
#' @examples
#' last_date_month(lubridate::as_date("2020-02-05"))
#'
#' @family date functions
last_date_month <- function(x) {
return(lubridate::ceiling_date(x, "month") - lubridate::days(1))
}
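
A short check of `last_date_month()` on a few edge cases may help when reading the month-end logic in `add_dd()`. The function body is copied from the diff above; the input dates are made up for illustration. It relies on lubridate rounding `Date` objects that sit exactly on a month boundary up to the next month, which is my understanding of the documented default for `change_on_boundary`. Date-times at midnight on the first of a month are documented to stay on the boundary instead, so this helper is best used with `Date` input only.

```r
library(lubridate)

# Same definition as in R/last_date_month.R above.
last_date_month <- function(x) {
  return(ceiling_date(x, "month") - days(1))
}

# Leap-year February.
feb <- last_date_month(as_date("2020-02-05"))

# A Date on the first of the month stays in that month.
first_of_month <- last_date_month(as_date("2023-03-01"))

# A Date that is already the last day of the month is returned unchanged.
year_end <- last_date_month(as_date("2023-12-31"))

print(c(feb, first_of_month, year_end))
```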