Add activity after death #953

Merged
merged 81 commits into from
May 27, 2024
Changes from 72 commits
7d05ce9
Remove redundant code
Jennit07 Jan 16, 2024
d1718f0
Update documentation
Jennit07 Jan 16, 2024
6aec7b1
Style code
Jennit07 Jan 16, 2024
65e8caa
Reorder when we match on client variables
Jennit07 Jan 22, 2024
35bcddc
Update documentation
Jennit07 Jan 22, 2024
800083a
Style code
Jennit07 Jan 22, 2024
996db4c
Revert "Update logic to use end of Quarter"
Jennit07 Jan 22, 2024
d10376d
Style code
Jennit07 Jan 22, 2024
b8e1dd2
Update documentation
Jennit07 Jan 22, 2024
3591aca
add check comment (TO DO for this PR)
Jennit07 Jan 22, 2024
47769e3
Remove `check_quarter_format` function
Jennit07 Jan 22, 2024
85c22ad
Remove `check_quarter_format`
Jennit07 Jan 22, 2024
e4d9128
Add chi parameter to `create_demog_test_flags`
Jennit07 Jan 22, 2024
daa9ee7
Style code
Jennit07 Jan 22, 2024
702225f
Use CHI parameter for ep/indiv tests
Jennit07 Jan 22, 2024
d0fb3cd
Use CHI parameter for extract tests (chi)
Jennit07 Jan 22, 2024
bbf28dd
Change test sheet names to lowercase
Jennit07 Jan 23, 2024
b3d826b
Change date to lowercase
Jennit07 Jan 23, 2024
4ca03b7
Update documentation
Jennit07 Jan 23, 2024
c589786
Merge branch 'master' into mar-23-update
Jennit07 Jan 24, 2024
1e87385
Merge branch 'mar-23-update' into remove-redundant-code
Jennit07 Jan 24, 2024
0e69e50
Update documentation
Jennit07 Jan 24, 2024
0586dd7
Merge branch 'mar-23-update' into bug-client-vars
SwiftySalmon Jan 24, 2024
cbf5ae4
Update documentation
SwiftySalmon Jan 24, 2024
cde4b3d
Bug - match on client variables before NSU (#896)
SwiftySalmon Jan 24, 2024
63f6333
Merge branch 'mar-23-update' into check_snake_case
Jennit07 Jan 24, 2024
e6c83f0
Merge branch 'mar-23-update' into review_demog_test_flags
SwiftySalmon Jan 24, 2024
3233bfb
Review `create_demog_test_flags` function (#898)
SwiftySalmon Jan 24, 2024
3055d54
Style code
Jennit07 Jan 29, 2024
acb2639
Merge branch 'mar-23-update' into check_snake_case
Jennit07 Jan 29, 2024
287e782
Amend test workbooks to be in the format snake case (#903)
SwiftySalmon Jan 29, 2024
30cb567
Fix pick variables
Jennit07 Feb 5, 2024
00ae978
Merge branch 'mar-23-update' into remove-redundant-code
Jennit07 Feb 5, 2024
3b38160
Remove redundant code (#894)
SwiftySalmon Feb 6, 2024
d0712d1
Merge branch 'mar-23-update' into bug-service_use_cohort
SwiftySalmon Feb 6, 2024
744bbc0
SC Demographics and SDS (#900)
SwiftySalmon Feb 7, 2024
cd8b359
Sc all at speedup (#904)
lizihao-anu Feb 7, 2024
b1a9523
Add case_when statement for `high_cc` cohort
Jennit07 Feb 7, 2024
6829c1a
Bug - `high_cc` in demographic cohort showing `NAs` instead of `TRUE/…
Jennit07 Feb 12, 2024
c7a1400
added a casewhen to update property type description for homelessness
Feb 13, 2024
ea19220
Update documentation
SwiftySalmon Feb 13, 2024
a634ea7
Style code
SwiftySalmon Feb 13, 2024
c1f2a36
Merge branch 'mar-23-update' into hl1_property_type_description
SwiftySalmon Feb 13, 2024
842e616
Added property type description for homelessness (#912)
SwiftySalmon Feb 13, 2024
760e2e5
Merge branch 'mar-23-update' into bug-service_use_cohort
Jennit07 Feb 14, 2024
1defb2b
Bug - NSUs being assigned psychiatry in the service use cohort (#908)
SwiftySalmon Feb 14, 2024
14cde16
Bug - deal with missing variables (#914)
Jennit07 Feb 16, 2024
625402b
Bug - Fix get pop path failing and preventing the indiv file from run…
Jennit07 Feb 16, 2024
36c5e74
correct file hscp file path
Jennit07 Feb 16, 2024
63e2caf
Merge branch 'mar-23-update' into review_check_qtr_format
Jennit07 Feb 16, 2024
c684b81
Review `check_qtr_format` function (#897)
rchlv Feb 16, 2024
ad629b2
Update process_sc_all_home_care.R
lizihao-anu Feb 26, 2024
640548b
Update process_sc_all_alarms_telecare.R
lizihao-anu Feb 26, 2024
e0da70c
remove duplicate columns
lizihao-anu Feb 26, 2024
9699394
Fix targets (#892)
lizihao-anu Feb 27, 2024
aa83adb
Merge branch 'mar-23-update' into lizihao-anu-patch-1
lizihao-anu Feb 27, 2024
f5c7448
remove cases that start date is later than end date
lizihao-anu Feb 27, 2024
612a409
Update process_sc_all_home_care.R (#916)
SwiftySalmon Feb 27, 2024
b7e7138
Update Refs for March24 SLF update
Jennit07 Feb 27, 2024
c9e103e
Merge branch 'mar-23-update' into add_activity_after_death
Jennit07 Feb 27, 2024
3c2a360
Merge branch 'master' into add_activity_after_death
Jennit07 Mar 26, 2024
ad79baa
Merge branch 'June-24-update' into add_activity_after_death
Jennit07 May 17, 2024
6c2b6d5
Added function for get_all_slf_deaths_lookup_path
rchlv May 17, 2024
d47c0a1
Update documentation
rchlv May 17, 2024
ec0c5ef
Style code
rchlv May 17, 2024
8a704e0
Add vars for activity after death flag
rchlv May 20, 2024
746d65d
Add activity after death flag
rchlv May 20, 2024
861bb2b
Join data back to episode file
rchlv May 20, 2024
f8d1d9a
Solve git conflict
rchlv May 20, 2024
8ea7941
Style code
rchlv May 20, 2024
b8589ad
Update documentation
rchlv May 20, 2024
3b22b92
Merge branch 'June-24-update' into add_activity_after_death
SwiftySalmon May 22, 2024
2b43e3f
Merge branch 'June-24-update' into add_activity_after_death
Jennit07 May 24, 2024
e3a646f
changes to activity after death flag
May 27, 2024
5e5c71a
Update documentation
SwiftySalmon May 27, 2024
5ed7cf3
Update R/add_activity_after_death_flag.R
SwiftySalmon May 27, 2024
8faac6e
Update R/add_activity_after_death_flag.R
SwiftySalmon May 27, 2024
1406ff0
added .data$ to variables
May 27, 2024
c535165
Update documentation
SwiftySalmon May 27, 2024
edb4782
Style code
SwiftySalmon May 27, 2024
9aea212
Merge branch 'June-24-update' into add_activity_after_death
SwiftySalmon May 27, 2024
2 changes: 2 additions & 0 deletions NAMESPACE
@@ -25,6 +25,7 @@ export(end_fy_quarter)
export(end_next_fy_quarter)
export(find_latest_file)
export(fy_interval)
export(get_all_slf_deaths_lookup_path)
export(get_boxi_extract_path)
export(get_ch_costs_path)
export(get_dd_path)
@@ -91,6 +92,7 @@ export(process_costs_ch_rmd)
export(process_costs_dn_rmd)
export(process_costs_gp_ooh_rmd)
export(process_costs_hc_rmd)
export(process_deaths_lookup)
export(process_extract_acute)
export(process_extract_ae)
export(process_extract_alarms_telecare)
177 changes: 177 additions & 0 deletions R/add_activity_after_death_flag.R
@@ -0,0 +1,177 @@
#' Match on BOXI NRS death dates to process activity after death flag
#'
#' @description Match on CHI number where available in the episode file, and add date of death from the BOXI NRS lookup.
#' Create new activity after death flag
#'
#' @param data episode files
#' @param year financial year, e.g. '1920'
#' @param deaths_data The death data for the year
#'
#' @return data flagged if activity after death
add_activity_after_death_flag <- function(
data,
year,
deaths_data = read_file(get_all_slf_deaths_lookup_path())) {
# Match on the BOXI NRS deaths lookup for records with a CHI number
data <- data %>%
dplyr::filter(!is.na(chi) & chi != "") %>%
dplyr::left_join(
deaths_data,
by = "chi",
suffix = c("", "_boxi")
)


# Warn about records which already have a death_date in the episode file that doesn't match the BOXI death date
check_death_date_match <- data %>%
dplyr::filter(death_date != death_date_boxi)

if (nrow(check_death_date_match) != 0) {
warning("There were records in the episode file which already have a death_date that does not match the BOXI NRS death date.")
}


# Warn about records which have a record_keydate1 after their BOXI death date
check_keydate1_death_date <- data %>%
dplyr::filter(record_keydate1 > death_date_boxi)

if (nrow(check_keydate1_death_date) != 0) {
warning("There were records in the episode file which have a record_keydate1 after the BOXI NRS death date.")
}


flag_data <- data %>%
dplyr::mutate(
flag_keydate1 = dplyr::if_else(record_keydate1 > death_date_boxi, 1, 0),
flag_keydate2 = dplyr::if_else(record_keydate2 > death_date_boxi, 1, 0),

# Next, flag records with 'ongoing' activity after the date of death (available from BOXI) if keydate2 is missing and the death date occurs
# in the current or a previous financial year.
flag_keydate2_missing = dplyr::if_else(((is.na(record_keydate2) | record_keydate2 == "") & (death_date_boxi <= paste0("20", substr(year, 3, 4), "-03-31"))), 1, 0),

# Also flag records without a death_date in the episode file where the BOXI death date occurs in the current or a previous financial year.
flag_deathdate_missing = dplyr::if_else(((is.na(death_date) | death_date == "") & (death_date_boxi <= paste0("20", substr(year, 3, 4), "-03-31"))), 1, 0)

Check failure on line 53 in R/add_activity_after_death_flag.R (GitHub Actions / Check Spelling): `deathdate` is not a recognized word. (unrecognized-spelling)
) %>%
# These should be flagged by one of the two lines of code above, but in these cases, we will also fill in the blank death date if appropriate

# Search all variables beginning with "flag_" for value "1" and create new variable to flag cases where 1 is present
# Multiplying by 1 changes flag from true/false to 1/0
dplyr::mutate(activity_after_death = purrr::pmap_dbl(
dplyr::select(., dplyr::contains("flag_")),
~ any(grepl("^1$", c(...)),
na.rm = TRUE
) * 1
))


# Check and print error message for records which already are TRUE for the deceased variable in the episode file, but this doesn't match the
# BOXI deceased variable
check_deceased_match <- flag_data %>%
dplyr::filter(deceased != deceased_boxi)

if (nrow(check_deceased_match) != 0) {
warning("There were records in the episode file which have a deceased variable which does not match the BOXI NRS deceased variable")
}


# Fill in date of death if missing in the episode file but available in BOXI lookup, due to historic dates of death not being carried
# over from previous financial years
flag_data <- flag_data %>%
dplyr::mutate(death_date = dplyr::if_else(((is.na(death_date) | death_date == "") & (death_date_boxi <= paste0("20", substr(year, 1, 2), "-03-31"))), death_date_boxi, death_date)) %>%
dplyr::mutate(deceased = dplyr::if_else(((is.na(deceased) | deceased == "") & (deceased_boxi == TRUE)), deceased_boxi, deceased)) %>%
# Remove temporary flag variables used to create activity after death flag and fill in missing death_date
dplyr::select(-c(death_date_boxi, deceased_boxi, flag_keydate1, flag_keydate2, flag_keydate2_missing, flag_deathdate_missing))

Check failure on line 83 in R/add_activity_after_death_flag.R (GitHub Actions / Check Spelling): `deathdate` is not a recognized word. (unrecognized-spelling)

# Match activity after death flag back to episode file
final_data <- data %>%
dplyr::left_join(
flag_data,
by = "chi",
na_matches = "never",
relationship = "many-to-one"
)


return(final_data)
}
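
A note on filtering out missing CHI values: when both `NA` and blank strings should be excluded, the choice between `|` and `&` changes which rows survive. The sketch below uses made-up CHI strings (hypothetical, not real data) and base R only to show how each condition evaluates. `dplyr::filter()` keeps only rows where the condition is `TRUE`, so an `NA` result drops the row.

```r
# Toy values (hypothetical, for illustration): a populated CHI, an NA, and a blank string
chi <- c("0101011234", NA, "")

# With `|`, the blank string passes because !is.na("") is TRUE
keep_or <- !is.na(chi) | chi != ""

# With `&`, a value must be both non-NA and non-blank to pass
keep_and <- !is.na(chi) & chi != ""

print(keep_or)
print(keep_and)
```

With `&`, only the populated value passes; with `|`, the blank string is retained as well.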


#' Create and read SLF Deaths lookup from processed BOXI NRS deaths extracts
#'
#' @description The BOXI NRS deaths extract lookup should be created after the extract files for all years have been processed,
#' but before an episode file has been produced. Therefore, all BOXI NRS years should be run before running episode files.
#'
#' @param file_path Path to the BOXI NRS file for each financial year - may not use this
#' @param year The year to process, in FY format - may not use this
#'
#' @param write_to_disk (optional) Should the data be written to disk default is
#' `TRUE` i.e. write the data to disk.
#'
#' @return the final data as a [tibble][tibble::tibble-package].
#' @export
#'
#'
#'
# Read data------------------------------------------------
process_deaths_lookup <- function(update = latest_update(), write_to_disk = TRUE, ...) {
all_boxi_deaths <- read_file(get_slf_deaths_lookup_path("1415")) %>%
rbind(read_file(get_slf_deaths_lookup_path("1516"))) %>%
rbind(read_file(get_slf_deaths_lookup_path("1617"))) %>%
rbind(read_file(get_slf_deaths_lookup_path("1718"))) %>%
rbind(read_file(get_slf_deaths_lookup_path("1819"))) %>%
rbind(read_file(get_slf_deaths_lookup_path("1920"))) %>%
rbind(read_file(get_slf_deaths_lookup_path("2021"))) %>%
rbind(read_file(get_slf_deaths_lookup_path("2122"))) %>%
rbind(read_file(get_slf_deaths_lookup_path("2223"))) %>%
rbind(read_file(get_slf_deaths_lookup_path("2324"))) %>%
# Can this be automated to pick up files starting with name "get_slf_deaths_lookup_path"?

# Remove rows with missing or blank CHI number - could also use na.omit?
# na.omit(all_boxi_deaths)
dplyr::filter(!is.na(chi) & chi != "")

# Check all CHI numbers are valid
chi_check <- all_boxi_deaths %>%
dplyr::pull(.data$chi) %>%
phsmethods::chi_check()

if (!all(chi_check %in% c("Valid CHI", "Missing (Blank)", "Missing (NA)"))) {
# There are some Missing (NA) values in the extracts, but I have excluded them above as they cannot be matched to episode file
stop("There were bad CHI numbers in the BOXI NRS file")
}

# Check and print error message for chi numbers with more than one death date
duplicates <- all_boxi_deaths %>%
janitor::get_dupes(.data$chi)

if (nrow(duplicates) != 0) {
# Duplicates are kept in the lookup (see the note below), so this only raises a warning
warning("There were duplicate death dates in the BOXI NRS file.")
}


# We decided to include duplicates as unable to determine which is correct date (unless IT can tell us, however, they don't seem to know
# the process well enough), and overall impact will be negligible
# Get anon_chi and use this to match onto episode file later
all_boxi_deaths <- all_boxi_deaths %>%
slfhelper::get_anon_chi()

# Save out duplicates for further investigation if needed (as anon_chi)
if (nrow(duplicates) != 0) {
write_file(
duplicates,
fs::path(
get_slf_dir(), "Deaths",
stringr::str_glue("slf_deaths_duplicates_{update}.parquet")
)
)
}

# Maybe save as its own function
# Write the all BOXI NRS deaths lookup file to disk, so this can be used to populate activity after death flag in each episode file
if (write_to_disk) {
all_boxi_deaths %>%
write_file(get_all_slf_deaths_lookup_path())
}

return(all_boxi_deaths)
}
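
On the question raised in the comments of `process_deaths_lookup` about automating the year-by-year `rbind()` chain: one possible approach, sketched here under assumptions, is to loop over a vector of financial years and bind the results. `read_year_lookup()` is a hypothetical stand-in for `read_file(get_slf_deaths_lookup_path(year))` so that the example runs on its own with base R; the CHI values and dates are made up.

```r
# Hypothetical stand-in for read_file(get_slf_deaths_lookup_path(year));
# returns a tiny data frame so the sketch is runnable without the real files
read_year_lookup <- function(year) {
  data.frame(
    chi = paste0("chi_", year),
    death_date = as.Date(paste0("20", substr(year, 3, 4), "-01-01")),
    stringsAsFactors = FALSE
  )
}

# Financial years to combine, in the "1415" style used in the diff
years <- c("1415", "1516", "1617")

# Read each year and stack the results into one data frame
all_deaths <- do.call(rbind, lapply(years, read_year_lookup))

print(all_deaths)
```

Adding a new financial year then only requires extending `years`. Within the package, `purrr::map()` followed by `dplyr::bind_rows()` would be an equivalent tidyverse formulation.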
3 changes: 3 additions & 0 deletions R/create_episode_file.R
@@ -79,6 +79,8 @@
"hscp",
"datazone2011",
"attendance_status",
"death_date",
"deceased",
"deathdiag1",
"deathdiag2",
"deathdiag3",
@@ -139,6 +141,7 @@
year,
slf_deaths_lookup
) %>%
add_activity_after_death_flag(year, deaths_data = read_file(get_all_slf_deaths_lookup_path())) %>%
load_ep_file_vars(year)

if (!check_year_valid(year, type = c("ch", "hc", "at", "sds"))) {
@@ -260,7 +263,7 @@
fill_missing_cij_markers <- function(data) {
fixable_data <- data %>%
dplyr::filter(
.data[["recid"]] %in% c("01B", "04B", "GLS", "02B", "DD") & !is.na(.data[["chi"]])

Check failure on line 266 in R/create_episode_file.R (GitHub Actions / Check Spelling): `GLS` is not a recognized word. (unrecognized-spelling)
)

non_fixable_data <- data %>%
@@ -361,7 +364,7 @@
dplyr::mutate(
cost_total_net_inc_dnas = .data$cost_total_net,
# In the Cost_Total_Net column set the cost for
# those with attendance status 5 or 8 (CNWs and DNAs)

Check failure on line 367 in R/create_episode_file.R (GitHub Actions / Check Spelling): `CNWs` is not a recognized word. (unrecognized-spelling)
cost_total_net = dplyr::if_else(
.data$attendance_status %in% c(5L, 8L),
0.0,
26 changes: 26 additions & 0 deletions R/get_slf_lookup_paths.R
@@ -60,6 +60,7 @@
#' @family slf lookup file path
#' @seealso [get_file_path()] for the generic function.
get_slf_deaths_lookup_path <- function(year, ...) {
# Review the naming convention of this path and file
slf_deaths_lookup_path <- get_file_path(
directory = fs::path(get_slf_dir(), "Deaths"),
file_name = stringr::str_glue("slf_deaths_lookup_{year}.parquet"),
@@ -69,6 +70,31 @@
return(slf_deaths_lookup_path)
}

#' SLF death dates File Path
#'
#' @description Get the full path to the BOXI NRS Deaths lookup file for all financial years
#'
#' @inheritParams get_boxi_extract_path
#' @param ... additional arguments passed to [get_file_path()]
#' @param year financial year e.g. "1920"
#'
#' @export
#' @family slf lookup file path
#' @seealso [get_file_path()] for the generic function.

get_all_slf_deaths_lookup_path <- function(update = latest_update()) {
# Note this name is very similar to the existing slf_deaths_lookup_path which returnsthe path for

Check failure on line 86 in R/get_slf_lookup_paths.R (GitHub Actions / Check Spelling): `returnsthe` is not a recognized word. (unrecognized-spelling)
# the processed BOXI extract for each financial year. This function will return the combined financial
# years lookup i.e. all years put together.
all_slf_deaths_lookup_path <- get_file_path(
directory = fs::path(get_slf_dir(), "Deaths"),
file_name = stringr::str_glue("all_slf_deaths_lookup_{update}.parquet")
)

return(all_slf_deaths_lookup_path)
}

#' SLF CHI Deaths File Path
#'
#' @description Get the full path to the CHI deaths file
26 changes: 26 additions & 0 deletions man/add_activity_after_death_flag.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

27 changes: 27 additions & 0 deletions man/get_all_slf_deaths_lookup_path.Rd

1 change: 1 addition & 0 deletions man/get_slf_ch_name_lookup_path.Rd

1 change: 1 addition & 0 deletions man/get_slf_chi_deaths_path.Rd

1 change: 1 addition & 0 deletions man/get_slf_deaths_lookup_path.Rd

1 change: 1 addition & 0 deletions man/get_slf_gpprac_path.Rd

1 change: 1 addition & 0 deletions man/get_slf_postcode_path.Rd

22 changes: 22 additions & 0 deletions man/process_deaths_lookup.Rd