Update older years to bring the data in line with our newest processes. #988

Merged
merged 137 commits into from
Sep 17, 2024
Commits
5858576
fix sc_client_lookup sc_send_lca
lizihao-anu Jan 11, 2024
41d6c58
fix an issue of get_pop_path
lizihao-anu Jan 11, 2024
8599574
Style code
lizihao-anu Jan 11, 2024
8c76e2c
fix the rest of get_pop_path from get_datazone_pop_path
lizihao-anu Jan 11, 2024
177b34c
Merge branch 'fix_targets' of github.com:Public-Health-Scotland/sourc…
lizihao-anu Jan 11, 2024
0b7edbe
Update documentation
lizihao-anu Jan 12, 2024
d4b0660
fix sc_send_lca
lizihao-anu Jan 12, 2024
77e6c09
add missing year column
lizihao-anu Jan 12, 2024
7d05ce9
Remove redundant code
Jennit07 Jan 16, 2024
d1718f0
Update documentation
Jennit07 Jan 16, 2024
6aec7b1
Style code
Jennit07 Jan 16, 2024
fe5ceb1
explicitly specify the argument year to avoid corruption of targets
lizihao-anu Jan 18, 2024
ca6f25f
Update documentation
lizihao-anu Jan 18, 2024
65e8caa
Reorder when we match on client variables
Jennit07 Jan 22, 2024
35bcddc
Update documentation
Jennit07 Jan 22, 2024
800083a
Style code
Jennit07 Jan 22, 2024
e4d9128
Add chi parameter to `create_demog_test_flags`
Jennit07 Jan 22, 2024
daa9ee7
Style code
Jennit07 Jan 22, 2024
702225f
Use CHI parameter for ep/indiv tests
Jennit07 Jan 22, 2024
d0fb3cd
Use CHI parameter for extract tests (chi)
Jennit07 Jan 22, 2024
bbf28dd
Change test sheet names to lowercase
Jennit07 Jan 23, 2024
b3d826b
Change date to lowercase
Jennit07 Jan 23, 2024
4ca03b7
Update documentation
Jennit07 Jan 23, 2024
0c61266
new data pipeline with targets
lizihao-anu Jan 23, 2024
5242191
minor changes
lizihao-anu Jan 23, 2024
45651a2
Style code
lizihao-anu Jan 23, 2024
c589786
Merge branch 'master' into mar-23-update
Jennit07 Jan 24, 2024
1e87385
Merge branch 'mar-23-update' into remove-redundant-code
Jennit07 Jan 24, 2024
0e69e50
Update documentation
Jennit07 Jan 24, 2024
0586dd7
Merge branch 'mar-23-update' into bug-client-vars
SwiftySalmon Jan 24, 2024
cbf5ae4
Update documentation
SwiftySalmon Jan 24, 2024
cde4b3d
Bug - match on client variables before NSU (#896)
SwiftySalmon Jan 24, 2024
62dc226
Merge branch 'mar-23-update' into fix_targets
lizihao-anu Jan 24, 2024
63f6333
Merge branch 'mar-23-update' into check_snake_case
Jennit07 Jan 24, 2024
e6c83f0
Merge branch 'mar-23-update' into review_demog_test_flags
SwiftySalmon Jan 24, 2024
3233bfb
Review `create_demog_test_flags` function (#898)
SwiftySalmon Jan 24, 2024
a540fd4
Merge branch 'mar-23-update' into fix_targets
SwiftySalmon Jan 24, 2024
3055d54
Style code
Jennit07 Jan 29, 2024
acb2639
Merge branch 'mar-23-update' into check_snake_case
Jennit07 Jan 29, 2024
e0ad3fa
Merge branch 'mar-23-update' into old_years
Jennit07 Jan 29, 2024
287e782
Amend test workbooks to be in the format snake case (#903)
SwiftySalmon Jan 29, 2024
0198ac5
Merge branch 'fix_targets' into old_years
Jennit07 Jan 29, 2024
2acc38f
undo sc_send_lca bit
lizihao-anu Jan 30, 2024
db4c462
Merge branch 'fix_targets' into old_years
Jennit07 Jan 30, 2024
ce77f75
Add code for running years available
Jennit07 Jan 30, 2024
9f16a68
Update `_targets.R` script for running old years
Jennit07 Jan 30, 2024
81c78f5
Style code
Jennit07 Jan 30, 2024
334a2bb
Update `check_year_valid` for running old years
Jennit07 Jan 31, 2024
dc71603
Use `check_year_valid` where no data for old yrs
Jennit07 Jan 31, 2024
85af5f4
Style code
Jennit07 Jan 31, 2024
30cb567
Fix pick variables
Jennit07 Feb 5, 2024
00ae978
Merge branch 'mar-23-update' into remove-redundant-code
Jennit07 Feb 5, 2024
3b38160
Remove redundant code (#894)
SwiftySalmon Feb 6, 2024
d0712d1
Merge branch 'mar-23-update' into bug-service_use_cohort
SwiftySalmon Feb 6, 2024
744bbc0
SC Demographics and SDS (#900)
SwiftySalmon Feb 7, 2024
cd8b359
Sc all at speedup (#904)
lizihao-anu Feb 7, 2024
b1a9523
Add case_when statement for `high_cc` cohort
Jennit07 Feb 7, 2024
6829c1a
Bug - `high_cc` in demographic cohort showing `NAs` instead of `TRUE/…
Jennit07 Feb 12, 2024
c7a1400
added a casewhen to update property type description for homelessness
Feb 13, 2024
ea19220
Update documentation
SwiftySalmon Feb 13, 2024
a634ea7
Style code
SwiftySalmon Feb 13, 2024
c1f2a36
Merge branch 'mar-23-update' into hl1_property_type_description
SwiftySalmon Feb 13, 2024
842e616
Added property type description for homelessness (#912)
SwiftySalmon Feb 13, 2024
760e2e5
Merge branch 'mar-23-update' into bug-service_use_cohort
Jennit07 Feb 14, 2024
1defb2b
Bug - NSUs being assigned psychiatry in the service use cohort (#908)
SwiftySalmon Feb 14, 2024
14cde16
Bug - deal with missing variables (#914)
Jennit07 Feb 16, 2024
625402b
Bug - Fix get pop path failing and preventing the indiv file from run…
Jennit07 Feb 16, 2024
101e1ac
Merge branch 'mar-23-update' into old_years
Jennit07 Feb 16, 2024
36c5e74
correct file hscp file path
Jennit07 Feb 16, 2024
a63cb7a
Merge branch 'mar-23-update' into old_years
Jennit07 Feb 16, 2024
9d67429
Declare missing variables for older years
Jennit07 Feb 16, 2024
bab433a
Merge branch 'september-2024' into old_years
Jennit07 Jul 22, 2024
0844a2f
setup targets scripts for old years
Jennit07 Jul 22, 2024
753d10e
Style code
Jennit07 Jul 22, 2024
e541db1
Merge branch 'september-2024' into old_years
Jennit07 Jul 24, 2024
881fdf6
Include `check_year_valid` for sc client path
Jennit07 Jul 24, 2024
8b0da55
Add check year valid to join sc client
Jennit07 Jul 24, 2024
db24e84
Add if else statement
Jennit07 Jul 24, 2024
eaccd43
WIP - TO DO - fix dummy path for `get_chi()`
Jennit07 Jul 24, 2024
bcd55ad
Style code
Jennit07 Jul 24, 2024
0ec4d9d
update dummy data file to read empty tibble
Jennit07 Jul 25, 2024
6af1e41
Update `check_year_valid`
Jennit07 Jul 25, 2024
7322df2
Update declared `NA` variables
Jennit07 Jul 25, 2024
faf564d
Update documentation
Jennit07 Jul 25, 2024
41500ef
declare `count_not_known` as NA
Jennit07 Jul 31, 2024
ac62956
supply year as default in `aggregate_by_chi`
Jennit07 Jul 31, 2024
51dc1ce
Declare unused variables
Jennit07 Aug 2, 2024
976e74b
Style code
Jennit07 Aug 2, 2024
b0b12d3
Update sc client with sept update new code
Jennit07 Aug 2, 2024
0f568fb
Specify code for running older years
Jennit07 Aug 2, 2024
10686ae
Merge branch 'september-2024' into old_years
Jennit07 Aug 2, 2024
e713058
Style code
Jennit07 Aug 2, 2024
298cfe7
Add Running SLF files manually scripts
Jennit07 Aug 7, 2024
5588584
Style code
Jennit07 Aug 7, 2024
0746ee6
update write_tests_xlsx
lizihao-anu Aug 23, 2024
3dfbc8e
update process_refined_death
lizihao-anu Aug 23, 2024
1bfe269
fix tests by removing get_chi
lizihao-anu Aug 26, 2024
cf2a547
add 2425
lizihao-anu Aug 26, 2024
c5e7c7f
Style code
lizihao-anu Aug 26, 2024
a57f993
fix NA matches in refined_death
lizihao-anu Aug 26, 2024
ecd019d
move latest_cost_year() to cost_uplift()
lizihao-anu Aug 27, 2024
9d5bd12
improve automation
lizihao-anu Aug 27, 2024
2f5e0a0
Update documentation
lizihao-anu Aug 27, 2024
45ddf9a
fix `cij_ppa` in DD data
Jennit07 Aug 27, 2024
500b166
fix bugs of dd and populate cij_delay back to episodes
lizihao-anu Aug 27, 2024
24dea0b
Style code
lizihao-anu Aug 27, 2024
5d3ebe7
Merge branch 'september-2024' into sep2024-fix
Jennit07 Aug 27, 2024
2f2fd94
keep all variable for delayed discharge episodes
lizihao-anu Aug 27, 2024
3948654
remove dummy variable names from dd_date
lizihao-anu Aug 27, 2024
08c58f7
Style code
lizihao-anu Aug 27, 2024
f3a90a7
remove `deceased_boxi` variable - bug
Jennit07 Aug 28, 2024
baadb99
remove `create_person_id`. It's matched in client
Jennit07 Aug 28, 2024
bfed637
remove `create_person_id`
Jennit07 Aug 28, 2024
0e19c5f
Update `run_slf_manually` scripts
Jennit07 Aug 28, 2024
5948fb0
further remove person_id
lizihao-anu Aug 29, 2024
d54dafc
fix duplicate row introduced by adding death
lizihao-anu Aug 30, 2024
8d6f3e7
remove duplicated chi when joining death data
lizihao-anu Aug 30, 2024
bd01b28
TODO: check distinct death data by chi while keeping chi==NA records
lizihao-anu Aug 30, 2024
fc6404f
add parameter for year
Jennit07 Sep 2, 2024
0261482
fix duplicate in add_activity_after_death_flag
lizihao-anu Sep 2, 2024
746da2b
Update `check_year_valid`
Jennit07 Sep 3, 2024
c107825
Merge branch 'september-2024' into old_years
Jennit07 Sep 4, 2024
57c7521
Declare DN variables
Jennit07 Sep 9, 2024
a5fed7f
Style code
Jennit07 Sep 9, 2024
d6de128
Merge remote-tracking branch 'origin/sep2024-fix' into old_years
Jennit07 Sep 9, 2024
97110d9
Declare client variables
Jennit07 Sep 10, 2024
1bb5fa9
remove extra dd variables
Jennit07 Sep 10, 2024
71cc259
remove redundant variables
lizihao-anu Sep 12, 2024
f30508b
remove fy variable
Jennit07 Sep 12, 2024
c32546d
Remove redundant variable `count_not_known`
Jennit07 Sep 12, 2024
615ab24
Remove duplicate code
Jennit07 Sep 13, 2024
cb9b930
revert commit - remove fy
Jennit07 Sep 13, 2024
026135b
Merge remote-tracking branch 'origin/september-2024' into old_years
Jennit07 Sep 16, 2024
c4123d2
update manual run
Jennit07 Sep 16, 2024
2be8541
declare missing sc variables indiv file
Jennit07 Sep 16, 2024
1dd7183
Style code
Jennit07 Sep 16, 2024
95c9054
Merge branch 'september-2024' into old_years
lizihao-anu Sep 17, 2024
5 changes: 3 additions & 2 deletions R/check_year_valid.R
@@ -17,6 +17,7 @@ check_year_valid <- function(
"ch",
"client",
"cmh",
"cost_dna",
"dd",
"deaths",
"dn",
@@ -34,9 +35,9 @@ check_year_valid <- function(
)) {
if (year <= "1415" && type %in% c("dn", "sparra")) {
return(FALSE)
- } else if (year <= "1516" && type %in% c("cmh", "homelessness")) {
+ } else if (year <= "1516" && type %in% c("cmh", "homelessness", "dd")) {
return(FALSE)
- } else if (year <= "1617" && type %in% c("ch", "hc", "sds", "at")) {
+ } else if (year <= "1617" && type %in% c("ch", "hc", "sds", "at", "client", "cost_dna")) {
return(FALSE)
} else if (year <= "1718" && type %in% "hhg") {
return(FALSE)
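
The guards in this hunk compare financial-year codes as strings. A minimal sketch of the same idea, assuming four-character codes such as "1617"; `last_unavailable_year` and `year_has_data()` are hypothetical names for illustration, not the package's `check_year_valid()`. The cut-offs are copied from the lines above.

```r
# Hypothetical lookup (not part of the package): for each dataset type,
# the last financial year for which no data is available.
last_unavailable_year <- c(
  dn = "1415", sparra = "1415",
  cmh = "1516", homelessness = "1516", dd = "1516",
  ch = "1617", hc = "1617", sds = "1617", at = "1617",
  client = "1617", cost_dna = "1617",
  hhg = "1718"
)

year_has_data <- function(year, type) {
  cutoff <- last_unavailable_year[[type]]
  # Equal-width digit strings sort lexically in chronological order,
  # so a plain string comparison is enough for these codes.
  year > cutoff
}

year_has_data("1516", "dd")     # FALSE
year_has_data("1617", "dd")     # TRUE
year_has_data("1617", "client") # FALSE
year_has_data("1718", "client") # TRUE
```

Unlike the function in the diff, this sketch raises an error for a type that is not in the lookup, which is a deliberate simplification.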
65 changes: 64 additions & 1 deletion R/create_episode_file.R
@@ -175,7 +175,65 @@ create_episode_file <- function(
sc_social_worker = NA,
sc_type_of_housing = NA,
sc_meals = NA,
- sc_day_care = NA
+ sc_day_care = NA,
social_care_id = NA,
sc_dementia = NA,
sc_learning_disability = NA,
sc_mental_health_disorders = NA,
sc_physical_and_sensory_disability = NA,
sc_drugs = NA,
sc_alcohol = NA,
sc_palliative_care = NA,
sc_carer = NA,
sc_elderly_frail = NA,
sc_neurological_condition = NA,
sc_autism = NA,
sc_other_vulnerable_groups = NA,
ch_provider_description = NA
)
}

if (!check_year_valid(year, type = "homelessness")) {
episode_file <- episode_file %>%
dplyr::mutate(
hl1_12_months_post_app = NA,
hl1_12_months_pre_app = NA,
hl1_6after_ep = NA,
hl1_6before_ep = NA,
hl1_application_ref = NA,
hl1_completeness = NA,
hl1_during_ep = NA,
hl1_in_fy = NA,
hl1_property_type = NA,
hl1_reason_ftm = NA,
hl1_sending_lca = NA
)
}

if (!check_year_valid(year, type = "dd")) {
episode_file <- episode_file %>%
dplyr::mutate(
cij_delay = NA,
dd_quality = NA,
dd_responsible_lca = NA,
delay_end_reason = NA,
primary_delay_reason = NA,
secondary_delay_reason = NA,
)
}

if (!check_year_valid(year, type = "dn")) {
episode_file <- episode_file %>%
dplyr::mutate(
ccm = NA,
total_no_dn_contacts = NA
)
}

if (!check_year_valid(year, type = "cost_dna")) {
episode_file <- episode_file %>%
dplyr::mutate(
cost_total_net_inc_dnas = NA
)
}
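
The blocks above share one pattern: when a dataset does not exist for the year being run, its variables are declared as `NA` so that older files carry the same columns as newer ones. A minimal sketch of that pattern in base R, using a hypothetical helper `declare_missing()` (not a function in the package) and a toy data frame:

```r
# Hypothetical helper: add each named column as NA unless it already exists.
declare_missing <- function(data, vars) {
  for (v in setdiff(vars, names(data))) {
    # rep() keeps this safe for zero-row data frames as well.
    data[[v]] <- rep(NA, nrow(data))
  }
  data
}

# Toy episode file; the column values are made up for illustration.
episode_file <- data.frame(
  chi = c("0101", "0202"),
  recid = c("01B", "00B"),
  stringsAsFactors = FALSE
)

episode_file <- declare_missing(episode_file, c("ccm", "total_no_dn_contacts"))
names(episode_file)
```

The sketch skips columns that already exist, whereas `dplyr::mutate(x = NA)` as used in the diff would overwrite them. Columns created this way are logical; whether a typed `NA` is needed depends on how the yearly files are later combined, which the diff does not show.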

@@ -471,6 +529,11 @@ join_sc_client <- function(data,
file_type = c("episode", "individual")) {
cli::cli_alert_info("Join social care client function started at {Sys.time()}")

if (!check_year_valid(year, type = "client")) {
data_file <- data
return(data_file)
}

if (file_type == "episode") {
# Match on client variables by chi
data_file <- data %>%
28 changes: 27 additions & 1 deletion R/create_individual_file.R
@@ -115,6 +115,9 @@ create_individual_file <- function(
hc_personal_hours = NA,
hc_non_personal_hours = NA,
hc_reablement_hours = NA,
hc_non_personal_hours_cost = NA,
hc_personal_hours_cost = NA,
hc_reablement_hours_cost = NA,
at_alarms = NA,
at_telecare = NA,
sds_option_1 = NA,
@@ -125,10 +128,33 @@ create_individual_file <- function(
sc_support_from_unpaid_carer = NA,
sc_social_worker = NA,
sc_meals = NA,
- sc_day_care = NA
+ sc_day_care = NA,
sc_type_of_housing = NA,
count_not_known = NA,
sc_latest_submission = NA,
social_care_id = NA,
person_id = NA,
sc_alcohol = NA,
sc_autism = NA,
sc_carer = NA,
sc_dementia = NA,
sc_drugs = NA,
sc_elderly_frail = NA,
sc_learning_disability = NA,
sc_mental_health_disorders = NA,
sc_neurological_condition = NA,
sc_other_vulnerable_groups = NA,
sc_palliative_care = NA,
sc_physical_and_sensory_disability = NA
)
}

if (!check_year_valid(year, type = "homelessness")) {
individual_file <- individual_file %>%
dplyr::mutate(hl1_in_fy = NA)
}


if (anon_chi_out) {
individual_file <- individual_file %>%
tidyr::replace_na(list(chi = "")) %>%
4 changes: 3 additions & 1 deletion R/get_boxi_extract_path.R
@@ -86,9 +86,11 @@ get_boxi_extract_path <- function(
#'
#' @return an [fs::path()] to a dummy file which can be used with targets.
get_dummy_boxi_extract_path <- function() {
- get_file_path(
+ dummy_path <- get_file_path(
directory = get_dev_dir(),
file_name = ".dummy",
create = TRUE
)

return(dummy_path)
}
21 changes: 14 additions & 7 deletions R/get_sc_lookup_paths.R
@@ -38,11 +38,18 @@ get_sc_demog_lookup_path <- function(update = latest_update(), ...) {
#' @family social care lookup file paths
#' @seealso [get_file_path()] for the generic function.
get_sc_client_lookup_path <- function(year, update = latest_update(), ...) {
- sc_client_lookup_path <- get_file_path(
- directory = fs::path(get_slf_dir(), "Social_care", "processed_sc_client_lookup"),
- file_name = stringr::str_glue("anon-sc_client_lookup_{year}_{update}.parquet"),
- ...
- )
-
- return(sc_client_lookup_path)
+ if (!check_year_valid(year, type = "client")) {
+ return(get_dummy_boxi_extract_path())
+ } else {
+ sc_client_lookup_path <- get_file_path(
+ directory = fs::path(
+ get_slf_dir(),
+ "Social_care",
+ "processed_sc_client_lookup"
+ ),
+ file_name = stringr::str_glue("anon-sc_client_lookup_{year}_{update}.parquet"),
+ ...
+ )
+ return(sc_client_lookup_path)
+ }
}
8 changes: 6 additions & 2 deletions R/link_delayed_discharge_eps.R
@@ -14,8 +14,12 @@ link_delayed_discharge_eps <- function(
dd_data = read_file(get_source_extract_path(year, "dd")) %>% slfhelper::get_chi()) {
cli::cli_alert_info("Link delayed discharge to episode file function started at {Sys.time()}")

- names_ep <- names(episode_file)
+ if (!check_year_valid(year, type = "dd")) {
+ episode_file <- episode_file
+ return(episode_file)
+ }

+ names_ep <- names(episode_file)
episode_file <- episode_file %>%
dplyr::mutate(
# remember to revoke the cij_end_date with dummy_cij_end
@@ -370,7 +374,7 @@ link_delayed_discharge_eps <- function(
delay_dd,
cij_delay
)) %>%
- dplyr::select(-c("has_dd", "delay_dd"))
+ dplyr::select(-c("has_dd", "delay_dd", "original_admission_date", "amended_dates"))

return(linked_data)
}
14 changes: 14 additions & 0 deletions R/process_lookup_homelessness.R
@@ -15,6 +15,10 @@ create_homelessness_lookup <- function(
homelessness_data = read_file(get_source_extract_path(year, "homelessness")) %>% slfhelper::get_chi()) {
cli::cli_alert_info("Create homelessness lookup function started at {Sys.time()}")

# Specify years available for running
if (year < "1617") {
return(NULL)
}
homelessness_lookup <- homelessness_data %>%
dplyr::distinct(.data$chi, .data$record_keydate1, .data$record_keydate2) %>%
tidyr::drop_na(.data$chi) %>%
@@ -39,6 +43,11 @@ add_homelessness_flag <- function(data, year,
lookup = create_homelessness_lookup(year)) {
cli::cli_alert_info("Add homelessness flag function started at {Sys.time()}")

if (!check_year_valid(year, type = "homelessness")) {
data <- data
return(data)
}

data <- data %>%
dplyr::left_join(
lookup %>%
@@ -65,6 +74,11 @@ add_homelessness_flag <- function(data, year,
add_homelessness_date_flags <- function(data, year, lookup = create_homelessness_lookup(year)) {
cli::cli_alert_info("Add homelessness date flags function started at {Sys.time()}")

if (!check_year_valid(year, type = "homelessness")) {
data <- data
return(data)
}

lookup <- lookup %>%
dplyr::filter(!(is.na(.data$record_keydate2))) %>%
dplyr::rename(
5 changes: 5 additions & 0 deletions R/process_lookup_sc_client.R
@@ -20,6 +20,11 @@ process_lookup_sc_client <-
slfhelper::get_chi() %>%
dplyr::select(c("sending_location", "social_care_id", "chi", "latest_flag")),
write_to_disk = TRUE) {
# Specify years available for running
if (year < "1718") {
return(NULL)
}

# Match to demographics lookup to get CHI
sc_client_demographics <- data %>%
dplyr::right_join(
2 changes: 1 addition & 1 deletion R/read_file.R
@@ -24,7 +24,7 @@ read_file <- function(path, col_select = NULL, as_data_frame = TRUE, ...) {

# Return an empty tibble if trying to read the dummy path
if (path == get_dummy_boxi_extract_path()) {
- return(tibble::tibble())
+ return(tibble::tibble(anon_chi = NA_character_))
}

ext <- fs::path_ext(path)
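
Taken together with the change to `get_sc_client_lookup_path()`, an unavailable year now resolves to the dummy path, and reading that path yields a one-row placeholder with an `anon_chi` column instead of a zero-column tibble. Default arguments elsewhere in this diff pipe `read_file()` straight into `slfhelper::get_chi()`, so the column is presumably there to let that step succeed; this is an inference from the commit messages, not something the diff states. A rough sketch in base R, where `dummy_path`, `read_file_sketch()` and `needs_anon_chi()` are all hypothetical stand-ins:

```r
# Stand-in for get_dummy_boxi_extract_path(); the real location is not shown here.
dummy_path <- file.path(tempdir(), ".dummy")

read_file_sketch <- function(path) {
  if (identical(path, dummy_path)) {
    # Placeholder: one row, so downstream code finds the column it expects.
    return(data.frame(anon_chi = NA_character_, stringsAsFactors = FALSE))
  }
  stop("Reading real files is out of scope for this sketch")
}

# Stand-in for a step that requires an `anon_chi` column (for example a CHI conversion).
needs_anon_chi <- function(data) {
  stopifnot("anon_chi" %in% names(data))
  names(data)[names(data) == "anon_chi"] <- "chi"
  data
}

placeholder <- needs_anon_chi(read_file_sketch(dummy_path))
str(placeholder)
```

The placeholder row carries a missing identifier, so any later step has to tolerate or drop it.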
25 changes: 8 additions & 17 deletions R/replace_sc_id_with_latest.R
@@ -7,40 +7,31 @@ replace_sc_id_with_latest <- function(data) {
# Check for required variables
check_variables_exist(
data,
c("sending_location", "social_care_id", "chi", "period")
c("sending_location", "social_care_id", "chi", "latest_flag")
)

# select variables we need
filter_data <- data %>%
dplyr::select(
"sending_location", "social_care_id", "chi", "period"
"sending_location", "social_care_id", "chi", "latest_flag"
) %>%
dplyr::filter(!(is.na(.data$chi)))
dplyr::filter(!(is.na(.data$chi))) %>%
dplyr::distinct()

change_sc_id <- filter_data %>%
# Sort (by sending_location, chi and period) for unique chi/sending location
dplyr::arrange(
.data$sending_location,
.data$chi,
dplyr::desc(.data$period)
) %>%
# Find the latest sc_id for each chi/sending location by keeping latest period
dplyr::distinct(
.data$sending_location,
.data$chi,
.keep_all = TRUE
) %>%
dplyr::filter(latest_flag == 1) %>%
# Rename for latest sc id
dplyr::rename(latest_sc_id = "social_care_id") %>%
# drop period for matching
dplyr::select(-"period")
# drop latest_flag for matching
dplyr::select(-"latest_flag")

return_data <- change_sc_id %>%
# Match back onto data
dplyr::right_join(data,
by = c("sending_location", "chi"),
multiple = "all"
) %>%
dplyr::filter(!(is.na(period))) %>%
# Overwrite sc id with the latest
dplyr::mutate(
social_care_id = dplyr::if_else(
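
The hunk is cut off at this point, but the visible part replaces the sort-by-`period` logic with a filter on `latest_flag == 1`. A minimal sketch of that approach in base R on made-up data; it assumes at most one flagged row per `sending_location` and `chi`, and the final overwrite rule is a guess, since the original `if_else()` condition is not visible:

```r
# Toy records; values are invented for illustration only.
records <- data.frame(
  sending_location = c("100", "100", "100", "200"),
  chi = c("A", "A", "B", "C"),
  social_care_id = c("old1", "new1", "b1", "c1"),
  latest_flag = c(0, 1, 1, 1),
  stringsAsFactors = FALSE
)

# Keep the flagged row for each chi / sending location and rename its id.
latest <- records[
  records$latest_flag == 1,
  c("sending_location", "chi", "social_care_id")
]
names(latest)[names(latest) == "social_care_id"] <- "latest_sc_id"

# Match back onto every record (merge() may reorder rows).
matched <- merge(records, latest, by = c("sending_location", "chi"), all.x = TRUE)

# Assumed rule: use the latest id whenever one was found.
matched$social_care_id <- ifelse(
  is.na(matched$latest_sc_id),
  matched$social_care_id,
  matched$latest_sc_id
)

matched[, c("sending_location", "chi", "social_care_id")]
```

Both rows for `chi` "A" end up with `social_care_id` "new1", which is the intended effect of the replacement.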
8 changes: 8 additions & 0 deletions R/write_tests_xlsx.R
@@ -121,6 +121,14 @@ write_tests_xlsx <- function(comparison_data,

date_today <- stringr::str_to_lower(date_today)

sheet_name_dated <- ifelse(
is.null(year),
stringr::str_glue("{sheet_name}_{date_today}"),
stringr::str_glue("{year}_{sheet_name}_{date_today}")
)

date_today <- stringr::str_to_lower(date_today)

if (is.null(year)) {
sheet_name_dated <- stringr::str_glue("{sheet_name}_{date_today}")
} else {
Expand Down