Create individual file #715
Merged
266 commits
0c426c5
Until L594
jr-mandy 7f17ef2
Converted until L677
jr-mandy 52a4ffd
Until L731
jr-mandy 3223189
Update documentation
jr-mandy 1c67f20
Remove test ref
jr-mandy aaad1ff
Merge
jr-mandy b14e4b2
Style code
jr-mandy f31f19b
WIP writing functions to fill postcode in line with previous DOB func…
shintoLamp 891c9a9
Update documentation
shintoLamp 509935f
Merge branch 'main-R' into d01_2
Moohan 134a893
Merge branch 'main-R' into d01_1
Moohan 9664f4b
Merge branch 'main-R' into d01_2
Moohan 9a39422
Merge branch 'main-R' into d01_1
Moohan a26dcc9
Merge branch 'main-R' into d01_1
Moohan 40adc51
Merge branch 'main-R' into d01_2
Moohan a4b3393
Merge branch 'main-R' into d01_2
Moohan c197c51
Merge branch 'main-R' into d01_1
Moohan 124f64e
Merge branch 'main-R' into d01_2
Moohan fb27bf8
Merge branch 'main-R' into d01_1
Moohan 85b1bea
Merge branch 'main-R' into d01_2
Jennit07 fb57c75
Merge branch 'main-R' into d01_1
Jennit07 73f77d5
implement quick fix for running 22/23
Jennit07 00f37c8
Style code
Jennit07 4017984
Merge branch 'main-R' into d01_1
Moohan ce0d700
Merge branch 'main-R' into d01_2
Moohan 2cacee8
Fix missed comma
Moohan 369a8bb
Exclude DD code for now - TEMP fix
Jennit07 f715830
Correct/rename variables
Jennit07 d537aad
Style code
Jennit07 50641b3
Include NSU in `check_year_valid`
Jennit07 1b52ebb
Update `check_year_valid_tests`
Jennit07 e5cf2a0
Update documentation
Jennit07 33fe105
Update `add_nsu_cohort` to pick up years valid
Jennit07 3473c18
Style code
Jennit07 fff9bad
remove extra `!`
Jennit07 8a37356
Exclude `cij_delay`
Jennit07 b2f6941
Style code
Jennit07 2c7d312
Merge branch 'main-R' into d01_2
Jennit07 6c44897
Merge branch 'main-R' into no_data_nsu
Moohan 13a14e9
Merge remote-tracking branch 'origin/d01_1' into d01_2
Moohan 362d1b4
Merge branch 'main-R' into d01_2
Jennit07 07b03f3
improve `max_no_inf()`
Jennit07 617ac68
Use pmin/max instead of `rowwise`
Jennit07 01cc1b4
improve `min_no_inf()`
Jennit07 2ff02bd
Use n_distinct(cij_marker)
Jennit07 435cd0f
deal with distinct(ch_chi_cis)
Jennit07 5a0b550
use n_distinct(ooh_case_id)
Jennit07 0da09b0
remove `find_non_duplicates`
Jennit07 a1d9c80
Merge branch 'main-R' into d01_2
Jennit07 faa0a96
Use dplyr::if_else()
Jennit07 979fc81
Fix typo in `ooh_covid_assessment`
Jennit07 6a57809
Move `ooh_case_id` to aggregate
Jennit07 83fbdcb
Use `slfhelper::ltc_vars`
Jennit07 8a761c0
Remove `clean_up_dob`
Jennit07 46a7b70
Update documentation
Jennit07 6424c95
[check-spelling] Update metadata
Moohan 89268dc
Use `start_next_fy_quarter` in place of rowwise
Jennit07 b6d93ed
Style code
Jennit07 d4e1d41
Use `compute_mid_year_age`
Jennit07 eac15ed
convert code into data.table for improving speed
Jennit07 4f6d6ff
Update `get_fy_dates`function
Jennit07 4c9134b
remove `date_from_fy`, use `get_fy_dates`
Jennit07 3730ee1
Update documentation
Jennit07 c9852b4
Remove `clean_up_postcode` function
Jennit07 3714bca
Remove non duplicates function/move to aggregate
Jennit07 15ae96a
Style code
Jennit07 e182a14
Update documentation
Jennit07 73852cc
Add time stamps to `create_individual_file`
Jennit07 a358cc5
Style code
Jennit07 ca0c7b6
remove `clean_up_postcode`
Jennit07 2cb8a24
Deal with ch cis episodes
Jennit07 fee2b46
Style code
Jennit07 ee36738
add .data$
Jennit07 feef2b6
Turn ch aggregate into a data table
Jennit07 da13d92
Style code
Jennit07 7fc40fa
use ch_chi_cis
Jennit07 45eeca0
remove `preventable_admissions` from aggregate
Jennit07 d89b0aa
exclude `hh_in_fy` for now
Jennit07 2326c0f
Style code
Jennit07 78d2c36
Test - exclude `sc_` vars from aggregate
Jennit07 3ac7d26
Style code
Jennit07 141c880
Exclude for now
Jennit07 93fcd43
exclude for now
Jennit07 baf5d13
Style code
Jennit07 3bf8fb7
automate `check_year_valid`
Jennit07 3e5a059
Return dummy file path for NSU not valid
Jennit07 bfeffc7
Style code
Jennit07 cfc0195
Merge branch 'main-R' into no_data_nsu
Jennit07 f42825c
Merge branch 'no_data_nsu' into d01_2
Jennit07 4aacf7a
Fix brackets in aggregate
Jennit07 5bf6a4b
TEMP - exclude variables
Jennit07 e045ccc
Use `phsmethods::sex_from_chi`
Jennit07 173ae02
Style code
Jennit07 e5332ee
Add ungroup()
Jennit07 cec63a3
lowercase dob
Jennit07 8a652df
Remove as.data.table
Jennit07 fc979d9
rewrite aggregate_by_chi with data.table
lizihao-anu 7c63f57
Style code
lizihao-anu 70f0891
minor changes
lizihao-anu 4e89a6b
Merge branch 'd01_2_zihao' of github.com:Public-Health-Scotland/sourc…
lizihao-anu abda3d5
Use the updated function
Moohan 38be4d2
to properly import data.table
lizihao-anu 6368535
remove redundant columns dob postcode and gpprac
lizihao-anu 2d70019
minor changes to remove redundant postcode gpprac columns
lizihao-anu 9f23cff
Style code
lizihao-anu b361616
rename columns with small letters
lizihao-anu a23388b
Merge branch 'd01_2_zihao' of github.com:Public-Health-Scotland/sourc…
lizihao-anu 550adab
Style code
lizihao-anu fee7d8b
newaggregate_ch_episodes
lizihao-anu a8f4ae2
Update documentation
lizihao-anu c601a64
Merge branch 'main-R' into d01_2
Moohan bf9fdfd
Merge branch 'd01_2' into d01_2_zihao
Moohan f37b276
Merge branch 'main-R' into d01_2
Moohan ae5eacd
Merge branch 'd01_2' into d01_2_zihao
Moohan cd8a08b
add functions to replace regular expressions to select column/variables
lizihao-anu 7415971
Merge branch 'd01_2_zihao' of github.com:Public-Health-Scotland/sourc…
lizihao-anu e03b02d
Update documentation
lizihao-anu 66e21e6
Style code
lizihao-anu f0fce5b
minor changes
lizihao-anu f1b96d1
add a missing variable, cij_delay
lizihao-anu f565922
Style code
lizihao-anu 5237f9e
add variables cij_delay, preventable_beddays
lizihao-anu bdfc0b4
add missing variables health_net_cost, health_net_costincdnas, and cm…
lizihao-anu 7d7296d
Merge branch 'd01_2_zihao' of github.com:Public-Health-Scotland/sourc…
lizihao-anu 7b288bd
Style code
lizihao-anu e907dd9
add more variables needed
lizihao-anu 4647904
Style code
lizihao-anu 9f8133b
Merge branch 'main-R' into d01_2
Moohan 5004a52
Merge branch 'd01_2' into d01_2_zihao
Moohan 45688c3
Update R/link_delayed_discharge_eps.R
Moohan b2676d4
Style code
Moohan 8048e68
amend costs
lizihao-anu 78197c6
Style code
lizihao-anu 4fd8ac4
Revert "amend costs"
Jennit07 b6a1e6f
Add DN and cij_delay back in
Jennit07 f32c7a2
fix the issue
lizihao-anu 99115ce
minor changes
lizihao-anu 04fe893
Style code
lizihao-anu b468271
remove running in chunks
lizihao-anu 55a075c
Style code
lizihao-anu b9fbf29
Update tests to include missing variables
Jennit07 74da47c
Remove unnecessary comma
Jennit07 79981a3
fix the bug of preventable_beddays
lizihao-anu fe72501
Merge branch 'd01_2_zihao' of github.com:Public-Health-Scotland/sourc…
lizihao-anu a029a10
Update documentation
lizihao-anu 71702e0
fix total ae_attendances
lizihao-anu 90511c3
Merge branches 'd01_2_zihao' and 'd01_2_zihao' of github.com:Public-H…
lizihao-anu 1667ff0
fix the bug of preventable_admissions
lizihao-anu 0a517d3
fix the bug of hbrescode etc
lizihao-anu 3fc60d1
Merge branch 'main-R' into d01_2_zihao
lizihao-anu 3b24326
minor fix
lizihao-anu 5dd9544
Merge branch 'd01_2_zihao' of https://github.com/Public-Health-Scotla…
lizihao-anu 4e4330c
minor fix
lizihao-anu 6bdd780
Style code
lizihao-anu 6720349
Merge branch 'create_individual_file_stable' into d01_2_zihao
lizihao-anu 06e1c7c
Fix some warnings being produced by the tests
Moohan e14ae02
Fix failing test
Moohan dc79a75
remove running in chunks
lizihao-anu 28289fa
Style code
lizihao-anu 5ad5c78
Update the targets config to use `timestamp_positives` as the default…
Moohan f6e04ce
fix the bug of preventable_beddays
lizihao-anu 6bcf2b2
Update documentation
lizihao-anu b0065c9
fix total ae_attendances
lizihao-anu 42f107a
fix the bug of preventable_admissions
lizihao-anu 9612b9a
fix the bug of hbrescode etc
lizihao-anu 4750913
minor fix
lizihao-anu 338479f
minor fix
lizihao-anu 724f319
Style code
lizihao-anu e9c8ef0
fix home care cost
lizihao-anu 9a951a5
add ipdc to fix maternity
lizihao-anu 4b63ee8
fix preventable addmission and care home cost
lizihao-anu c42d7ba
fix preventable_admissions and calculate preventable_beddays here
lizihao-anu f0671fc
add monthly_beddays and yearstay to dd
lizihao-anu 3246a69
Merge branch 'd01_2_zihao' of github.com:Public-Health-Scotland/sourc…
lizihao-anu 9cc84f3
Style code
lizihao-anu d6391e5
fix preventable_admissions and preventable_beddays
lizihao-anu 1a136fd
Style code
lizihao-anu c631f4e
include parameter for write to disk/year
Jennit07 390fbc0
Merge branch 'main-R' into join-lookups
Jennit07 c7b7bb8
Merge branch 'main-R' into create_individual_file_stable
Jennit07 b731676
Add lookups to indiv file creation pipeline
Jennit07 5507a33
include parameter for write to disk/year
Jennit07 e8f1099
fix delay discharge beddays and yearstay
lizihao-anu ff36479
Style code
lizihao-anu 23e8513
fix preventable issues
lizihao-anu 3022576
Style code
lizihao-anu 9a7b8e0
fix the issue of preventable stuff
lizihao-anu d264943
Style code
lizihao-anu 999afd8
Merge branch 'create_individual_file_stable' into d01_2_zihao
lizihao-anu 7433fb8
Update R/aggregate_by_chi_zihao.R
lizihao-anu 64e15b0
Merge branch 'master' into create_individual_file_stable
Moohan 288f417
Merge branch 'create_individual_file_stable' into join-lookups
Moohan 8f31277
Update documentation
Moohan b3f2d11
Fix minor typos
Moohan 1bc1d6c
[check-spelling] Update metadata
Moohan 31b3782
Remove some obsolete comments
Moohan a1371ed
Remove some unnecessary brackets
Moohan 64081c8
Reformat some code
Moohan 0800662
Use some `dplyr` functions for readability
Moohan a954611
Style code
Moohan 7a4c023
Merge branch 'master' into create_individual_file_stable
Moohan f9f6e8f
Join lookups onto individual file pipeline (#709)
Moohan 2777d81
Merge branch 'create_individual_file_stable' into d01_2_zihao
Moohan 689dac2
Update R/link_delayed_discharge_eps.R
Moohan fa6120d
Style code
Moohan b96b3b3
Merge branch 'master' into create_individual_file_stable
Moohan 388fa04
Some individual file fixes (#710)
Moohan 16d6d22
Remove some code which is no longer needed
Moohan 77ddd9e
Work out preventable admissions with similar indicators
Moohan 51a0767
Lowercase variable names
Moohan b17f806
Restore `cij_delay`
Moohan 12ec4f6
Restore DN variables
Moohan 33681d3
Tidy the code and use integers where possible
Moohan f9e6f81
Supply `year` as a parameter to `clean_up_ch`
Moohan cb73e0f
Supply `year` as a parameter to `clean_individual_file`
Moohan 42cc15e
Only keep required variables to save memory
Moohan 35a6ef2
Rename the parameter so the documentation works
Moohan 978d9e8
Use `setnames` to change names to lower
Moohan 9be6385
Remove unneeded code
Moohan 1ca4000
Update file path name
Moohan 3ebfecc
Trim the return code
Moohan beae36a
Some fixes
Moohan 13b7f11
Correctly compute `ooh_cases`
Moohan c03a0ee
Update documentation
Moohan b13fc13
Merge branch 'create_individual_file_stable' into d01_2_zihao
Moohan 0027576
Style code
Moohan bee0342
Merge branch 'master' into create_individual_file_stable
Moohan 61d02dc
Merge branch 'create_individual_file_stable' into d01_2_zihao
Moohan 60e3f3a
[check-spelling] Update metadata
Moohan 21859ad
Some more indiv file changes and fixes (#719)
Moohan bc75af8
Merge branch 'master' into create_individual_file_stable
Moohan 2d8e731
Merge branch 'master' into create_individual_file_stable
Moohan c8d86c5
Add targets for the individual file
Moohan 62c70c5
Fix missed pipe
Moohan 479d9db
Merge branch 'master' into create_individual_file_stable
Moohan 54252a1
Style code
Moohan 9ae871a
Update some targets to only run once a week
Moohan 486b51d
Make the deaths lookup unique
Moohan 5ad1928
Add `year` back to the individual file
Moohan c7acc86
Merge branch 'master' into create_individual_file_stable
Moohan 7103630
Merge branch 'master' into create_individual_file_stable
Moohan cf02089
Merge branch 'master' into create_individual_file_stable
Moohan 507fffe
Remove `cost_total_net_inc_dnas` from the indiv file (#737)
Moohan 292f4d8
Join slf lookups onto individual file (#724)
Jennit07 8bc5c4c
Merge branch 'master' into create_individual_file_stable
Moohan c644992
Join sc client variables onto individual file (#740)
Jennit07 7b24b8c
Merge branch 'master' into create_individual_file_stable
Moohan 1bb52aa
Update documentation
Moohan dabbf57
Output the individual file with `anon_chi` (#748)
Moohan 437c82b
Merge branch 'master' into create_individual_file_stable
Moohan
@@ -0,0 +1,215 @@
#' Aggregate by CHI
#'
#' @description Aggregate episode file by CHI to convert into
#' individual file.
#'
#' @importFrom data.table .N
#' @importFrom data.table .SD
#'
#' @inheritParams create_individual_file
aggregate_by_chi_zihao <- function(episode_file) {
  cli::cli_alert_info("Aggregate by CHI function started at {Sys.time()}")

  # Convert to data.table
  data.table::setDT(episode_file)

  # Ensure all variable names are lowercase
  data.table::setnames(episode_file, stringr::str_to_lower)

  # Sort the data
  data.table::setkeyv(
    episode_file,
    c(
      "chi",
      "record_keydate1",
      "keytime1",
      "record_keydate2",
      "keytime2"
    )
  )

  data.table::setnames(
    episode_file,
    c(
      "ch_chi_cis", "cij_marker", "ooh_case_id"
      # ,"hh_in_fy"
    ),
    c(
      "ch_cis_episodes", "cij_total", "ooh_cases"
      # ,"hl1_in_fy"
    )
  )

  # column specification, grouped by chi
  # columns to select last
  cols2 <- c(
    "postcode",
    "dob",
    "gpprac",
    vars_start_with(episode_file, "sc_")
  )
  # columns to count unique rows
  cols3 <- c(
    "ch_cis_episodes",
    "cij_total",
    "cij_el",
    "cij_non_el",
    "cij_mat",
    "cij_delay",
    "ooh_cases",
    "preventable_admissions"
  )
  # columns to sum up
  cols4 <- c(
    vars_end_with(
      episode_file,
      c(
        "episodes",
        "beddays",
        "cost",
        "attendances",
        "attend",
        "contacts",
        "hours",
        "alarms",
        "telecare",
        "paid_items",
        "advice",
        "homev",
        "time",
        "assessment",
        "other",
        "dn",
        "nhs24",
        "pcc",
        "_dnas"
      )
    ),
    vars_start_with(
      episode_file,
      "sds_option"
    ),
    "health_net_cost_inc_dnas"
  )
  cols4 <- cols4[!(cols4 %in% c("ch_cis_episodes"))]
  # columns to select maximum
  cols5 <- c("nsu", vars_contain(episode_file, c("hl1_in_fy")))
  data.table::setnafill(episode_file, fill = 0L, cols = cols5)
  # compute
  individual_file_cols1 <- episode_file[,
    .(gender = mean(gender)),
    by = "chi"
  ]
  individual_file_cols2 <- episode_file[,
    .SD[.N],
    .SDcols = cols2,
    by = "chi"
  ]
  individual_file_cols3 <- episode_file[,
    lapply(.SD, function(x) {
      data.table::uniqueN(x, na.rm = TRUE)
    }),
    .SDcols = cols3,
    by = "chi"
  ]
  individual_file_cols4 <- episode_file[,
    lapply(.SD, function(x) {
      sum(x, na.rm = TRUE)
    }),
    .SDcols = cols4,
    by = "chi"
  ]
  individual_file_cols5 <- episode_file[,
    lapply(.SD, function(x) max(x, na.rm = TRUE)),
    .SDcols = cols5,
    by = "chi"
  ]
  individual_file_cols6 <- episode_file[,
    .(
      preventable_beddays = ifelse(
        max(cij_ppa, na.rm = TRUE),
        max(cij_end_date) - min(cij_start_date),
        NA_real_
      )
    ),
    # cij_marker has been renamed as cij_total
    by = c("chi", "cij_total")
  ]
  individual_file_cols6 <- individual_file_cols6[,
    .(
      preventable_beddays = sum(preventable_beddays, na.rm = TRUE)
    ),
    by = "chi"
  ]

  individual_file <- dplyr::bind_cols(
    individual_file_cols1,
    individual_file_cols2[, chi := NULL],
    individual_file_cols3[, chi := NULL],
    individual_file_cols4[, chi := NULL],
    individual_file_cols5[, chi := NULL],
    individual_file_cols6[, chi := NULL]
  )

  # convert back to tibble
  return(dplyr::as_tibble(individual_file))
}


#' select columns ending with some patterns
#' @describeIn select columns based on patterns
vars_end_with <- function(data, vars, ignore_case = FALSE) {
  names(data)[stringr::str_ends(
    names(data),
    stringr::regex(paste(vars, collapse = "|"),
      ignore_case = ignore_case
    )
  )]
}

#' select columns starting with some patterns
#' @describeIn select columns based on patterns
vars_start_with <- function(data, vars, ignore_case = FALSE) {
  names(data)[stringr::str_starts(
    names(data),
    stringr::regex(paste(vars, collapse = "|"),
      ignore_case = ignore_case
    )
  )]
}

#' select columns contains some characters
#' @describeIn select columns based on patterns
vars_contain <- function(data, vars, ignore_case = FALSE) {
  names(data)[stringr::str_detect(
    names(data),
    stringr::regex(paste(vars, collapse = "|"),
      ignore_case = ignore_case
    )
  )]
}

#' Aggregate CIS episodes
#'
#' @description Aggregate CH variables by CHI and CIS.
#'
#' @inheritParams create_individual_file
aggregate_ch_episodes_zihao <- function(episode_file) {
  cli::cli_alert_info("Aggregate ch episodes function started at {Sys.time()}")

  # Convert to data.table
  data.table::setDT(episode_file)

  # Perform grouping and aggregation
  episode_file <- episode_file[, `:=`(
    ch_no_cost = max(ch_no_cost),
    ch_ep_start = min(record_keydate1),
    ch_ep_end = max(ch_ep_end),
    ch_cost_per_day = mean(ch_cost_per_day)
  ), by = c("chi", "ch_chi_cis")]

  # Convert back to tibble if needed
  episode_file <- tibble::as_tibble(episode_file)

  return(episode_file)
}
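The `vars_*` helpers in the diff above join their patterns with `|` into a single regular expression and match it against the column names. The following is a minimal base R sketch of the "ends with" case, assuming a hypothetical helper name `select_names_ending_with()` and invented column names; it uses `grepl()` instead of `stringr`, so it illustrates the idea rather than reproducing the package code.

```r
# Hypothetical helper (not part of the PR): keep names ending with any suffix.
select_names_ending_with <- function(col_names, suffixes, ignore_case = FALSE) {
  # Join the suffixes into one alternation and anchor the group at the end.
  pattern <- paste0("(", paste(suffixes, collapse = "|"), ")$")
  col_names[grepl(pattern, col_names, ignore.case = ignore_case)]
}

# Toy column names, invented for illustration only.
toy_names <- c(
  "chi", "acute_episodes", "acute_cost", "ch_cis_episodes", "cost_centre"
)

selected <- select_names_ending_with(toy_names, c("episodes", "cost"))
# "cost_centre" is dropped because "cost" is not at the end of the name.
# "ch_cis_episodes" is kept because it ends with "episodes".
print(selected)
```

Suffix matching of this kind also picks up `ch_cis_episodes`, which may be why the diff removes that name from `cols4` after building it.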
@lizihao-anu I don't think we need the zoo package? I assume this was used in some of the 'original code'.
Can you confirm, and also remove `aggregate_ch_episodes` and `aggregate_by_chi`, and rename 'your' data.table versions to take their place, as I think we're happy they work and do a better / faster job.
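For readers unfamiliar with the data.table versions this comment refers to, the grouped pattern used in `aggregate_by_chi_zihao` (`.SD` with `.SDcols` and `by`) can be sketched on toy data. The table and its values below are invented for illustration. Combining the pieces with `merge()` on `chi` is an alternative shown here as an assumption; the PR itself binds the columns with `dplyr::bind_cols`.

```r
library(data.table)

# Toy episode-level data, invented for illustration only.
episodes <- data.table(
  chi = c("A", "A", "B", "B", "B"),
  cij_marker = c(1L, 1L, 1L, 2L, NA),
  acute_cost = c(100, 50, 20, NA, 30)
)

# Sum the cost column per person, ignoring missing values.
sums <- episodes[,
  lapply(.SD, function(x) sum(x, na.rm = TRUE)),
  .SDcols = "acute_cost",
  by = "chi"
]

# Count distinct non-missing stay markers per person.
counts <- episodes[,
  lapply(.SD, function(x) uniqueN(x, na.rm = TRUE)),
  .SDcols = "cij_marker",
  by = "chi"
]

# Join on the key rather than relying on both results sharing a row order.
individual <- merge(sums, counts, by = "chi")
print(individual)
```

Joining on `chi` costs a little more than binding columns, but it does not depend on every grouped result returning the groups in the same order.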