From 2e8d63d61d6b153efc01761fab7e61687e7b7302 Mon Sep 17 00:00:00 2001
From: Rhian Davies
Date: Tue, 12 Nov 2024 14:42:41 +0000
Subject: [PATCH 1/6] =?UTF-8?q?=F0=9F=93=9D=20Update=20data=20log?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 data_extraction/data_log.qmd | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/data_extraction/data_log.qmd b/data_extraction/data_log.qmd
index 6222fbb..16280cb 100644
--- a/data_extraction/data_log.qmd
+++ b/data_extraction/data_log.qmd
@@ -5,6 +5,28 @@ order: 250
 
 This page contains information about changes to the data underpinning the NHP model. If there are no changes logged for a particular version (such as 1.0) then the changes were only to the model code, and not to the underlying data.
 
+## Version 3.0
+
+Date updated: 13/11/2024
+
+As part of migrating our data processing from SQL to databricks, we have a made a number of changes to the data.
+
+Inpatient data:
+ - Include records where `patientid` is `NULL`
+ - Include readmission within 28 days for well babies
+Inpatient mitigators:
+ - ambulatory emergency care: change the way we extract to use a more stable and workable code list
+ - evidence based interventions: update code lists
+ - medicines related admissions: fix bug in old SQL (explicit codes were not excluded from implicit properly)
+ - pre-op_los_1-day: Use an [updated procedure list](https://github.com/The-Strategy-Unit/nhp_data/blob/9cd4495172b5aa63e57b29ac9172d6d512a311b9/generate_inpatients.py#L27)
+- Outpatients data
+ - Keep where `sitetret=null`
+- AAE data
+ - Include followups (flagged by `atentype.isin([2, 3, 4])`)
+ - [Include new `acuity` column](https://github.com/The-Strategy-Unit/nhp_data/commit/b3e2ef9acf40b18f5b079533e364f74fd792e1a0#diff-e2401c3d40d30e706413f4a8efac282b166111cc6b7a44ec88c952a801fe4dc6R171) which is a more human readble version of [URGENT AND EMERGENCY CARE ACUITY (SNOMED CT)](https://www.datadictionary.nhs.uk/data_elements/urgent_and_emergency_care_acuity__snomed_ct_.html)
+
+
+
 ## Version 2.2
 
 Date updated: 24/09/2024

From 2d7a663d8c65868018caf85cd30e56b7e435ae92 Mon Sep 17 00:00:00 2001
From: Rhian Davies
Date: Tue, 12 Nov 2024 23:04:37 +0000
Subject: [PATCH 2/6] =?UTF-8?q?=F0=9F=93=9D=20Link=20to=20ambulatory=20car?=
 =?UTF-8?q?e=20changes?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Co-authored-by: YiWen Hon
---
 data_extraction/data_log.qmd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/data_extraction/data_log.qmd b/data_extraction/data_log.qmd
index 16280cb..d6576ac 100644
--- a/data_extraction/data_log.qmd
+++ b/data_extraction/data_log.qmd
@@ -15,7 +15,7 @@ Inpatient data:
  - Include records where `patientid` is `NULL`
  - Include readmission within 28 days for well babies
 Inpatient mitigators:
- - ambulatory emergency care: change the way we extract to use a more stable and workable code list
+ - ambulatory emergency care: change the way we extract to use a more [stable and workable code list](https://github.com/The-Strategy-Unit/nhp_data/blob/main/mitigators/ip/activity_avoidance/ambulatory_care_conditions.py)
  - evidence based interventions: update code lists
  - medicines related admissions: fix bug in old SQL (explicit codes were not excluded from implicit properly)
  - pre-op_los_1-day: Use an [updated procedure list](https://github.com/The-Strategy-Unit/nhp_data/blob/9cd4495172b5aa63e57b29ac9172d6d512a311b9/generate_inpatients.py#L27)

From a419f4d43f553bbe74e8eca3a6f885634d7969fc Mon Sep 17 00:00:00 2001
From: Rhian Davies
Date: Tue, 12 Nov 2024 23:21:06 +0000
Subject: [PATCH 3/6] =?UTF-8?q?=F0=9F=93=9D=20Link=20to=20data=20processin?=
 =?UTF-8?q?g=20repos?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 data_extraction/data_log.qmd | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/data_extraction/data_log.qmd b/data_extraction/data_log.qmd
index d6576ac..0998eff 100644
--- a/data_extraction/data_log.qmd
+++ b/data_extraction/data_log.qmd
@@ -8,8 +8,9 @@ This page contains information about changes to the data underpinning the NHP mo
 ## Version 3.0
 
 Date updated: 13/11/2024
+We have migrated our data pre-processing pipeline from SQL to Databricks. The new data pre-processing scripts are available in the public repository [nhp_data](https://github.com/The-Strategy-Unit/nhp_data). This repository supersedes the [nhp_sql](https://github.com/The-Strategy-Unit/nhp_sql) repositiory which will be publically archived and no longer maintained.
 
-As part of migrating our data processing from SQL to databricks, we have a made a number of changes to the data.
+As part of migrating our data pre-processing, there are a number of changes to the data.
 
 Inpatient data:
  - Include records where `patientid` is `NULL`

From b5635aa0452099e0c4982a1e8b66d7b9de18fd18 Mon Sep 17 00:00:00 2001
From: Rhian Davies
Date: Wed, 13 Nov 2024 12:10:22 +0000
Subject: [PATCH 4/6] =?UTF-8?q?=F0=9F=90=9B=20Document=20bug=20fix=20for?=
 =?UTF-8?q?=20has=5Fprocedures?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 data_extraction/data_log.qmd | 1 +
 1 file changed, 1 insertion(+)

diff --git a/data_extraction/data_log.qmd b/data_extraction/data_log.qmd
index 0998eff..f84b4e8 100644
--- a/data_extraction/data_log.qmd
+++ b/data_extraction/data_log.qmd
@@ -15,6 +15,7 @@ As part of migrating our data pre-processing, there are a number of changes to t
 Inpatient data:
  - Include records where `patientid` is `NULL`
  - Include readmission within 28 days for well babies
+ - Fixed bug in `has_procedure` which was failing to filter codes beginning with `U`, `Y` or `Z`
 Inpatient mitigators:
  - ambulatory emergency care: change the way we extract to use a more [stable and workable code list](https://github.com/The-Strategy-Unit/nhp_data/blob/main/mitigators/ip/activity_avoidance/ambulatory_care_conditions.py)
  - evidence based interventions: update code lists

From 37f2e986f0d68dc3453f2f6d48586d505e3c9c7f Mon Sep 17 00:00:00 2001
From: YiWen Hon
Date: Thu, 14 Nov 2024 09:07:20 +0000
Subject: [PATCH 5/6] Update data_extraction/data_log.qmd

Co-authored-by: Tom Jemmett
---
 data_extraction/data_log.qmd | 1 +
 1 file changed, 1 insertion(+)

diff --git a/data_extraction/data_log.qmd b/data_extraction/data_log.qmd
index f84b4e8..552744b 100644
--- a/data_extraction/data_log.qmd
+++ b/data_extraction/data_log.qmd
@@ -16,6 +16,7 @@ Inpatient data:
  - Include records where `patientid` is `NULL`
  - Include readmission within 28 days for well babies
  - Fixed bug in `has_procedure` which was failing to filter codes beginning with `U`, `Y` or `Z`
+ - Added a new flag `maternity_delivery_in_spell`, which looks to see if any episode in the spell had `maternity_episode_type=1`, the same [logic used to create the official published statistics on delivery episodes](https://digital.nhs.uk/data-and-information/publications/statistical/nhs-maternity-statistics) (applied at spell end level)
 Inpatient mitigators:
  - ambulatory emergency care: change the way we extract to use a more [stable and workable code list](https://github.com/The-Strategy-Unit/nhp_data/blob/main/mitigators/ip/activity_avoidance/ambulatory_care_conditions.py)
  - evidence based interventions: update code lists

From b70a95ef784a00208f0fea2ba7506b22fd4765b4 Mon Sep 17 00:00:00 2001
From: YiWen Hon
Date: Thu, 14 Nov 2024 09:38:44 +0000
Subject: [PATCH 6/6] address review points, remove readmission 28 days

---
 data_extraction/data_log.qmd | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/data_extraction/data_log.qmd b/data_extraction/data_log.qmd
index 552744b..ba9e49c 100644
--- a/data_extraction/data_log.qmd
+++ b/data_extraction/data_log.qmd
@@ -14,19 +14,20 @@ As part of migrating our data pre-processing, there are a number of changes to t
 
 Inpatient data:
  - Include records where `patientid` is `NULL`
- - Include readmission within 28 days for well babies
  - Fixed bug in `has_procedure` which was failing to filter codes beginning with `U`, `Y` or `Z`
  - Added a new flag `maternity_delivery_in_spell`, which looks to see if any episode in the spell had `maternity_episode_type=1`, the same [logic used to create the official published statistics on delivery episodes](https://digital.nhs.uk/data-and-information/publications/statistical/nhs-maternity-statistics) (applied at spell end level)
 Inpatient mitigators:
- - ambulatory emergency care: change the way we extract to use a more [stable and workable code list](https://github.com/The-Strategy-Unit/nhp_data/blob/main/mitigators/ip/activity_avoidance/ambulatory_care_conditions.py)
- - evidence based interventions: update code lists
- - medicines related admissions: fix bug in old SQL (explicit codes were not excluded from implicit properly)
- - pre-op_los_1-day: Use an [updated procedure list](https://github.com/The-Strategy-Unit/nhp_data/blob/9cd4495172b5aa63e57b29ac9172d6d512a311b9/generate_inpatients.py#L27)
+ - ambulatory emergency care: Change the way we extract to use a more [stable and workable code list](https://github.com/The-Strategy-Unit/nhp_data/blob/main/mitigators/ip/activity_avoidance/ambulatory_care_conditions.py)
+ - evidence based interventions: Update code lists to reflect latest evidence
+ - medicines related admissions: Fix bug in old SQL (explicit medicines related codes were not excluded from implicit queries properly)
+ - pre-op_los mitigators: Use an [updated procedure list](https://github.com/The-Strategy-Unit/nhp_data/blob/9cd4495172b5aa63e57b29ac9172d6d512a311b9/generate_inpatients.py#L27)
+ - alcohol_partially_attributable: Errors in the old code list meant some codes were missed, and some codes were not properly distinguishing between the mortality/morbidity cases
+ - excess_beddays: An error in the way we handled the old csv meant some NA values were treated as 0, flagging most activity as being an excess bedday
 - Outpatients data
  - Keep where `sitetret=null`
 - AAE data
  - Include followups (flagged by `atentype.isin([2, 3, 4])`)
- - [Include new `acuity` column](https://github.com/The-Strategy-Unit/nhp_data/commit/b3e2ef9acf40b18f5b079533e364f74fd792e1a0#diff-e2401c3d40d30e706413f4a8efac282b166111cc6b7a44ec88c952a801fe4dc6R171) which is a more human readble version of [URGENT AND EMERGENCY CARE ACUITY (SNOMED CT)](https://www.datadictionary.nhs.uk/data_elements/urgent_and_emergency_care_acuity__snomed_ct_.html)
+ - [Include new `acuity` column](https://github.com/The-Strategy-Unit/nhp_data/commit/b3e2ef9acf40b18f5b079533e364f74fd792e1a0#diff-e2401c3d40d30e706413f4a8efac282b166111cc6b7a44ec88c952a801fe4dc6R171) which is a more human readable version of [URGENT AND EMERGENCY CARE ACUITY (SNOMED CT)](https://www.datadictionary.nhs.uk/data_elements/urgent_and_emergency_care_acuity__snomed_ct_.html)
 
 
 
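
Reviewer note on the `excess_beddays` entry in PATCH 6/6: the sketch below illustrates how reading `NA` as 0 could flag most activity as excess bed days. It is not the nhp_data implementation. The CSV layout, the column names `hrg` and `trimpoint_days`, the idea that the `NA` values were trim points, and both helper functions are assumptions made purely for illustration.

```python
import csv
import io

# Hypothetical lookup (not taken from nhp_data): "NA" means no trim point is defined.
TRIMPOINTS_CSV = "hrg,trimpoint_days\nAA22C,15\nBB11Z,NA\n"


def load_trimpoints(text, na_as_zero=False):
    """Parse the lookup; na_as_zero=True mimics the kind of error described in the log."""
    trimpoints = {}
    for row in csv.DictReader(io.StringIO(text)):
        raw = row["trimpoint_days"]
        if raw == "NA":
            # Faulty path stores 0; corrected path keeps the value missing.
            trimpoints[row["hrg"]] = 0 if na_as_zero else None
        else:
            trimpoints[row["hrg"]] = int(raw)
    return trimpoints


def excess_beddays(length_of_stay, trimpoint):
    """Days beyond the trim point; zero when no trim point is defined."""
    if trimpoint is None:
        return 0
    return max(length_of_stay - trimpoint, 0)


faulty = load_trimpoints(TRIMPOINTS_CSV, na_as_zero=True)
fixed = load_trimpoints(TRIMPOINTS_CSV)
print(excess_beddays(5, faulty["BB11Z"]))  # 5: every day of the stay counts as excess
print(excess_beddays(5, fixed["BB11Z"]))   # 0: no trim point, so nothing is flagged
```

With a trim point of 0, any stay longer than zero days exceeds it, which is consistent with the log's description of most activity being flagged.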