Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

review deaths dates processing #987

Merged
merged 17 commits into from
Aug 6, 2024
Merged

review deaths dates processing #987

merged 17 commits into from
Aug 6, 2024

Conversation

Jennit07
Copy link
Collaborator

@Jennit07 Jennit07 commented Jul 29, 2024

Opening this as a draft PR for now to fix #986

Previously, the IT deaths extract was using the nrs weekly date but if this was missing, it would use the chi death date. We agreed as a team that this was wrong and the boxi nrs deaths date was the most reliable. This was the previous code in process_it_chi_deaths:

death_date = dplyr::coalesce(.data$death_date_nrs, .data$death_date_chi)

I have removed this and selected chi and death_date_chi so that we do not use the weekly nrs death date provided by IT. This is unvalidated and found to be unreliable. Instead, the IT chi death date is left with no modification and has two variables: chi and death_date_chi.

Later in the process we create a deaths lookup which is one row per chi for deaths that occurred within the financial year. This is used to match onto the episode file later in the process. By default, this was set up to use the boxi nrs death date which is submitted monthly and is validated. We agreed this should be the correct method going forward. However, this does not take into account the chi date. I have updated this in a way that:

if the BOXI NRS date does not match the chi death date, use the chi death date here is the code which does this:

    dplyr::mutate(
      death_date = dplyr::if_else(.data$record_keydate1 != .data$death_date_chi,
        .data$death_date_chi, .data$record_keydate1
      ),
      deceased = TRUE,
      .keep = "unused"
    )

When i checked this with 1920 data, there was only one case which changed from 31-03-2020 to 01-04-2020. This was not one of the chi's which had a 40year difference. The 40 year difference mainly was in the weekly nrs data.

We possibly need to review this to say use the chi date if X time between the boxi nrs date and chi date

Another thing to check the methodology is on care home data which is explicitly using the nrs weekly data but if this is not available then use chi data. This is outstanding on this PR. The changes should be the same from the deaths lookup

TO DO:

  • check time difference between boxi nrs and chi death dates
  • check care home methodology to reflect new changes

opening a draft for now to see the changes.

Jennit07 and others added 5 commits July 26, 2024 16:46
This uses the NRS Weekly dates and if this is blank use the chi death date. This methodology is wrong. We want to use the monthly nrs boxi date by default and chi date if there is an issue

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

@lizihao-anu lizihao-anu self-requested a review July 29, 2024 14:25

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

@lizihao-anu lizihao-anu marked this pull request as ready for review August 5, 2024 10:54
Copy link

github-actions bot commented Aug 5, 2024

@check-spelling-bot Report

🔴 Please review

See the 📂 files view, the 📜action log, or 📝 job summary for details.

Unrecognized words (14)
ADPE
anomymous
Canx
CCYY
CNWs
Comhairle
communicty
datas
LCHO
Matern
Mcbride
PPAs
rbindlist
SMRA
Previously acknowledged words that are now absent adpe canx ccyy cnws comhairle lcho matern mcbride ppas returnsthe smra 🫥
To accept these unrecognized words as correct and remove the previously acknowledged and now absent words, you could run the following commands

... in a clone of the [email protected]:Public-Health-Scotland/source-linkage-files.git repository
on the it_deaths branch (ℹ️ how do I use this?):

curl -s -S -L 'https://raw.githubusercontent.com/check-spelling/check-spelling/main/apply.pl' |
perl - 'https://github.com/Public-Health-Scotland/source-linkage-files/actions/runs/10247484327/attempts/1'

OR

To have the bot accept them for you, reply quoting the following line:
@check-spelling-bot apply updates.

Available 📚 dictionaries could cover words (expected and unrecognized) not in the 📘 dictionary

This includes both expected items (307) from .github/actions/spelling/expect.txt and unrecognized words (14)

Dictionary Entries Covers Uniquely
cspell:fullstack/dict/fullstack.txt 419 3 3
cspell:k8s/dict/k8s.txt 153 4 1
cspell:php/dict/php.txt 1689 4
cspell:node/dict/node.txt 891 3 1
cspell:npm/dict/npm.txt 302 3

Consider adding them (in .github/workflows/spelling.yml) for uses: check-spelling/check-spelling@main in its with:

      with:
        extra_dictionaries:
          cspell:fullstack/dict/fullstack.txt
          cspell:k8s/dict/k8s.txt
          cspell:php/dict/php.txt
          cspell:node/dict/node.txt
          cspell:npm/dict/npm.txt

To stop checking additional dictionaries, add (in .github/workflows/spelling.yml) for uses: check-spelling/check-spelling@main in its with:

check_extra_dictionaries: ''
Errors (1)

See the 📂 files view, the 📜action log, or 📝 job summary for details.

❌ Errors Count
❌ ignored-expect-variant 3

See ❌ Event descriptions for more information.

If the flagged items are 🤯 false positives

If items relate to a ...

  • binary file (or some other file you wouldn't want to check at all).

    Please add a file path to the excludes.txt file matching the containing file.

    File paths are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your files.

    ^ refers to the file's path from the root of the repository, so ^README\.md$ would exclude README.md (on whichever branch you're using).

  • well-formed pattern.

    If you can write a pattern that would match it,
    try adding it to the patterns.txt file.

    Patterns are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your lines.

    Note that patterns can't match multiline strings.

@lizihao-anu
Copy link
Contributor

Should be ready for review. @Jennit07 @SwiftySalmon

Copy link
Collaborator

@SwiftySalmon SwiftySalmon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems fine

@lizihao-anu lizihao-anu merged commit 03b41c3 into september-2024 Aug 6, 2024
3 of 4 checks passed
@lizihao-anu lizihao-anu deleted the it_deaths branch August 6, 2024 15:52
@lizihao-anu lizihao-anu mentioned this pull request Sep 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants