Updated SNOMED Codes #1474

Open: wants to merge 1 commit into base: master

florim14

Pull Request Description: Updated SNOMED Codes

Overview

This pull request aims to update the SNOMED codes across various Synthea modules, ensuring they are current and active. The updates were made using a script specifically developed for this purpose.

Changes Made

  1. Script Development: A custom script was created to review and update the SNOMED codes within the Synthea modules. The script performed the following tasks:

    • Identified inactive or outdated SNOMED codes.
    • Checked for active versions of these codes. If none was found, searched by the display and the semantic tag (if the display has one) and accepted a candidate only if its display was at least 70% similar.
    • Updated the modules with the active SNOMED codes.
  2. Code Updates:

    • The script successfully updated all SNOMED codes across the modules, except for 12 codes for which no active matches were found.
    • Out of these 12 unmatched codes, 11 are located in the "TNM_Diagnosis" module file.
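
The fallback matching step described above can be sketched as follows. This is only an illustration, not the script used for this pull request: the function names, the `(code, display)` candidate format, and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions.

```python
import re
from difflib import SequenceMatcher

# Matches a trailing SNOMED-style semantic tag such as " (procedure)".
SEMANTIC_TAG = re.compile(r"\s*\(([^()]+)\)\s*$")


def split_semantic_tag(display):
    """Split 'Term (tag)' into ('Term', 'tag'); tag is None when absent."""
    match = SEMANTIC_TAG.search(display)
    if not match:
        return display.strip(), None
    return display[:match.start()].strip(), match.group(1).lower()


def similarity(a, b):
    """Case-insensitive similarity ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_active_match(display, candidates, threshold=0.70):
    """Return (code, display, score) of the best active candidate, or None.

    `candidates` is a hypothetical iterable of (code, display) pairs for
    active concepts, e.g. the result of a terminology server search.
    """
    term, tag = split_semantic_tag(display)
    best = None
    for code, cand_display in candidates:
        cand_term, cand_tag = split_semantic_tag(cand_display)
        # Only compare within the same semantic tag when the source has one.
        if tag is not None and cand_tag != tag:
            continue
        score = similarity(term, cand_term)
        if score >= threshold and (best is None or score > best[2]):
            best = (code, cand_display, score)
    return best
```

A purely textual threshold like this cannot tell whether a candidate has the same clinical meaning, so its output still needs human review.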

Benefits

  • Ensures all SNOMED codes used in the Synthea modules are active, improving the integrity and reliability of the synthetic data generated.
  • The developed script (once it is published) can be reused for future updates, simplifying the maintenance process for SNOMED codes in Synthea modules.

Remaining Issues

  • Unmatched Codes:
    • 12 SNOMED codes could not be matched with active versions.
    • 11 of these unmatched codes are located in the "TNM_Diagnosis" module.

Future Work

  • Script Contribution: While this pull request focuses on the updated SNOMED codes, the script used for these updates will be shared in a future contribution. This script will assist other users in maintaining up-to-date SNOMED codes within their Synthea modules.

Thank you for considering this pull request.

…ctive, and could not find the right ones to update
@jawalonoski
Member

@florim14 thank you for the pull request.

I wonder how much your script differs from the current code display update script: https://github.com/synthetichealth/synthea/blob/master/src/main/javascript/update_code_display.js

Would you mind separating this pull request into two parts?

  1. PR with no SNOMED code changes except for an updated display value
  2. PR with updated SNOMED codes (replacing of inactive codes)

The first PR is easy to test and merge, the second requires detailed review.

@florim14
Author

florim14 commented Jun 20, 2024 via email

@jawalonoski
Member

The problem is that after perusing the changes for only a few minutes, a colleague and I have found that some of the code changes you've made are wrong.

Having two sets of changes makes it easier to differentiate what is merely a change on the preferred display value (changing the display value does not change the clinical meaning of the concept), versus all the changed codes that may refer to new or different concepts -- and those latter changes we have to manually verify and check, in some cases consulting with clinicians.
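
The requested split can be sketched as a simple partition of change records. The record layout (`old_code`, `new_code`, `old_display`, `new_display`) is a hypothetical one chosen for illustration; the codes and displays in the usage lines are the ones quoted elsewhere in this thread.

```python
def split_changes(changes):
    """Partition change records into display-only changes and code changes.

    Each record is a dict with the hypothetical keys 'old_code', 'new_code',
    'old_display' and 'new_display'.
    """
    display_only, code_changes = [], []
    for change in changes:
        if change["old_code"] == change["new_code"]:
            # Same concept, only the preferred display differs: safe to merge quickly.
            if change["old_display"] != change["new_display"]:
                display_only.append(change)
        else:
            # Different concept: needs manual, possibly clinical, review.
            code_changes.append(change)
    return display_only, code_changes


changes = [
    {"old_code": "104173009", "new_code": "104173009",
     "old_display": "Sputum Culture",
     "new_display": "Microbial culture of sputum (procedure)"},
    {"old_code": "316744009", "new_code": "61488002",
     "old_display": "Office Visit",
     "new_display": "Physical medicine initial examination for orthotic program (procedure)"},
]
display_only, code_changes = split_changes(changes)
```

Here the first record lands in `display_only` and the second in `code_changes`, which corresponds to the two pull requests asked for above.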

@dehall
Contributor

dehall commented Jun 20, 2024

As one example, there is a change in the dementia module: 316744009 "Office Visit" was changed in a few places to 61488002 "Physical medicine initial examination for orthotic program (procedure)". Just at a glance, orthotics isn't really relevant to the dementia module, so this change isn't correct. But digging further into what happened, the display "Office Visit" was wrong in the first place: this was a really old code, and the display for that old code should have been "Persons encountering health services in circumstances related to reproduction". So we really need to look closely at any code changes. There are probably more instances like this one where the old code was wrong or had the wrong display, and so an automated process isn't going to produce a good result.

That's not to say a script to make those changes is wrong or bad. This code in the dementia module definitely needs to be changed, and it's nice that a script can highlight that, but those changes will always need a close look.
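
One way a script could highlight cases like this is to compare the display stored in the module with the official display of the same code, and flag the code for manual review when the two barely resemble each other. The sketch below is an assumption-laden illustration: the function name and the 0.5 threshold are hypothetical, and `difflib.SequenceMatcher` is just one possible similarity measure.

```python
from difflib import SequenceMatcher


def display_disagrees(module_display, official_display, threshold=0.5):
    """True when the module's display barely resembles the official display.

    A low ratio suggests the code was wrong from the start, so any automatic
    replacement based on the code alone should be reviewed by a person.
    The threshold is a hypothetical value, not a validated one.
    """
    ratio = SequenceMatcher(
        None, module_display.lower(), official_display.lower()
    ).ratio()
    return ratio < threshold


flagged = display_disagrees(
    "Office Visit",
    "Persons encountering health services in circumstances related to reproduction",
)
```

For the dementia example above, `flagged` is `True`, because the two strings share very little text.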

@florim14
Author

florim14 commented Jun 25, 2024 via email

@dehall
Contributor

dehall commented Jun 27, 2024

@florim14 Ok, I think it might be easier to split this into 2 PRs: one with just the display changes that we can quickly merge, and one with the code changes, for which we can use the GitHub review features. But if you prefer the separate JSON file, I guess that's fine.

@florim14
Author

florim14 commented Jun 27, 2024 via email

@florim14
Author

florim14 commented Jun 27, 2024 via email

@dehall
Contributor

dehall commented Jul 8, 2024

@florim14 Apologies for the delay, I was out a lot of last week with the holiday. It looks like the files didn't attach on github. Can you try uploading them to a comment here on the PR? #1474

@florim14
Author

florim14 commented Jul 8, 2024

@dehall - No worries. I have attached the mentioned files; let me know if any adjustments need to be made:

not_founded_codes.json
replaced_codes.json
unique_replaced_codes.json

@florim14
Author

florim14 commented Sep 9, 2024

@dehall - I wanted to follow up on my last comment. Is there any news regarding the review of the updated SNOMED codes?

@dehall
Contributor

dehall commented Sep 9, 2024

Ahhh, again @florim14 my sincerest apologies for the delay on this. I no longer have time dedicated to Synthea support, so this unfortunately fell through the cracks. I took an initial look at the changed codes when you first posted them and agreed with nearly all the changes, but there were a couple I wanted to look at more closely. I'll make sure to get you an update by tomorrow at the latest.

@dehall
Contributor

dehall commented Sep 10, 2024

Ok I finally took a closer look at the replaced codes -- see attached in CSV format: codes_review.csv
This only includes the changed codes, I'm assuming the changed displays are all fine

In general the replacements look good, but some of them introduce what I'll call an increase in specificity that's incorrect for the context the code is used in. For example, for 104173009 our original display was "Sputum Culture", while the official display is "Microbial culture of sputum (procedure)". It was changed to 104184002 "Sputum culture for mycobacterium (procedure)", which is a child code of the original, so it is a more specific code that I don't think applies where it's used in the cystic fibrosis module. (And in this case the original code seems to still be valid anyway?)
Let me know if you disagree with any of these. In terms of implementation, again I'd suggest reverting the ones we're not sure about and keeping the rest, so we can merge the part we do feel good about.
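
The "increase in specificity" case could be flagged automatically by checking whether a replacement code is a descendant of the code it replaces. The sketch below assumes a hypothetical `parents` mapping from each code to its direct parent codes (for example, built from SNOMED "is a" relationships); the single entry in the usage lines reflects only the parent/child relationship stated in this comment.

```python
def is_descendant(code, ancestor, parents):
    """True if `ancestor` is reachable from `code` by following parent links.

    `parents` is a hypothetical dict: code -> iterable of direct parent codes.
    """
    seen = set()
    stack = [code]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent == ancestor:
                return True
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return False


# Relationship taken from the comment above: 104184002 is a child of 104173009.
parents = {"104184002": ["104173009"]}
more_specific = is_descendant("104184002", "104173009", parents)
```

A replacement for which this check returns `True` narrows the meaning of the original code, so it is a candidate for the "needs review" list rather than an automatic change.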

@florim14
Author

@dehall - thank you for your response. One thing I can do is modify my script so that it first checks whether the code is still active; if it is, the script keeps the code and automatically replaces only the display with the correct one. I can then put the codes changed this way (more precisely, their displays) in a separate file, so we can review whether the changes make sense. If you agree, I will modify the script and send you the new files.
Moreover, I will also add a condition to ignore the codes you marked as "Needs review" in the CSV file you sent, so we can check them separately.
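
The proposed order of checks can be sketched as a small triage function. Everything here is an assumption made for illustration: the function name, the outcome labels, and the three lookup structures (`active_displays`, `replacements`, `needs_review`) are hypothetical and are not taken from the actual script.

```python
def triage(code, display, active_displays, replacements, needs_review):
    """Decide what to do with one (code, display) pair from a module.

    active_displays: dict of active code -> official display (hypothetical)
    replacements:    dict of inactive code -> (new_code, new_display) (hypothetical)
    needs_review:    set of codes marked "Needs review" in the reviewer's CSV
    Returns (outcome, code, display).
    """
    # Codes the reviewer flagged are left untouched for manual discussion.
    if code in needs_review:
        return ("ignored", code, display)
    # Active codes keep their code; only a wrong display is corrected.
    if code in active_displays:
        official = active_displays[code]
        if official != display:
            return ("display_replaced", code, official)
        return ("unchanged", code, display)
    # Inactive codes are replaced only when a replacement is known.
    if code in replacements:
        new_code, new_display = replacements[code]
        return ("code_replaced", new_code, new_display)
    return ("not_found", code, display)
```

Checking the "Needs review" set first means a flagged code is never changed automatically, even if it is still active or has a known replacement.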

@dehall
Contributor

dehall commented Sep 11, 2024

Yes that sounds good

@florim14
Author

@dehall - I have updated my script to implement the changes we discussed. I am attaching the following files:

  • ignored_codes.json - the codes we decided to ignore for now; we can discuss together what to replace them with
  • replaced_displays.json - the changed displays for codes that were still active but had the wrong display. unique_replaced_displays.json is a condensed version that includes only the unique displays
  • replaced_codes.json - the changed codes, including the code, replaced code, display, replaced display, and the file the code was taken from. unique_replaced_codes.json is a condensed version that includes only the unique codes and replaced codes
  • not_founded_codes.json - the inactive codes for which no replacement was found. We can also go through these together to decide which codes to replace them with

Let me know if you agree with these changes and whether you want to proceed with the next step.

@dehall
Contributor

dehall commented Sep 20, 2024

Sounds good, I'm currently on travel but will take a look when I'm back 9/30

@florim14
Author

florim14 commented Sep 20, 2024 via email

3 participants