Mismatch in compute_feature_label_alignment #18

Open
martijnsiepel01 opened this issue Oct 14, 2024 · 0 comments

Hi!

I am trying to write a custom labeler for the task of predicting hospital-acquired infections (HAIs, infections that occur 48+ hours after admission). Previously, I created custom labelers for different SNOMED codes and ran the ehrshot benchmark code using femr 0.0.20, which worked fine. I used a simple custom implementation of the FirstDiagnosisTimeHorizonCodeLabeler.

However, for an HAI, I want to move the prediction time to 48 hours after admission by adding time to the admission time (similar to how the x-ray task moves the prediction time to some time before the result gets registered). I attempted this in two ways: once with a custom implementation of Labeler, and once with a custom implementation of WithinVisitLabeler.
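As a rough illustration of the intended shift, here is a minimal sketch in plain Python. It does not use the femr API: the function `hai_label`, the constant `HAI_OFFSET`, and the rule that a visit shorter than 48 hours gets no label are all assumptions made for this example, not the code in the attached labelers.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

# HAI definition from the text: infections that occur 48+ hours after admission.
HAI_OFFSET = timedelta(hours=48)


def hai_label(
    admission: datetime,
    discharge: datetime,
    infection_times: List[datetime],
) -> Optional[Tuple[datetime, bool]]:
    """Hypothetical helper: return (prediction_time, label) for one visit.

    Returns None when the visit ends before the 48-hour mark (an assumption).
    """
    prediction_time = admission + HAI_OFFSET
    if prediction_time >= discharge:
        return None
    # Positive if any infection is recorded after the prediction time
    # and no later than discharge.
    label = any(prediction_time < t <= discharge for t in infection_times)
    return prediction_time, label


# Made-up visit: admitted Jan 1, discharged Jan 10, infection recorded Jan 5.
admission = datetime(2024, 1, 1, 8, 0)
discharge = datetime(2024, 1, 10, 12, 0)
print(hai_label(admission, discharge, [datetime(2024, 1, 5, 9, 30)]))
```

The point of the sketch is that the label time is no longer an event time that already exists in the patient timeline but a derived time (admission plus 48 hours).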

Both labelers work from a technical point of view: they run and produce legitimate-looking label files. But once I get to the evaluation portion of the script (7_eval), I run into the following issue:


Traceback (most recent call last):
  File "/mnt/data/inbox/Martijn/Scripts Martijn/ehrshot/ehrshot-benchmark/ehrshot/bash_scripts/../7_eval.py", line 212, in <module>
    patient_ids, label_values, label_times, feature_matrixes = get_labels_and_features(labeled_patients, PATH_TO_FEATURES_DIR)
                                                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/data/inbox/Martijn/Scripts Martijn/ehrshot/ehrshot-benchmark/ehrshot/utils.py", line 337, in get_labels_and_features
    join_indices = compute_feature_label_alignment(label_patient_ids,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/data/inbox/Martijn/Scripts Martijn/ehrshot/ehrshot-benchmark/ehrshot/utils.py", line 283, in compute_feature_label_alignment
    raise RuntimeError(f"Could not find match for {label_pids[i]} {label_dates[i]}, closest is {feature_pids[j]} {feature_dates[j]}")
RuntimeError: Could not find match for 115967095 1223596560000000, closest is 115967095 1223580540000000
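
To make the two numbers in the error message easier to read, the following sketch decodes them. It assumes the values are microseconds since the Unix epoch in UTC, which is a guess based on their magnitude; the helper `from_micros` is a name introduced only for this example.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_micros(us: int) -> datetime:
    """Convert microseconds since the Unix epoch to an aware UTC datetime."""
    return EPOCH + timedelta(microseconds=us)


label_time = from_micros(1223596560000000)    # time from the label file
feature_time = from_micros(1223580540000000)  # closest time in the features

print(label_time.isoformat())     # 2008-10-09T23:56:00+00:00
print(feature_time.isoformat())   # 2008-10-09T19:29:00+00:00
print(label_time - feature_time)  # 4:27:00
```

Under that assumption, both times belong to the same patient on the same day, and the label time is 4 hours 27 minutes later than the closest feature time.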

Apparently there is a mismatch between my labels and features. How can I solve this? Does my labeler have an issue?
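
One way to see how widespread the mismatch is would be to list every label that has no feature row with the same patient ID and time. The sketch below is a standalone check, not the ehrshot implementation of `compute_feature_label_alignment`; the function `find_unmatched_labels` is hypothetical, and it assumes that alignment requires an exact match on the (patient ID, time) pair, as the error message suggests.

```python
from typing import List, Sequence, Tuple

# (patient_id, time); times must use the same integer unit on both sides.
Pair = Tuple[int, int]


def find_unmatched_labels(
    label_pids: Sequence[int],
    label_times: Sequence[int],
    feature_pids: Sequence[int],
    feature_times: Sequence[int],
) -> List[Pair]:
    """Return the (patient_id, time) label pairs with no identical feature pair."""
    feature_keys = set(zip(feature_pids, feature_times))
    return [pair for pair in zip(label_pids, label_times) if pair not in feature_keys]


# The pair from the error message, checked against the closest feature row.
unmatched = find_unmatched_labels(
    [115967095], [1223596560000000],
    [115967095], [1223580540000000],
)
print(unmatched)  # [(115967095, 1223596560000000)]
```

If nearly every label ends up in this list, the features were probably not generated for these label times at all; if only a few do, the problem is more likely limited to specific visits.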

I would appreciate some thoughts!
HAI_Labeler.txt
HAI_WithinVisitLabeler.txt

Best
