Refactor PHI deidentifier DeidentificationStep schema #168

Open
tschaffter opened this issue Feb 12, 2021 · 0 comments
Labels
Enhancement New feature or request Priority: Low

Is your proposal related to a problem?

There are several issues with the current design of DeidentificationStep:

Issue 1. Currently the user can specify multiple masking strategies for a single step, but only one should be allowed.
Issue 2. The user can specify "dateOffsetConfig" for non-date annotations.
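
To make the two constraints concrete, here is a minimal sketch of the checks an updated schema would need to enforce. Only "dateOffsetConfig" comes from this issue; the other strategy keys, the "annotationTypes" field, and the "text_date" type name are assumptions for illustration and may not match the actual schema.

```python
from typing import List

# Hypothetical field names: only "dateOffsetConfig" is mentioned in this issue.
STRATEGY_KEYS = {
    "maskingCharConfig",
    "annotationTypeMaskConfig",
    "redactConfig",
    "dateOffsetConfig",
}
# Assumed name for the date annotation type.
DATE_ANNOTATION_TYPES = {"text_date"}


def validate_step(step: dict) -> List[str]:
    """Return the problems found in a deidentification step (empty if valid)."""
    errors = []

    # Issue 1: exactly one masking strategy per step.
    chosen = sorted(STRATEGY_KEYS.intersection(step))
    if len(chosen) != 1:
        errors.append(
            f"expected exactly one masking strategy, got {len(chosen)}: {chosen}"
        )

    # Issue 2: dateOffsetConfig only makes sense for date annotations.
    if "dateOffsetConfig" in step:
        types = set(step.get("annotationTypes", []))
        non_date = sorted(types - DATE_ANNOTATION_TYPES)
        if non_date:
            errors.append(
                f"dateOffsetConfig used with non-date annotation types: {non_date}"
            )

    return errors
```

In an OpenAPI schema the same rules would more likely be expressed declaratively (for example with `oneOf`) than with runtime checks like these.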

Additional point to review:

The current design allows specifying a single confidence threshold shared by multiple annotators, but it is flexible enough that it could also be used to specify a different threshold for each annotator.

Even though we ask annotators to output a confidence value between 0 and 100, not all annotators use this full range. This has been observed in DREAM Challenges, where one method's confidence values may cluster around 30 (arbitrary example) while another method's are distributed differently. Therefore, a user of the PHI Deidentifier API would need to know how a given annotator's confidence values are distributed in order to choose a meaningful threshold for it.
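
As a rough illustration of per-annotator thresholds, the sketch below filters annotations using a threshold looked up by annotator name. The annotation shape (`annotator` and `confidence` keys) and the function name are hypothetical and not taken from the existing API.

```python
from typing import Dict, List, Optional


def filter_by_threshold(
    annotations: List[dict],
    thresholds: Dict[str, float],
    default: Optional[float] = None,
) -> List[dict]:
    """Keep annotations whose confidence meets their annotator's threshold.

    Annotators missing from `thresholds` fall back to `default`;
    if `default` is None, their annotations are kept unfiltered.
    """
    kept = []
    for ann in annotations:
        threshold = thresholds.get(ann["annotator"], default)
        if threshold is None or ann["confidence"] >= threshold:
            kept.append(ann)
    return kept
```

With this shape, an annotator whose confidence values cluster around 30 could be given a threshold of, say, 25, while another annotator keeps a threshold of 80, so the same confidence value is treated differently depending on its source.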

Describe the solution you'd like

Come up with a schema update that fixes Issue 1 and Issue 2.

tschaffter added the Enhancement and Priority: Low labels on Feb 12, 2021