Include DateTime in TextDateAnnotation #120

Open
boyleconnor opened this issue Dec 8, 2020 · 4 comments
@boyleconnor
Contributor

boyleconnor commented Dec 8, 2020

Suppose I want to de-identify this note: "The patient came in 11 Novemb". A particularly good date-annotator would correctly produce the annotations [{start: 20, length: 2, text: "11", dateFormat: "DD"}, {start: 23, length: 6, text: "Novemb"}].

If someone or something (e.g. our phi-deidentifier) wanted to use this annotation to apply a date offset to the note, this would be very difficult to do. The de-identifier has to somehow map "Novemb" to an abstract representation like {month: 11}, which is work that the date-annotator probably already did, but just didn't share in its output.
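
To illustrate the extra work described above, here is a minimal sketch of what a de-identifier might have to do on its own to turn "Novemb" into a month number. The helper name `month_from_fragment` and the prefix-matching rule are assumptions made for illustration; they are not part of the date-annotator or the phi-deidentifier.

```python
from typing import Optional

# English month names; a hard-coded list avoids locale-dependent lookups.
MONTHS = (
    "january", "february", "march", "april", "may", "june",
    "july", "august", "september", "october", "november", "december",
)


def month_from_fragment(fragment: str) -> Optional[int]:
    """Return the month number (1-12) if `fragment` is an unambiguous
    prefix of an English month name, otherwise None."""
    needle = fragment.strip().lower()
    if not needle:
        return None
    matches = [i + 1 for i, name in enumerate(MONTHS) if name.startswith(needle)]
    # Ambiguous prefixes such as "Ju" (June, July) are rejected.
    return matches[0] if len(matches) == 1 else None


note = "The patient came in 11 Novemb"
month = month_from_fragment(note[23:29])  # the slice is "Novemb"
```

Even this small heuristic has to decide how to treat ambiguous or misspelled fragments, which is the kind of logic the annotator has presumably already implemented.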

What if we add something new to the TextDateAnnotation schema that can contain structured data about the date represented in the annotation? E.g. add a (new) property like:

date:
  type: object
  properties:
    year:
      type: integer
    month:
      type: integer
    day:
      type: integer
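
As a rough sketch of how annotations carrying the proposed property could look in client code, the example below mirrors the schema with Python dataclasses. The class name `DateParts` and the choice to make every field optional are assumptions for illustration; only `start`, `length`, `text`, `dateFormat` and `date` come from this discussion.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class DateParts:
    """Structured date; each part is optional because a span may only
    cover a day or a month (assumption, not part of the current schema)."""
    year: Optional[int] = None
    month: Optional[int] = None
    day: Optional[int] = None


@dataclass
class TextDateAnnotation:
    start: int
    length: int
    text: str
    dateFormat: Optional[str] = None
    date: Optional[DateParts] = None


note = "The patient came in 11 Novemb"
annotations = [
    TextDateAnnotation(start=20, length=2, text="11", dateFormat="DD",
                       date=DateParts(day=11)),
    TextDateAnnotation(start=23, length=6, text="Novemb",
                       date=DateParts(month=11)),
]

# Each annotation's offsets should point back at its own text in the note.
spans_ok = all(note[a.start:a.start + a.length] == a.text for a in annotations)
payload = [asdict(a) for a in annotations]
```

With this shape, a consumer can read `date.month` directly instead of re-parsing the `text` field.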
@thomasyu888
Member

+1. This is interesting. In the text above, would an offset look like:

  • "The patient came in 15 12"
  • "The patient came in 15 December"
  • "The patient came in XX XXXXXX"

I guess all of those are valid responses, right? Would the object replace the "dateFormat" variable? My initial thinking is that the "date" object would be neat if it could recognize "11 Novemb" as an entire string, which would make the date object:

date = {month: 11, day: 11}

That would probably make the de-id clearer than just doing "11 12".
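
A minimal sketch of how a de-identifier could produce the three renderings listed above from a structured date. The 34-day offset and the year 2020 are assumptions chosen only so that the result lands on 15 December; the note itself carries no year, and the helper names are hypothetical.

```python
from datetime import date, timedelta

MONTH_NAMES = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)


def shift_date(parts: dict, offset_days: int, assumed_year: int) -> date:
    """Shift a structured date by `offset_days`.
    `assumed_year` is used only when `parts` has no year (assumption)."""
    original = date(parts.get("year", assumed_year), parts["month"], parts["day"])
    return original + timedelta(days=offset_days)


def mask(text: str) -> str:
    """Replace every non-space character with 'X', keeping the layout."""
    return "".join(c if c.isspace() else "X" for c in text)


shifted = shift_date({"month": 11, "day": 11}, offset_days=34, assumed_year=2020)

as_numbers = f"{shifted.day} {shifted.month}"                 # "15 12"
as_name = f"{shifted.day} {MONTH_NAMES[shifted.month - 1]}"   # "15 December"
masked = mask("11 Novemb")                                    # "XX XXXXXX"
```

Which rendering to emit is a policy decision for the de-identifier; the structured date only makes the shifted variants straightforward to compute.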

@tschaffter
Member

tschaffter commented Dec 8, 2020

The discussion today concluded that there is a benefit to having both the dateFormat and a new property named date, which should follow an existing standard format.

EDIT: For the record, we are keeping dateFormat because it includes information about the formatting of the PHI. This information can be useful for generating de-identified notes that look like the original ones. One application is the generation of de-identified clinical notes that can be used to train annotation tools.

@cascadianblue Can you 1) identify a suitable schema for date on https://schema.org/, 2) create an OpenAPI version of it and 3) create a PR that adds the property TextDateAnnotation.date?
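
For context on what a standard format would imply: schema.org's Date type is an ISO 8601 date string. The sketch below, which uses only the Python standard library, shows that a complete date round-trips through such a string while a partial mention like "11 Novemb" does not. The helper name `parse_iso_date` is hypothetical, and whether the new property should be a string or an object is still open in this thread.

```python
from datetime import date
from typing import Optional


def parse_iso_date(value: str) -> Optional[date]:
    """Return a date for a full ISO 8601 calendar date (YYYY-MM-DD),
    or None when the string cannot be parsed."""
    try:
        return date.fromisoformat(value)
    except ValueError:
        return None


full = parse_iso_date("2020-11-11")   # a complete date parses
partial = parse_iso_date("11 Novemb")  # raw text from the note does not
```

A date with no year, as in the example note, has no complete YYYY-MM-DD form, so the chosen schema would need some way to represent missing parts.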

@tschaffter
Member

@cascadianblue Do you have an update regarding this ticket?

@tschaffter
Member

@cascadianblue Do you have an update regarding this ticket?


3 participants