Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding Category data to Conditions table #88

Conversation

robertandremitchell
Copy link
Collaborator

@robertandremitchell robertandremitchell commented Oct 29, 2024

PULL REQUEST

Summary

This adds a new column to the conditions table, category. This is an organizational structure that we need to organize our conditions to display like so: https://www.figma.com/design/Iuw9me6kYftBF4WTCEsCZz/Query-Connector?node-id=475-18708&node-type=frame&t=Lcg6TdahnUvYPBsR-0

These are the categories we would end up with:
Screenshot 2024-10-29 at 1 47 36 PM

the three missing are ones we added. Here are my thoughts on how to handle:

  • Social Determinants of Health, which I believe we can remove from the DB given convo today
  • Cancer (Leukemia), which, probably fairly reasonably, could be coded to coded to be included in the Cancer category.
  • Newborn Screening, which could be Birth Defects and Infant Disorders? I don't see an immediately obvious one we would nest Newborn Screening under, otherwise. I can see an argument for it being standalone.

a few backend considerations: right now, we are just using INSERT statements to create the temporary category_data table that needs to be merged with conditions.

Other ideas:

  • this data in theory is only necessary for the display of data on the custom query page. We could just have a small script that transforms this into a JSON we reference to functionally display this data independent of the db.
  • we could read the data from an S3 bucket. This is probably a lot more work, but if we were uncomfortable committing this file as a standalone file to the repo, or if it introduces Terraform complications, that might provide some balance.

Related Issue

Fixes #68

Acceptance Criteria

Please copy the acceptance criteria from your ticket and paste it here for your reviewer(s)

For frontend PR’s - include a description (including anything that’s out of scope) for what you want the designers to review

  • ex: “Check out the whitespace on the page, as well as the dropdowns in the form. Here is the corresponding Figma link. You can ignore everything else. Also, the button at the bottom doesn’t work now”

Additional Information

Anything else the review team should know?

Checklist

  • If this code affects the other scrum team, have they been notified? (In Slack, as reviewers, etc.)

@robertandremitchell robertandremitchell linked an issue Oct 29, 2024 that may be closed by this pull request
Copy link
Collaborator

@bamader bamader left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine to me as is, though I would probably be in the camp of just using INSERT statements to make that temporary table. I have no problem exposing this file publicly, but we probably aren't looking to do so. Mounting and S3 both seem like overkill to me, but that's more a stylistic and preference thing, I think--I would rather our DB product not establish the pattern of bringing in externally mounted volumes. I think the seeding/creation should be as self-contained as possible. I don't feel strongly either way, and I don't think anything here is blocking though!

@robertandremitchell
Copy link
Collaborator Author

Looks fine to me as is, though I would probably be in the camp of just using INSERT statements to make that temporary table. I have no problem exposing this file publicly, but we probably aren't looking to do so. Mounting and S3 both seem like overkill to me, but that's more a stylistic and preference thing, I think--I would rather our DB product not establish the pattern of bringing in externally mounted volumes. I think the seeding/creation should be as self-contained as possible. I don't feel strongly either way, and I don't think anything here is blocking though!

I'm torn, I think I lean toward a bunch of INSERT statements also being fine. I think I'd probably want to add a script that can parse this CSV to write those INSERTs each time a new eRSD drops for some future-proofing (or, preferably, convince APHL to surface this data in the eRSD...).

happy to look into that option if we think it's cleaner.

@DanPaseltiner
Copy link
Collaborator

@bamader definitely agree that ultimately all database creation should be handled in a single place (the work you and @m-goggins are doing). I totally forgot about this when reviewing. I'd propose that we move this along so that @robertandremitchell and @fzhao99 aren't blocked on their query building work, but eventually have this along with all other seeding handled at the app layer.

@bamader
Copy link
Collaborator

bamader commented Oct 29, 2024

@DanPaseltiner That sounds good to me--approving with the idea that we revisit the conversation during broader DB creation!

@robertandremitchell
Copy link
Collaborator Author

@bamader definitely agree that ultimately all database creation should be handled in a single place (the work you and @m-goggins are doing). I totally forgot about this when reviewing. I'd propose that we move this along so that @robertandremitchell and @fzhao99 aren't blocked on their query building work, but eventually have this along with all other seeding handled at the app layer.

Makes sense to me. I've created a new issue in the backlog so that we don't lose track of that work (https://linear.app/skylight-cdc/issue/QUE-43/refactor-conditionscategory-migration), but I will work on getting this merged shortly (darn CI tests).

Thank you both for your review!

Copy link
Collaborator

@fzhao99 fzhao99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Agreed that we should probably avoid the mounting / external host route given the (relatively) static nature of this data. Also lean towards just doing it the brute force INSERT way, but obviously if we could get it through an API call that'd be best

@robertandremitchell
Copy link
Collaborator Author

requesting a re-review since switching to INSERT over CSV. everything else should be the same.

@robertandremitchell robertandremitchell merged commit e2c9aa6 into main Oct 31, 2024
5 checks passed
@robertandremitchell robertandremitchell deleted the rob/68-determine-how-to-store-category-information-for-conditions branch October 31, 2024 18:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Determine how to store category information for conditions
4 participants