QC quick hits #311

Closed
5 tasks
dfsnow opened this issue Jan 2, 2025 · 0 comments · Fixed by #325

dfsnow commented Jan 2, 2025

Some quick QC tasks based on the model outputs from earlier this week:

  • The RMSE of the linear model on the test set is suspiciously high, particularly in Barrington. I'm guessing there are a few non-market/erroneous sales hiding out in Barrington.
  • Some PINs worth checking out from my fast QC earlier this week:
    • 11181100020000 - Seems like a gut reno in Evanston
    • 12133010680000 - Sqft too low?
    • 14322140240000 - High $/sqft
    • 05211000140000 - High $/sqft
    • 20142150160000 - High $/sqft
  • Let's also check all residential PINs in ccao.pin_weird
  • Do a quick histogram/scatterplot of predictions for each town, looking for enormous outliers
  • Create a quick histogram of the distribution of continuous features in the training data, again looking for outliers
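
A rough sketch of how the per-town RMSE check and the outlier hunt could be scripted, in case it helps whoever picks this up. It is not taken from the model pipeline: the column names (`township`, `sale_price`, `pred`), the helper names, and the 3.5 cutoff are all assumptions, and the outlier rule (distance from the median in scaled MADs) is just one reasonable choice.

```python
import math
from collections import defaultdict
from statistics import median


def rmse_by_group(rows, group_key="township", actual_key="sale_price", pred_key="pred"):
    """Return {group: RMSE of prediction vs. actual} for a list of dict rows.

    The key names are hypothetical placeholders, not the pipeline's real columns.
    """
    squared_errors = defaultdict(list)
    for row in rows:
        squared_errors[row[group_key]].append((row[pred_key] - row[actual_key]) ** 2)
    return {g: math.sqrt(sum(v) / len(v)) for g, v in squared_errors.items()}


def flag_outliers(values, k=3.5):
    """Return indices of values more than k scaled MADs away from the median.

    The median absolute deviation (MAD) is used instead of the standard
    deviation so that the outliers themselves do not inflate the threshold.
    """
    center = median(values)
    mad = median([abs(v - center) for v in values])
    if mad == 0:
        return []  # no spread to measure against, so flag nothing
    scale = 1.4826 * mad  # makes the MAD comparable to a std dev for normal data
    return [i for i, v in enumerate(values) if abs(v - center) > k * scale]


# Made-up rows for illustration only; the $100 sale stands in for a non-market sale.
example_rows = [
    {"township": "Barrington", "sale_price": 500_000, "pred": 510_000},
    {"township": "Barrington", "sale_price": 400_000, "pred": 390_000},
    {"township": "Barrington", "sale_price": 100, "pred": 450_000},
    {"township": "Evanston", "sale_price": 300_000, "pred": 310_000},
    {"township": "Evanston", "sale_price": 350_000, "pred": 340_000},
]
example_rmse = rmse_by_group(example_rows)

# Made-up $/sqft values; the last one plays the role of a "High $/sqft" PIN.
example_price_per_sqft = [200, 210, 190, 205, 195, 1500]
example_flags = flag_outliers(example_price_per_sqft)
```

Sorting towns by `example_rmse` would surface a Barrington-style problem quickly, and running `flag_outliers` over each continuous feature (or over predictions within a town) gives a shortlist of PINs to eyeball before plotting the histograms.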
3 participants