Test error weighting schemes for most recent year of sales #297

Open
dfsnow opened this issue Dec 19, 2024 · 0 comments
Assignees
Labels
method ML technique or method change

Comments

@dfsnow
Member

dfsnow commented Dec 19, 2024

The most recent year of sales data (in the current case, 2024) has two possible problems:

  1. The 2024 sales sample is incomplete because of lagged reporting to the state via IDOR
  2. The 2024 sales sample may not be representative of the full sample of sales, e.g. certain types of properties may be reported more quickly or completely than others

Per @Douglasmsw, we can adjust for these issues with error weighting:

  1. Upweight the most recent sales such that the total number is closer to what we'd expect if we had all the data
  2. Perform inverse probability weighting (IPW) such that 2024 sales end up with the same characteristic distributions as 2023 sales. The IPW weight here would be something like $1 / P(2023\_sample | sales\_chars)$, where the probability comes from a logit or probit specified as: sale_from_2024 (binary) ~ sales_chars, trained on 2023 and 2024 sales. We also need to apply an adjustment term here to ensure that the weights are between 0 and 1
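
A rough sketch of both schemes, in plain Python on synthetic data, may help when comparing options. Everything here is an assumption made for illustration: the single standardized characteristic, the helper names (`fit_logit`, `ipw_weights`, `count_scale`), the gradient-descent logit, and the sample sizes. The issue only gives the weight as "something like" an inverse probability, so this sketch uses the odds form $(1 - p) / p$ with $p = P(sale\_from\_2024 | sales\_chars)$, which is one common way to make one group resemble another, and it divides by the largest weight as one possible adjustment to keep weights between 0 and 1.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(beta, xi):
    """P(sale_from_2024 = 1 | characteristics) under the fitted logit."""
    return sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], xi)))


def fit_logit(X, y, lr=0.5, epochs=500):
    """Fit a logit by batch gradient descent; returns [intercept, coefs...]."""
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            err = predict(beta, xi) - yi
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        beta = [b - lr * g / n for b, g in zip(beta, grad)]
    return beta


def ipw_weights(beta, X_2024):
    """Odds weights (1 - p) / p for 2024 sales, rescaled into (0, 1].

    Dividing by the maximum is an assumed adjustment, not the issue's.
    """
    raw = []
    for xi in X_2024:
        p = predict(beta, xi)
        raw.append((1.0 - p) / p)
    top = max(raw)
    return [w / top for w in raw]


def count_scale(n_observed, n_expected):
    """Scheme 1: a uniform factor that scales observed sales up to the expected count."""
    return n_expected / n_observed


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Synthetic data (hypothetical): one standardized characteristic, with
# 2024 sales skewed toward larger values than 2023 sales.
random.seed(0)
x_2023 = [random.gauss(0.0, 1.0) for _ in range(300)]
x_2024 = [random.gauss(1.0, 1.0) for _ in range(300)]

X = [[v] for v in x_2023 + x_2024]
y = [0] * len(x_2023) + [1] * len(x_2024)  # sale_from_2024 indicator

beta = fit_logit(X, y)
weights = ipw_weights(beta, [[v] for v in x_2024])

mean_2023 = sum(x_2023) / len(x_2023)
raw_gap = abs(sum(x_2024) / len(x_2024) - mean_2023)
weighted_gap = abs(weighted_mean(x_2024, weights) - mean_2023)

print("gap between 2024 and 2023 means, unweighted:", raw_gap)
print("gap between 2024 and 2023 means, weighted:  ", weighted_gap)
print("count scale for 800 observed vs 1000 expected:", count_scale(800, 1000))
```

How the two schemes should be combined is not settled here: the count-based factor pushes weights above 1 while the rescaled IPW weights stay at or below 1, so the order and normalization of the two steps would need to be part of the test.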
@dfsnow dfsnow self-assigned this Dec 19, 2024
@dfsnow dfsnow added the method ML technique or method change label Dec 19, 2024