Test error weighting schemes for most recent year of sales #297

dfsnow · 2024-12-19T20:09:58Z

The most recent year of sales data (in the current case, 2024) has two possible problems:

The 2024 sales sample is incomplete because of lagged reporting to the state via IDOR
The 2024 sales sample may not be representative of the full sample of sales, e.g. certain types of properties may be reported more quickly or completely than others

Per @Douglasmsw, we can adjust for these issues with error weighting:

Upweight the most recent sales such that the total number is closer to what we'd expect if we had all the data
Perform inverse probability weighting (IPW) such that 2024 sales end up with the same characteristic distributions as 2023. IPW here would be something like $1 / P(2023\_sample | sales\_chars)$, where the latter term is the result of a logit or probit specced as: sale_from_2024 (binary) ~ sales_chars using 2023 and 2024 sales for training, so that $P(2023\_sample | sales\_chars) = 1 - P(sale\_from\_2024 | sales\_chars)$. We also need to apply an adjustment term here to ensure that weights are between 0 and 1
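
A rough sketch of both ideas on toy data, in plain Python so it runs without dependencies. Everything here is an assumption for illustration: the single `is_condo` characteristic, the sample sizes, the helper names (`fit_logit`, `odds_weights`, `rescale_unit`, `count_upweight`), and the expected 2024 sale count. Note that the weight below uses the odds form $(1 - p) / p$ with $p = P(sale\_from\_2024 | sales\_chars)$ rather than the literal $1 / P(\cdot)$ above, since the odds form is the usual density-ratio weight for making one sample resemble another. Dividing by the maximum weight is just one possible adjustment to keep weights in (0, 1].

```python
import math


def predict_one(beta, xi):
    """Logistic probability for one row; beta = [intercept, *coefs]."""
    z = beta[0] + sum(b * x for b, x in zip(beta[1:], xi))
    z = max(-30.0, min(30.0, z))  # clip to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def fit_logit(X, y, lr=1.0, n_iter=2000):
    """Fit sale_from_2024 ~ sales_chars by batch gradient descent."""
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(n_iter):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            err = predict_one(beta, xi) - yi
            grad[0] += err
            for j in range(k):
                grad[j + 1] += err * xi[j]
        beta = [b - lr * g / n for b, g in zip(beta, grad)]
    return beta


def odds_weights(p_2024, eps=1e-6):
    """Weight each 2024 sale by its odds of looking like a 2023 sale."""
    clipped = [min(max(p, eps), 1.0 - eps) for p in p_2024]
    return [(1.0 - p) / p for p in clipped]


def rescale_unit(weights):
    """One possible adjustment: divide by the max so weights lie in (0, 1]."""
    top = max(weights)
    return [w / top for w in weights]


def count_upweight(n_observed, n_expected):
    """Scalar factor for the first idea: inflate 2024 to its expected count."""
    return n_expected / n_observed


# Toy data (hypothetical): condos are 30% of 2023 sales but 60% of the
# 2024 sales reported so far, e.g. because they are reported faster.
chars_2023 = [[1.0]] * 30 + [[0.0]] * 70
chars_2024 = [[1.0]] * 30 + [[0.0]] * 20

X = chars_2023 + chars_2024
y = [0] * len(chars_2023) + [1] * len(chars_2024)  # sale_from_2024

beta = fit_logit(X, y)
p_2024 = [predict_one(beta, xi) for xi in chars_2024]
weights_2024 = rescale_unit(odds_weights(p_2024))

# Weighted condo share among 2024 sales, to compare against 2023's 30%.
weighted_condo_share = sum(
    w * xi[0] for w, xi in zip(weights_2024, chars_2024)
) / sum(weights_2024)
print(round(weighted_condo_share, 3))

# Separate scalar factor, assuming (hypothetically) 100 sales are expected.
print(count_upweight(len(chars_2024), 100))
```

On this toy data the reweighted 2024 condo share should land close to the 2023 share. The count factor is kept separate because multiplying it in would push weights above 1, so how the two adjustments combine is still an open design question.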


dfsnow self-assigned this Dec 19, 2024

dfsnow added the method ML technique or method change label Dec 19, 2024
