There are records in the MSHA mines and EIA plants data that are missing latitude and longitude coordinates. Currently, these records are being excluded. Instead, try to impute the Census tract and county from the other locational data given for these records.
IMHO we're dropping lat/lon values too aggressively in the main PUDL ETL right now, and we should make this fix as far upstream as possible. IIRC, we're basically treating the floating point values as strings and declaring them inconsistent if the digits aren't identical, which is not good.
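To illustrate the difference, here is a minimal sketch of a tolerance-based consistency check. It is not the current PUDL code; the function name `coords_consistent` and the `abs_tol` threshold of 1e-4 degrees are assumptions made up for this example.

```python
import math


def coords_consistent(values, abs_tol=1e-4):
    """Return True if all reported values agree to within abs_tol degrees.

    Hypothetical helper: values may be floats or numeric strings, and
    None / NaN entries are ignored rather than treated as conflicts.
    """
    vals = []
    for v in values:
        if v is None:
            continue
        f = float(v)
        if not math.isnan(f):
            vals.append(f)
    if not vals:
        return False  # nothing was reported at all
    # Compare numerically instead of requiring identical digit strings.
    return max(vals) - min(vals) <= abs_tol


# "45.10" and "45.1" differ as strings but are the same coordinate.
print(coords_consistent(["45.10", "45.1", 45.1000001]))
print(coords_consistent([45.1, 46.1]))
```

With a check like this, values that differ only in formatting or trailing precision would be kept, and only real disagreements would be flagged.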
Ideally we would use the haversine distance between all the different (lon, lat) points to estimate an "actual" location, identify any totally crazy outliers, and assign the actual location to them. Rectilinear coordinates are probably good enough though, and much simpler: all the points should be very close to each other, and if they're not, it'll be obvious in either spherical or Euclidean coords.
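A rough sketch of that idea, using only the standard library. The names `haversine_km` and `consolidate`, the use of the per-axis median as the "actual" location, and the 5 km outlier threshold are all assumptions for illustration, not anything PUDL does today. The median of longitudes would also misbehave near the antimeridian, so this assumes all points sit well away from ±180°.

```python
import math
from statistics import median

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two (lon, lat) points in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (lon1, lat1, lon2, lat2))
    a = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def consolidate(points, max_km=5.0):
    """Estimate one location from several reports and list the outliers.

    points: list of (lon, lat) tuples reported for the same entity.
    Returns (center, outliers), where center is the per-axis median and
    outliers are the points farther than max_km from that center.
    """
    center = (median(p[0] for p in points), median(p[1] for p in points))
    outliers = [p for p in points if haversine_km(*p, *center) > max_km]
    return center, outliers


reports = [(-105.0, 40.0), (-105.001, 40.001), (-105.0005, 39.9995), (-95.0, 30.0)]
center, outliers = consolidate(reports)
print(center, outliers)
```

The median is used here instead of the mean so that a single wild value does not drag the estimated location away from the cluster. Swapping `haversine_km` for a plain Euclidean distance in degrees would flag the same outlier in this example.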
We could also convert (lon, lat) into a geopoint / tuple stored in a single column.