Time zone offset issue in data processing pipeline? #234

stephen-frank · 2022-11-22T18:14:51Z

I just completed an end-to-end deployment test with @JanghyunJK:

1. Exported data from SkySpark
2. Trained a model
3. Received the model back
4. Deployed the model in SkySpark
5. Called prediction against the deployed model
6. Synced predictions to SkySpark points

Ok, so first off, this deployment worked great (yay!). But...

The issue is that the prediction seems to have a time zone offset. It isn't a simple shift in the output, though, because then you would expect things to line up just by shifting left. Instead, it seems that perhaps training assumed input timestamps in UTC and prediction assumed UTC-7, or vice versa? I'm surmising this because it appears some predictors are being correctly accounted for (decent fit at some times of day), but the time-of-day predictor is time-shifted (poor fit at other times of day when irradiance has little effect).
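To make the suspected mismatch concrete, here is a minimal sketch (standard library only, not Wattile code) of how a time-of-day feature moves when the same instant is read in UTC versus Denver time. The helper `hour_of_day` is hypothetical. Measured predictors such as irradiance stay attached to the correct instant, so only features derived from the timestamp would shift.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

DENVER = ZoneInfo("America/Denver")
UTC = ZoneInfo("UTC")


def hour_of_day(ts: datetime, tz: ZoneInfo) -> float:
    """Fractional hour of day of `ts` as seen in time zone `tz` (hypothetical helper)."""
    local = ts.astimezone(tz)
    return local.hour + local.minute / 60


# Noon in Denver on 2022-11-22, when MST (UTC-7) is in effect.
ts = datetime(2022, 11, 22, 12, 0, tzinfo=DENVER)

hour_local = hour_of_day(ts, DENVER)  # feature value if timestamps are read as Denver time
hour_utc = hour_of_day(ts, UTC)       # feature value if timestamps are read as UTC
offset_hours = (hour_utc - hour_local) % 24

print(hour_local, hour_utc, offset_hours)
```

If training derived the feature one way and prediction the other, the model would see a time of day that is off by the UTC offset (7 hours in MST, 6 in MDT), which would match a fit that is good only where other predictors dominate.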

I'm not entirely sure how to go about troubleshooting this. Things I do know already:

- Original predictor data are delivered in Denver time (MST/MDT) and timestamps are encoded as such
- Required start/end times as reported by the Wattile model appear to work correctly, as I can pass the requested window and get a single prediction timestamp out
- SkySpark converts SkySpark datetimes to Pandas datetimes complete with timestamp, and converts back the same way; timezones appear to be preserved properly in both directions as shown below.
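One quick way to confirm what time zone a frame carries at any stage is to inspect its index. The sketch below assumes the data arrive as a pandas DataFrame with a `DatetimeIndex`; `describe_index_tz` and the `irradiance` column are illustrative names, not part of SkySpark or Wattile.

```python
import pandas as pd


def describe_index_tz(df: pd.DataFrame) -> str:
    """Return the index time zone name, or "naive" if the index has none."""
    idx = df.index
    if not isinstance(idx, pd.DatetimeIndex):
        raise TypeError("expected a DatetimeIndex")
    return "naive" if idx.tz is None else str(idx.tz)


# Stand-in for data delivered in Denver time with the zone encoded.
idx = pd.date_range("2022-11-22 00:00", periods=4, freq="15min", tz="America/Denver")
df = pd.DataFrame({"irradiance": [0.0, 0.0, 0.0, 0.0]}, index=idx)

tz_name = describe_index_tz(df)                          # zone is preserved
naive_name = describe_index_tz(df.tz_localize(None))     # zone has been stripped

print(tz_name, naive_name)
```

Running a check like this right after export, before training, and before prediction would show at which step (if any) the zone is dropped or reinterpreted.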

JanghyunJK · 2022-11-22T18:52:48Z

> perhaps training assumed input timestamps in UTC and prediction assumed UTC-7

Yeah, I think this is it. This part is actually something I wasn't sure of when I first switched the training data to UTC.

stephen-frank · 2022-11-22T19:27:54Z

So some time zone tracking/conversion is needed on prediction too. Let me know if I can help debug.

stephen-frank · 2022-11-22T22:20:50Z

In today's meeting we decided that timezone standardization to UTC should happen within prep_for_rnn instead of (or in addition to) when loading training data from CSV.
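A rough sketch of what such a standardization step could look like, assuming prep_for_rnn receives a pandas DataFrame indexed by timestamp. The function `standardize_to_utc` and its `assumed_tz` parameter are hypothetical and are not the actual Wattile implementation; the point is only that training and prediction would pass through the same conversion.

```python
from typing import Optional

import pandas as pd


def standardize_to_utc(df: pd.DataFrame, assumed_tz: Optional[str] = None) -> pd.DataFrame:
    """Return a copy of `df` whose DatetimeIndex is expressed in UTC.

    Hypothetical helper: tz-aware indexes are converted; naive indexes are
    localized to `assumed_tz` first, or rejected if no zone is given.
    """
    if not isinstance(df.index, pd.DatetimeIndex):
        raise TypeError("expected a DatetimeIndex")
    if df.index.tz is None:
        if assumed_tz is None:
            raise ValueError("naive timestamps: pass assumed_tz explicitly")
        # Note: localizing naive wall-clock times can raise on DST transitions.
        df = df.tz_localize(assumed_tz)
    return df.tz_convert("UTC")


# Stand-in for predictor data delivered in Denver time.
idx = pd.date_range("2022-11-22 12:00", periods=2, freq="60min", tz="America/Denver")
raw = pd.DataFrame({"irradiance": [410.0, 395.0]}, index=idx)

utc = standardize_to_utc(raw)
print(utc.index[0])
```

Conversion changes only the representation, not the instant, so feature values derived after this step would be consistent between the training and prediction paths.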

stephen-frank · 2023-12-11T22:05:36Z

Probably duplicates #294 at this point.

stephen-frank added the bug and deploy labels Nov 22, 2022

stephen-frank assigned stephen-frank, JanghyunJK and haneslinger Nov 22, 2022
