Time zone offset issue in data processing pipeline? #234

Open
stephen-frank opened this issue Nov 22, 2022 · 4 comments
Labels
bug Something isn't working deploy

Comments

@stephen-frank
Member

I just completed an end-to-end deployment test with @JanghyunJK:

  1. Exported data from SkySpark
  2. Trained a model
  3. Received the model back
  4. Deployed the model in SkySpark
  5. Called prediction against the deployed model
  6. Synced predictions to SkySpark points

Ok, so first off, this deployment worked great (yay!). But...

The issue is that the predictions appear to have a time zone offset. It isn't a simple shift in the output, though; if it were, shifting the predictions left would make everything line up. Instead, it seems that training may have assumed input timestamps in UTC while prediction assumed UTC-7, or vice versa. I'm surmising this because some predictors appear to be accounted for correctly (decent fit at some times of day), while the time-of-day predictor is time-shifted (poor fit at other times of day, when irradiance has little effect).

export
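To illustrate the suspected mechanism, here is a minimal sketch (not taken from the Wattile code) of how a time-of-day feature changes when the same instants are read in Denver time versus UTC. The dates and variable names are made up for illustration; the only assumption is that a time-of-day predictor is derived from the hour of the timestamp.

```python
import pandas as pd

# Hypothetical hourly timestamps in Denver time (MST in late November, UTC-7).
local = pd.date_range(
    "2022-11-21 00:00",
    periods=24,
    freq=pd.Timedelta(hours=1),
    tz="America/Denver",
)

# The same instants expressed in UTC.
utc = local.tz_convert("UTC")

# A time-of-day feature computed from each representation.
hour_local = local.hour
hour_utc = utc.hour

# For the same instants, the feature differs by a constant 7 hours (mod 24).
offset_hours = (hour_utc - hour_local) % 24
print(sorted(set(offset_hours)))
```

If training derived the hour from one representation and prediction from the other, weather-driven predictors would still line up while the time-of-day predictor would be shifted, which would be consistent with the fit described above.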

I'm not entirely sure how to go about troubleshooting this. Things I do know already:

  1. Original predictor data are delivered in Denver time (MST/MDT) and timestamps are encoded as such
  2. Required start/end times as reported by the Wattile model appear to work correctly, as I can pass the requested window and get a single prediction timestamp out
  3. SkySpark converts SkySpark datetimes to Pandas datetimes complete with time zone, and converts back the same way; time zones appear to be preserved properly in both directions, as shown below.

image
image
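One possible way to troubleshoot is to record the time zone state of the timestamps at each stage (export, training input, prediction input, sync). The sketch below uses a hypothetical helper, `tz_report`, which is not part of Wattile or SkySpark; it only relies on standard pandas attributes.

```python
import pandas as pd


def tz_report(index: pd.DatetimeIndex, label: str) -> dict:
    """Summarize the time zone state of a DatetimeIndex (hypothetical debugging helper)."""
    first = index[0]
    aware = index.tz is not None
    return {
        "stage": label,
        "tz": str(index.tz) if aware else None,
        "first_as_given": first.isoformat(),
        # Only a tz-aware timestamp can be converted to UTC unambiguously.
        "first_utc": first.tz_convert("UTC").isoformat() if aware else None,
    }


# Made-up example: one timestamp at noon Denver time.
idx = pd.DatetimeIndex(["2022-11-21 12:00"]).tz_localize("America/Denver")

report_aware = tz_report(idx, "after export")
# Simulate a stage that silently drops the time zone.
report_naive = tz_report(idx.tz_localize(None), "after dropping tz")
```

Comparing the reports from the training path and the prediction path should show where a time zone is dropped or reinterpreted, if that is what is happening.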

@JanghyunJK
Contributor

JanghyunJK commented Nov 22, 2022

perhaps training assumed input timestamps in UTC and prediction assumed UTC-7

Yeah, I think this is it. This part is actually something I wasn't sure about when I first switched the training data to UTC.

@stephen-frank
Member Author

So some time zone tracking/conversion is needed on prediction too. Let me know if I can help debug.

@stephen-frank
Member Author

In today's meeting we decided that time zone standardization to UTC should happen within prep_for_rnn instead of (or in addition to) when loading training data from CSV.
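As a rough sketch of what that standardization step could look like, the helper below converts a DataFrame's DatetimeIndex to UTC. The name `to_utc_index` and the `assumed_tz` parameter are hypothetical and do not reflect the actual prep_for_rnn signature or implementation; the idea is only that the same step would run on both the training and prediction paths.

```python
import pandas as pd


def to_utc_index(df: pd.DataFrame, assumed_tz: str = "America/Denver") -> pd.DataFrame:
    """Return a copy of df whose DatetimeIndex is tz-aware UTC.

    Hypothetical helper, not part of Wattile.
    """
    out = df.copy()
    if out.index.tz is None:
        # Naive timestamps carry no zone; interpreting them as assumed_tz is
        # an assumption, and may raise at ambiguous DST transition times.
        out.index = out.index.tz_localize(assumed_tz)
    out.index = out.index.tz_convert("UTC")
    return out


# Made-up data: the same reading with a tz-aware and a naive index.
aware = pd.DataFrame(
    {"value": [1.0]},
    index=pd.DatetimeIndex(["2022-11-21 12:00"]).tz_localize("America/Denver"),
)
naive = pd.DataFrame(
    {"value": [1.0]},
    index=pd.DatetimeIndex(["2022-11-21 12:00"]),
)

aware_utc = to_utc_index(aware)
naive_utc = to_utc_index(naive)
```

Running the same conversion before feature generation in both training and prediction would mean any time-of-day feature is computed from a consistent reference, whichever time zone the input arrives in.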

@stephen-frank
Member Author

Probably duplicates #294 at this point.


3 participants