Tools for Generating Synthetic Irradiance Timeseries #2422

Open
jranalli opened this issue Mar 27, 2025 · 7 comments
Comments

@jranalli
Contributor

jranalli commented Mar 27, 2025

In my solarspatialtools library, I implemented Lave et al.'s synthetic irradiance approach, which is based on stacked multi-scale random cloud fields.

I wanted to offer to contribute it over here if people would be interested, because my philosophy for that project is to rely on pvlib where possible since it's much more mature and widely adopted, and only to house things that are out of pvlib's scope. In this case, it might be relevant to expansion of the scaling package in pvlib, but no offense taken if this is too far afield. This code is pretty involved (600+ lines), so it would be a big review, but is relatively self-contained.

@mikofski
Member

+1 if it can scale monthly to hourly. What inputs does it require? Any sense of expected downsides? E.g., if used to scale hourly to 5-minute, what differences versus ground measurements should I be concerned about?

@jranalli
Contributor Author

jranalli commented Mar 27, 2025

This isn't my method, so I don't want to misrepresent, but here's my take.

It's fully synthetic, and more geared for generating high frequency data than monthly->hourly. The most direct application is creating spatially distinct timeseries based on a single high frequency source (e.g. taking an irradiance sensor measurement and building out to give you an idea of what the statistically similar time series would look like in spatial variability across a plant). It could also be used for temporal downscaling, but again with a focus on high frequency rather than low.

Inputs are:

  • magnitudes of the wavelet modes
  • the size of the field you'd like to simulate
  • the fraction of clearsky
  • some statistics of clearsky index that you'd like to be reflected in the generated time series (mean, minimum, maximum).

So it works best when you have a time series on which to base the wavelet modes, but if you had your own ideas about what those should be you could specify them manually. Applying it spatially requires that you also know a cloud motion vector that describes how the temporal and spatial transport are related.
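To illustrate the last point, here is a minimal sketch of how a cloud motion vector can link space and time, assuming a "frozen" cloud field that simply advects at constant velocity. This is not the solarspatialtools implementation; the function name `advection_delay` and its arguments are hypothetical.

```python
import numpy as np


def advection_delay(offsets_m, cmv_ms):
    """Time lag (s) at which each point sees the reference signal.

    offsets_m: (N, 2) positions relative to the reference sensor, in meters.
    cmv_ms: (2,) cloud motion vector in m/s.
    Both names are hypothetical, chosen for this sketch only.
    """
    offsets_m = np.asarray(offsets_m, dtype=float)
    cmv_ms = np.asarray(cmv_ms, dtype=float)
    speed_sq = cmv_ms @ cmv_ms
    if speed_sq == 0:
        raise ValueError("cloud motion vector must be nonzero")
    # Project each offset onto the motion direction, then divide by speed.
    # Points upwind of the sensor get a negative lag.
    return offsets_m @ cmv_ms / speed_sq


# Three points around a sensor, clouds moving east at 10 m/s.
delays = advection_delay([[100.0, 0.0], [0.0, 50.0], [-200.0, 0.0]], [10.0, 0.0])
```

Under this assumption, a point 100 m downwind sees the same cloud edge 10 s later, while a point displaced purely across the wind direction sees no lag.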

Here's a single demo comparing the PDFs and CDFs of clear-sky index for the real and the synthetic data.
Image
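For anyone wanting to reproduce this kind of comparison, a rough sketch of an empirical CDF check is below. It is not the code behind the figure; the arrays are random placeholders standing in for measured and synthetic clear-sky index, and the helper names are hypothetical.

```python
import numpy as np


def ecdf(samples, grid):
    """Fraction of samples less than or equal to each grid value."""
    samples = np.sort(np.asarray(samples, dtype=float))
    return np.searchsorted(samples, grid, side="right") / samples.size


def max_cdf_gap(measured_kt, synthetic_kt, n_grid=201):
    """Largest absolute gap between two empirical CDFs on a shared grid."""
    grid = np.linspace(0.0, 1.5, n_grid)
    return float(np.max(np.abs(ecdf(measured_kt, grid) - ecdf(synthetic_kt, grid))))


# Placeholder data only: substitute real measured and synthetic series here.
rng = np.random.default_rng(0)
measured_kt = np.clip(rng.normal(0.8, 0.2, 5000), 0.0, 1.5)
synthetic_kt = np.clip(rng.normal(0.8, 0.2, 5000), 0.0, 1.5)
gap = max_cdf_gap(measured_kt, synthetic_kt)
```

A small gap only says the marginal distributions agree; it says nothing about whether the temporal or spatial structure is realistic.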

@cwhanse
Member

cwhanse commented Mar 28, 2025

I think this method is insufficiently validated for inclusion in pvlib, in particular, because its likely use in pvlib is to extrapolate a point irradiance measurement to a field of irradiance and to accept each field point as realistic. This is beyond the aims of the developers, and the paper admits as much (emphasis added):

"However, when timeseries are sampled at hundreds of locations (corresponding to the hundreds of different transformer locations on the feeder), the aggregate output is much smoother and looks more realistic. As described in section Error! Reference source not found., we have ongoing test to evaluate the need for accurate distributed PV inputs. For analysis such as voltage regulator tap changes, it may not be important that a single customer be accurately portrayed because the regulator will only see the aggregate output of several PV systems."

@jranalli
Contributor Author

I certainly agree with that description of the method's validation and can see the potential for misinterpretation of what the spatial field really represents.

@mikofski
Member

Hi @jranalli, I might have an alternate method for synthesizing high-frequency data. I think it's an implementation of a popular algorithm. I'll send it to you to see what you think.

@jranalli
Contributor Author

jranalli commented Mar 29, 2025 via email

@mikofski
Member

mikofski commented Apr 1, 2025

Hi Joe, the method I have was written by Patrick Mathiesen based on:

This code is probably proprietary, but I can post the other references and see if we can piece it together without reading the source code. That would be better for pvlib as referenced implementations anyway.

As I understand it, the statistics are extracted from a climatically similar high frequency dataset with sufficient duration to characterize the statistics. Then Markov chains are used to generate hourly data from the monthly totals that matches the statistics of the reference high frequency data and the monthly total of the provided data set. I'm sure there's more nuance to it than that, and sorry if I am ignorantly stating the obvious.
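To make the idea concrete, here is a generic first-order Markov chain sketch, assuming the clear-sky index is discretized into bins. It is not Patrick Mathiesen's code or the referenced algorithm; the function names, bin count, placeholder reference series, and the final mean rescaling (a crude stand-in for matching a monthly total) are all assumptions.

```python
import numpy as np


def fit_transition_matrix(states, n_states):
    """Estimate row-stochastic transition probabilities from a state sequence."""
    counts = np.zeros((n_states, n_states))
    np.add.at(counts, (states[:-1], states[1:]), 1.0)
    row_sums = counts.sum(axis=1, keepdims=True)
    safe_sums = np.where(row_sums > 0, row_sums, 1.0)
    # States never visited in the reference fall back to a uniform row.
    return np.where(row_sums > 0, counts / safe_sums, 1.0 / n_states)


def simulate_chain(P, n_steps, start, rng):
    """Draw a state sequence from transition matrix P."""
    out = np.empty(n_steps, dtype=int)
    out[0] = start
    for t in range(1, n_steps):
        out[t] = rng.choice(P.shape[0], p=P[out[t - 1]])
    return out


rng = np.random.default_rng(42)
n_states = 6
bin_edges = np.linspace(0.0, 1.2, n_states + 1)
bin_centers = 0.5 * (bin_edges[:-1] + bin_edges[1:])

# Placeholder for a climatically similar high-frequency reference series.
reference_kt = np.clip(rng.normal(0.7, 0.25, 2000), 0.0, 1.2)
ref_states = np.clip(np.digitize(reference_kt, bin_edges) - 1, 0, n_states - 1)

P = fit_transition_matrix(ref_states, n_states)
synthetic_states = simulate_chain(P, 24, ref_states[0], rng)
synthetic_kt = bin_centers[synthetic_states]

# Crude stand-in for matching a coarse total: rescale to a target mean.
target_mean = 0.65
synthetic_kt = synthetic_kt * target_mean / synthetic_kt.mean()
```

A real implementation would likely need a constraint that preserves the coarse total without distorting the distribution, which simple rescaling does not guarantee.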
