Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add VDI profiles based on lpagg #50

Draft
wants to merge 25 commits into
base: v0.2dev
Choose a base branch
from
Draft

Conversation

uvchik
Copy link
Member

@uvchik uvchik commented Apr 12, 2022

The basic code is taken from https://github.com/jnettels/lpagg but most of the code has been revised for modular use and to speed up the code. The lpagg package contains various functionality but only the VDI-profiles and reading the TRY data from DWD is used.

To test the new profiles use the vdi_profile_example in the examples section. The parameter of the houses are added within a loop. I used this to multiply the number of houses to check how long it would take to model e.g. 1000 houses. 1000 houses on an hourly base will take less than a minute, which shows that the draft is at least fast enough 🏎️

Related to #36

Install this branch to test the code:

pip install https://github.com/oemof/demandlib/archive/features/add-vdi-from-lpagg

It is the first draft, I am open for concrete ideas.

  • add tests
  • add documentation
  • clean code

@uvchik uvchik self-assigned this Apr 12, 2022
@uvchik uvchik added this to the v0.2.0 milestone Apr 12, 2022
@jnettels
Copy link

jnettels commented Apr 13, 2022

@uvchik thanks very much, especially the speed looks promising! Not iterating through rows does indeed help a lot :-)
I was able to run your example. I have reduced it to a calculation for a single house, and I am still getting differences between my implementation and yours, even if they are very small. It comes down to me identifying e.g. 21. October 2017 in TRY04 as WWB, while your code yields WWH. The difference is the daily mean of the cloud cover (< or >5). This only happens for 4 days, I think.

I am not 100% sure yet, but I think it comes down to the following: At 15min frequency, Pandas df.resample("D").mean() will use the values between (including) 2017-10-21 00:00 and 2017-10-21 23:15 to give the mean of 2017-10-21. However, in the weather data, 2017-10-21 00:00 describes the weather conditions between 2017-10-20 23:00 and 2017-10-21 00:00. The value in 2017-10-21 00:00 therefore belongs to 2017-10-20. To put it more simply: By DWD definition, the weather data does not start at 2017-1-1 00:00, but 2017-1-1 01:00.
You seem to ignore that and simply load the weather data, starting at 2017-1-1 00:00, which may actually solve those problems. It may very well be that your solution is correct, and my many workarounds lead to incorrect results. Whenever I think about this stuff, I have to double and tripple check.

Some more stuff that might need consideration:

  • You seem to allow leap years, but that would lead to missing rows in the weather data. Setting vdi.Region(year=2020) currently throws ValueError: Length mismatch: Expected 8760 rows, received array of length 8784 and a more helpful message might be appropriate
  • The defining temperature for summer should be exposed to the user, with default 15°C (Tamb_heat_limit in my code)
  • If no weather data is given, the default weather file for the given TRY region should be loaded
  • The user should be able to override the used weather timeseries. (These could then be used for leap years, if desired.)
  • While using vdi.Region(holidays=holidays.country_holidays('DE', subdiv='NI')) from https://pypi.org/project/holidays/ already works (yay!), it could be actively encouraged in vdi_profile_example. I think that project is awesome.

There might be more stuff, but these are all minor issues I think. I need some time for more investigation, though, to see if the API fits all the needs that I can come up with.

@uvchik
Copy link
Member Author

uvchik commented Apr 13, 2022

  • The weather data seems to be fine to me now. The DWD data starts with 01-01-01 (Month-Day-Hour), while weather data is typically right stamped. So in my left stamped index the first value is 2017-01-01 00:00. For me a right stamped index seems to be counter intuitive for a daily index
  • My idea is to duplicate the nearest day with the same weekday to create the typical day for the 29th of February. If it is a Sunday I would use the last Sunday, if it is a weekday I would use the last weekday which would be the 28th or the 27th. I think this is good enough regarding all the other assumptions. (still a ToDo)
  • I am not sure whether we should offer this or not. Some of the factors are connected to the TRY region. I seems to be strange to use an arbitrary data set with this procedure. In that case one has to pass the weather data and the TRY region.
  • I use workalendar in my example, but feel free to use your favourite holiday package.

@pep8speaks
Copy link

pep8speaks commented Apr 13, 2022

Hello @uvchik! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2022-09-06 18:25:57 UTC

@uvchik uvchik linked an issue Apr 14, 2022 that may be closed by this pull request
@jnettels
Copy link

jnettels commented Apr 27, 2022

Here are the notes about what we discussed:

  • Make summer_temperature_limit (currently set in the ini file) an attribute of each house. As a consequence, different series of typical days have to be created for each unique temperature. These typical days then have to be correctly used for the correct houses
  • Allow manipulation of the weather data. Using the default weather data should be the default behaviour, but using your own weather data should also be possible. For example, vdi.Region could have a function get_weather() which would yield a DataFrame which can be manipulated and a function set_weather() which overwrites that default weather. This can be used before calling get_load_curve_houses().

Some more notes that came up while replacing my own implementation in lpagg with yours:

  • I think it could be more convenient for further processing if the DataFrame lc returned by vdi.region.get_load_curve_houses() had named column levels. Simplest would be lc.columns.set_names(['name', 'house_type', 'energy'], inplace=True) before returning it, but it might me more appropriate to apply the names at different concat() stages, like the following. However, this probably should be in line with the results returned from the existing BDEW implementation.
            load_curve_house.columns = pd.MultiIndex.from_product(
                [
                    [house["house_type"]],
                    load_curve_house.columns,
                ],
                names=['house_type', 'energy']
            )
            house_profiles[house["name"]] = load_curve_house
        return pd.concat(
            house_profiles.values(), keys=house_profiles.keys(), axis=1,
            names=['name']
        )
  • Another very general point: Currently the new module is simply named vdi, which is extremely general. Would vdi4655 not be more appropriate, in case other vdi norms are implemented in the future?
  • If I am not mistaken, TRY (as an attribute of each house) is unused and should be removed in the example script (since it is defined by the region instead)
  • When I implemented it, I made the house data a nested dictionary. You changed it to a list of dictionaries instead. In retrospect, I think both are not ideal. I later added an option to lpagg to read the house data from a table file. (See this example xlsx: https://github.com/jnettels/lpagg/blob/master/lpagg/examples/VDI_4655_houses_example.xlsx). When you start to define large lists of houses, a table is in my opinion the most intuitive data type. Therefore I would like to consider changing the argument houses of vdi.Region() to a DataFrame. Of course, you could internally convert it to a list of dictionaries immediately and not change your code. I am only talking about the interface.
  • The point above would also provide an opportunity to implement an empty houses DataFrame that contains all the required rows. If creating an object of vdi.Region() without the houses argument, you could afterwards ask for the empty DataFrame and fill it with your data? This way the required attributes of each house are actually documented. ...or is that too convoluted?
  • If I see correctly, the house attributes N_Pers and N_WE are currently unused. In VDI4655, they serve two purposes:
    -- Check validity: VDI only claims support for N_Pers <= 12 (for EFH) and N_WE <= 40 (for MFH)
    -- Provide default values for Q_TWW_a and W_a (annual energy for domestic hot water and electricity)
    If we are not using them here, they should drop from the house attributes. But I think we should use them. I have a separate function lpagg.VDI4655.get_annual_energy_demand() which takes the houses data, checks those requirements and fills in the default values if they were not provided by the user. After some clean-up, this could become an optional preprocessing step before putting houses into vdi.Region(). Or vdi.Region() does it internally, but that would be less transparent to the user.

(Please tell me if I should do some of those implementations myself, or if you prefer to do it yourself.)

Further discussion:

  • In your example script, the houses get the attributes copies and sigma. If I see correctly, these are not currently used and could therefore be deleted for now. But they are related to the concept of "generating simultaneity effects through time shifts based on random draws from a normal distribution". This could become a seperate issue and pull request, but I would like to discuss if this could have a place in demandlib as well

Comment on lines 550 to 559
load_curve_house.columns = pd.MultiIndex.from_product(
[
[house["house_type"]],
load_curve_house.columns,
]
)
house_profiles[house["name"]] = load_curve_house
return pd.concat(
house_profiles.values(), keys=house_profiles.keys(), axis=1
)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My suggestion for named levels:

            load_curve_house.columns = pd.MultiIndex.from_product(
                [
                    [house["house_type"]],
                    load_curve_house.columns,
                ],
                names=['house_type', 'energy']
            )
            house_profiles[house["name"]] = load_curve_house
        return pd.concat(
            house_profiles.values(), keys=house_profiles.keys(), axis=1,
            names=['name']
        )

Copy link
Member

@p-snft p-snft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see copyright issues because of the implementation of the standard:

Copyright:
Verein Deutscher Ingenieure e.V.
VDI Standards Department
VDI-Platz 1, 40468 Duesseldorf, Germany
Reproduced with the permission of the Verein Deutscher Ingenieure e.V.,
for non-commercial use only.

@@ -27,7 +27,7 @@
import demandlib.bdew as bdew
import demandlib.particular_profiles as profiles

# The following dictionary is create by "workalendar"
# The following dictionary has been created by "workalendar"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# The following dictionary has been created by "workalendar"
# The following dictionary is created by "workalendar"

I am a bit puzzled here. First, you write that the dictionary has been/ is created using workalendar but later you define the holidays yourself.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I am writing comments here... :-)
Since those lines are commented out, I think this is just to suggest how to use an alternative library, without creating a new dependency for the installation of demandlib. My only suggestion would be to also name https://pypi.org/project/holidays/ as another option. Personally, it took me much too long to stumble across it.

examples/vdi_profile_example.py Show resolved Hide resolved
Comment on lines 49 to 58
"N_Pers": 3,
"name": "EFH_{0}".format(n),
"N_WE": 1,
"Q_Heiz_a": 6000,
"TRY": 4,
"copies": 24,
"house_type": "EFH",
"sigma": 4,
"Q_TWW_a": 1500,
"W_a": 5250,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Language: "EFH", "WE", "Heiz" and "TWW" are German, all others should work in English. (First, I misinterpreted WE as weekend.)

Comprehensibility: What does W_a mean?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has bugged me, too. The reason is sticking to the notation in the norm itself. Even though there is an English version of VDI 4655, that one still uses the German notation. To name a few:
Q: Heat
W: Electrical work
TWW: Domestic hot water
TT: Typical day
a: Year
WE: Flats

Therefore W_a is the electrical work of one year in kWh.

While this may not be ideal, I'd still argue that sticking to the source notation is very important, at least for the internal code. Making the public API English is something that might be worth considering, as a compromise.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, it might be better to use English abbreviations in the code.

src/demandlib/vdi/regions.py Outdated Show resolved Hide resolved
@jnettels
Copy link

I took the liberty to make two changes based on my previous suggestions: Named column levels and the ability to set my own weather file path. These changes help me to use demandlib within lpagg.

I think now the most important remaining issue is the ability to define the summer_temperature_limit for each house. My other suggestions are just for convenience.

The temperature limit is connected to the efficiency status of the
house.
@uvchik uvchik force-pushed the features/add-vdi-from-lpagg branch from 85d4768 to 7cccf52 Compare June 21, 2022 10:32
@jnettels
Copy link

I have not tested it yet, but thanks for adding the temperature limits. Though, since you removed my solution to define a custom weather file, I need to ask: Was it rude (or just inconvenient) of me to push to this branch, or did you not agree with my approach?

Copy link
Member

@p-snft p-snft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the verbose inline documentation explaining what the code does. There is just one location where I found a missing piece of information.

More general, I am still not sure about the usage of German abbreviations. It might or might not be advised. (As the English version of the norm is using them, it can at leats be justified.)

Comment on lines 1 to 2
from .dwd_try import find_try_region # noqa: F401
from .regions import Region # noqa: F401
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
from .dwd_try import find_try_region # noqa: F401
from .regions import Region # noqa: F401
from .dwd_try import find_try_region
from .regions import Region
__all__ == [
"find_try_region",
"Region",
]


Returns
-------

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing documentation of the return value.

Copy link

@jnettels jnettels left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should use the default temperature limits if they are not defined by the user.
Also, I added my idea for changing the weather data as a suggestion.

src/demandlib/vdi/regions.py Outdated Show resolved Hide resolved
Comment on lines +547 to +548
summer=house["summer_temperature_limit"],
winter=house["winter_temperature_limit"],
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
summer=house["summer_temperature_limit"],
winter=house["winter_temperature_limit"],
summer=house.get("summer_temperature_limit", 15),
winter=house.get("winter_temperature_limit", 5),

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know how to best improve this, but it feels wrong to define the default values at two places in the code...

seasons=None,
holidays=None,
houses=None,
resample_rule=None,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am also adding the code snippets for changing the weather file.

Suggested change
resample_rule=None,
resample_rule=None,
file_weather=None,

dictionary need to have the following keys.
resample_rule : str
Time interval to resample the profile e.g. 1H (1 hour) or 15min.
The value will be passed to the pandas resample method.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The value will be passed to the pandas resample method.
The value will be passed to the pandas resample method.
file_weather : str (optional)
Path to a 'test reference year' (TRY) weather file by German DWD
(Deutscher Wetterdienst). If None, the file fitting the given
try_region will be loaded.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As long as you adhere to the TRY format, the actual file does not need to be provided by DWD.

self._try_region = try_region

self._year = year
self.weather = None
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
self.weather = None
self.weather = None
self.file_weather = file_weather

Comment on lines +196 to +201
fn_weather = os.path.join(
os.path.dirname(__file__),
"resources_weather",
"TRY2010_{:02d}_Jahr.dat".format(self._try_region),
)
self.weather = dwd_try.read_dwd_weather_file(fn_weather)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fn_weather = os.path.join(
os.path.dirname(__file__),
"resources_weather",
"TRY2010_{:02d}_Jahr.dat".format(self._try_region),
)
self.weather = dwd_try.read_dwd_weather_file(fn_weather)
if self.file_weather is None:
self.file_weather = os.path.join(
os.path.dirname(__file__),
"resources_weather",
"TRY2010_{:02d}_Jahr.dat".format(self._try_region),
)
self.weather = dwd_try.read_dwd_weather_file(self.file_weather)

Copy link
Member

@p-snft p-snft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seems to be old code that has been commented out instead of being deleted.

dictionary need to have the following keys.
resample_rule : str
Time interval to resample the profile e.g. 1H (1 hour) or 15min.
The value will be passed to the pandas resample method.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As long as you adhere to the TRY format, the actual file does not need to be provided by DWD.

Comment on lines +403 to +404
# typtage_combinations = settings["typtage_combinations"]
# houses_list = settings["houses_list_VDI"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this old code that has been commented out instead of being deleted?

Comment on lines +419 to +424
# if settings.get("zero_summer_heat_demand", None) is not None:
# # Reduze the value of 'F_Heiz_TT' to zero.
# # For modern houses, this eliminates the heat demand in summer
# energy_factors_df.loc[
# (slice(None), slice(None), "F_Heiz_TT"), ("SWX", "SSX")
# ] = 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this old code that has been commented out instead of being deleted?

Comment on lines +571 to +572
# df = pd.concat([df_typ, mytime], axis=1).ffill()
# df.drop(df.head(1).index, inplace=True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this old code that has been commented out instead of being deleted?

Comment on lines +576 to +578
# The typical day calculation inherently does not add up to the
# desired total energy demand of the full year. Here we fix that:
# houses_dict = {h["name"]: h for h in self.houses}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this old code that has been commented out instead of being deleted?

@p-snft p-snft marked this pull request as draft July 5, 2022 07:11
@p-snft
Copy link
Member

p-snft commented Jul 5, 2022

As this is a draft, I marked it as such.

Side note: I really like the feature. It's very handy to have generation of time series for SH, DHW, and electricity consumption at hand.

@uvchik
Copy link
Member Author

uvchik commented Jul 12, 2022

For me it is a bit rude to write your code into my approach without any comment. I already wrote improvements and got dozen of merge conflicts. If I know somebody else is working on the code I would push more often to avoid such problems.

I do have a different approach for the weather file so I think the best way would be to finish this PR with cleaning the code and adding some tests and then start a new PR for the weather data. That makes it easier to track changes and to focus the discussion. One reason for that is that the original approach is connect to the TRY data. I am fine with adding your own weather data but it must be clear that you really should know what you are doing.

@jnettels
Copy link

Please accept my apologies for pushing into this branch, then. While I did leave a small comment about it here, I see how it creates unnecessary merge conflicts. Afterwards I learned how to use the "Suggested Change" feature and will stick to that in the future.
Doing a new PR for the weather data is fine with me, too.

Comment on lines 49 to 58
def read_dwd_weather_file(weather_file_path=None, try_region=None):
"""Read and interpolate 'DWD Testreferenzjahr' files."""
if weather_file_path is None:
weather_file_path = os.path.join(
os.path.dirname(__file__),
"resources_weather",
"TRY2010_{:02d}_Jahr.dat".format(try_region),
)
# The comments in DWD files before the header are not commented out.
# Thus we have to search for the line with the header information:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Carefully reading the description, it is clear that we are talking about Testreferenzjahr files. However, there are a lot of types of DWD files (i.e. recordings from weather stations), maybe we could abbreviate it to TRY files like DWD does.

@jnettels
Copy link

Hi there,

it has been quite a while, but I am still depending on this implementation with its awesome speed.
Is there any chance to get it over the finish line and merge it? Is there something I can do to help?

The changes I require to make it work for me personally are added in my fork and affect the ability to load other weather data. Just leaving the link here for reference:
https://github.com/jnettels/demandlib/tree/features/add-vdi-from-lpagg


from demandlib import vdi

# The following dictionary has been created by "workalendar"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# The following dictionary has been created by "workalendar"
# The following dictionary can also be created by "workalendar"

Maybe this is more clear.

@p-snft
Copy link
Member

p-snft commented Jun 28, 2023

@jnettels: Honestly, I'm not really deep into demandlib. I'm willing to review and give feedback. If you have an improved version, you can file a PR against this branch.

@jnettels
Copy link

jnettels commented Jul 3, 2023

Thanks for your answer. My "improvements" concerning the weather data were already suggested, once with a rude (still sorry about that!) push to this PR, once with the "suggestion" feature. @uvchik already mentioned that he had a different approach for the weather data, so I do not think creating a new PR for that is helpful now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cooperate with lpagg
4 participants