A project analyzing statistics on damaged logging (wood) in Germany using Python.

HokageM/DamagedLoggingAnalyzer


DamagedLoggingAnalyzer

A project analyzing statistics on damaged logging (wood) in Germany using Python.

This is my individual project for the module Research Software Engineering in SS24. The task was to analyze a dataset from genesis.destatis using Python and to find interesting aspects and potential questions that could be explored using this data.

If you are only interested in the results, please jump to the section Damaged Logging.

Installation

git clone git@github.com:HokageM/DamagedLoggingAnalyzer.git
cd DamagedLoggingAnalyzer
pip install .

Usage

Commandline

usage: damaged_logg_analyzer [-h] [--version] [--calculate-most-dangerous-reasons] [--plot-reason-dependencies] [--plot-owner-dependencies] [--plot-temporal-dependencies-all] [--predict]
                             [--out-dir OUT_DIR]
                             CSV

Analyzes the data about damaged wood from the CSV file.

positional arguments:
  CSV                   Path to the CSV containing the statistic.

options:
  -h, --help            show this help message and exit
  --version             show program's version number and exit
  --calculate-most-dangerous-reasons
                        Calculates the most dangerous reasons for each specie.
  --plot-reason-dependencies
                        Create combined plots for each specie and owner combinations for all reasons. Plots will be saved in: output-path/Specie/all_reasons/Owner/plot.png.
  --plot-owner-dependencies
                        Create combined plots for each specie and reason combinations for all owners. Plots will be saved in: output-path/Specie/Reason/all_owners/plot.png.
  --plot-temporal-dependencies-all
                        Create plots for temporal dependencies for each specie, reason and owner combination. Plots will be saved in: output-path/Specie/Reason/Owner/plot.png. Note: use --plot-
                        owner-dependencies and --plot-reason-dependencies.
  --predict             Estimates a death count function using Polynomial Regression with K-Fold Cross Validation to predict the numbers for the year 2024. Plots will be saved in: output-
                        path/Prediction_2024/Specie/Reasons/Owner/plot.png.Note: will created a new model for every specie, reason and owner combination.
  --out-dir OUT_DIR     Output directory for the plots.
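
The help text above describes where each option saves its plots. As a rough sketch of that directory layout, the paths could be assembled with pathlib. The helper `plot_path` is hypothetical and not part of the package; the folder names `all_reasons` and `all_owners` are taken from the help text.

```python
from pathlib import Path


def plot_path(out_dir, specie, reason, owner):
    """Build out_dir/Specie/Reason/Owner/plot.png (hypothetical helper)."""
    return Path(out_dir) / specie / reason / owner / "plot.png"


# One plot per specie, reason and owner combination
single = plot_path("plots", "Insgesamt", "Sonstiges", "Insgesamt")

# Combined plots use a fixed folder name in place of the combined dimension
all_reasons = plot_path("plots", "Insgesamt", "all_reasons", "Insgesamt")
all_owners = plot_path("plots", "Insgesamt", "Sonstiges", "all_owners")

print(single.as_posix())
```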

Library

The following classes are available:

from damagedlogginganalyzer.DamagedLoggingAnalyzer import DamagedLoggingAnalyzer
from damagedlogginganalyzer.CSVAnalyzer import CSVAnalyzer
from damagedlogginganalyzer.Plotter import Plotter
from damagedlogginganalyzer.WoodOracle import WoodOracle
from damagedlogginganalyzer.Oracle import Oracle

The classes CSVAnalyzer and Oracle are independent of this project and can be reused in other projects. In contrast, the classes DamagedLoggingAnalyzer, WoodOracle and Plotter are specific to this project and its dataset.

Damaged Logging

Note: You can optionally read the notebook story_of_this_project.ipynb to have an interactive experience with the project.

What is the dataset about?

The dataset contains statistics on forest wood harvesting due to various kinds of damage in Germany, listed by year, wood species group, and forest ownership type.

Each entry specifies the volume of wood harvested (in cubic meters) due to different causes such as wind/storm, snow/ice damage, insects, drought, and other reasons.
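
To make the structure of one entry concrete, here is a minimal sketch of a record type and a reader. The column names, the semicolon delimiter, and the sample values are assumptions for illustration only; the layout of the real CSV may differ, and this is not the project's CSVAnalyzer.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class DamageRecord:
    year: int
    specie: str    # wood species group
    owner: str     # forest ownership type
    reason: str    # cause of damage
    volume: float  # harvested volume, in the dataset's unit


# Hypothetical long-format excerpt with made-up values
SAMPLE = """year;specie;owner;reason;volume
2006;Insgesamt;Insgesamt;Sonstiges;10
2006;Insgesamt;Insgesamt;Wind/ Sturm;25
"""


def read_records(text):
    """Parse semicolon-separated rows into DamageRecord objects."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    return [
        DamageRecord(
            int(row["year"]),
            row["specie"],
            row["owner"],
            row["reason"],
            float(row["volume"]),
        )
        for row in reader
    ]


records = read_records(SAMPLE)
```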

Why is this dataset interesting?

Here are some interesting aspects and potential questions that could be explored using this data:

Temporal Trends: How has the damage-caused wood harvesting changed over the years? Are there increasing trends in certain types of damage like drought or insects, possibly linked to climate change? What will the year 2024 look like in terms of the volume of wood harvested due to different causes?

Damage Types: Which type of damage causes the most wood harvesting? How do different types of forests compare in their vulnerability to specific damage types?

Forest Management: Are there noticeable differences in wood harvesting due to damage across different forest ownership types (e.g., state-owned vs. privately-owned forests)? This could reflect different management practices and their effectiveness.

Potential Questions (will not be answered in this project):

Impact of Extreme Weather: Are there particular years with exceptionally high damage that could be correlated to extreme weather events or climate anomalies?

Economic and Ecological Impact: What might be the economic impact of these losses? How might these harvesting activities due to damages impact the ecological balance and biodiversity in these forests?

Temporal Trends

Question: How has the damage-caused wood harvesting changed over the years?

I created individual plots of the total volume of wood harvested over the years, one for each combination of reason (drought, wind/storm, snow, insects, miscellaneous, total), owner, and wood species group. Here are some examples. The other plots can be found under plots/specie/reason/owner/plot.png or can be generated with the following command:

damaged_logg_analyzer data/DamagedLoggingWoodFixTable.csv --plot-temporal-dependencies-all --out-dir plots

Deaths of Oak and Red Oak caused by insects and owned by Insgesamt over the years in Germany:

Deaths of Pine caused by insects and owned by Insgesamt over the years in Germany:

Additionally, I created combined plots for the different wood species groups. Note: Below, I only show the combined plots for the owner Insgesamt (all owners in total). The other plots can be found under plots/specie/all_reasons/owner/plot.png or can be generated with the following command:

damaged_logg_analyzer data/DamagedLoggingWoodFixTable.csv --plot-reason-dependencies --out-dir plots

Total Oak and Red Oak deaths over the years in Germany:

Total Beech and Hardwood deaths over the years in Germany:

Total Spruce deaths over the years in Germany:

Total Pine deaths over the years in Germany:

Total tree deaths over the years in Germany:

Overall, the deaths of all species fluctuate between high and low values from year to year for most reasons. However, the total deaths of all species are increasing over the years, especially for the reason Sonstiges (miscellaneous), which could cover fires, diseases, or other causes. The dataset does not clearly define Sonstiges.

Question: Are there increasing trends in certain types of damage like drought or insects, possibly linked to climate change?

Overall, the deaths of all species due to drought, snow, and insects can be modeled as linear (nearly constant) functions. See the Prediction_2024 directory for the estimated functions.

The deaths of all species due to wind/storm fluctuate between high and low values from year to year, with very high numbers in 2006 and 2018.

The deaths of all species due to Sonstiges (miscellaneous) can be modeled quite well with a polynomial function and are increasing over the years.

The total deaths of all species are increasing over the years.

Question: What will the year 2024 look like in terms of the volume of wood harvested due to different causes?

I used polynomial regression with k-fold cross validation to predict the volume of wood harvested due to different causes in the year 2024. Note: The prediction is based on the data from 2006 to 2023. All plots can be found in the Prediction_2024 directory or can be generated with the following command:
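
The general idea of polynomial regression with k-fold cross validation can be sketched in plain Python as follows. This is not the project's Oracle or WoodOracle code: the function names, the candidate degrees (1 to 3), the five interleaved folds, and the synthetic quadratic series are all assumptions made for illustration. Years are rescaled to roughly [-1, 1] to keep the normal equations well conditioned.

```python
def solve(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_poly(ts, ys, degree):
    """Least-squares polynomial coefficients, lowest order first."""
    n = degree + 1
    a = [[sum(t ** (i + j) for t in ts) for j in range(n)] for i in range(n)]
    b = [sum(y * t ** i for t, y in zip(ts, ys)) for i in range(n)]
    return solve(a, b)


def evaluate(coeffs, t):
    """Evaluate the polynomial at t."""
    return sum(c * t ** i for i, c in enumerate(coeffs))


def cv_error(ts, ys, degree, k=5):
    """Mean squared error over k interleaved folds (no shuffling)."""
    total, count = 0.0, 0
    for fold in range(k):
        train = [i for i in range(len(ts)) if i % k != fold]
        test = [i for i in range(len(ts)) if i % k == fold]
        coeffs = fit_poly([ts[i] for i in train], [ys[i] for i in train], degree)
        total += sum((evaluate(coeffs, ts[i]) - ys[i]) ** 2 for i in test)
        count += len(test)
    return total / count


def scale(year):
    """Map the years 2006..2023 to the interval [-1, 1]."""
    return (year - 2014.5) / 8.5


years = list(range(2006, 2024))
ts = [scale(y) for y in years]
# Synthetic quadratic series, NOT values from the dataset
volumes = [100 + 20 * t + 30 * t * t for t in ts]

# Pick the degree with the lowest cross-validated error, then refit on all data
best_degree = min((1, 2, 3), key=lambda d: cv_error(ts, volumes, d))
coeffs = fit_poly(ts, volumes, best_degree)
prediction_2024 = evaluate(coeffs, scale(2024))
```

Cross validation is what keeps the degree choice honest here: a higher degree always fits the training years at least as well, so the degree is compared on held-out years instead.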

damaged_logg_analyzer data/DamagedLoggingWoodFixTable.csv --predict --out-dir path/to/output

The deaths of all species due to Sonstiges (miscellaneous) can be modeled quite well with a polynomial function, e.g. for the Beech and Hardwood species group:

In some cases a prediction does not make sense, because the deaths do not follow a polynomial function and depend on other factors, e.g. deaths caused by insects:

Deaths due to natural causes like wind/storm, snow, and drought can be modeled as linear functions, e.g. for the Beech and Hardwood species group. Note: One needs to handle the outliers in the data, e.g. the deaths in 2018 for the Beech and Hardwood species group due to wind/storm. This can be done by using a Ridge Regression model. Those outliers come from special events like storms, which are not predictable with the current model.
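
One way to read the remark about Ridge Regression: the penalty shrinks the fitted slope, so a single extreme year tilts the line less than it would in an ordinary least-squares fit. The sketch below fits a straight line with an L2 penalty on the slope only. The function `ridge_line`, the penalty value, and the flat series with one spike in 2018 are assumptions for illustration, not the project's model or data.

```python
def ridge_line(xs, ys, lam):
    """Fit y = intercept + slope * x, penalising slope**2 with weight lam.

    With lam = 0 this is ordinary least squares. The intercept is not penalised.
    """
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / (sxx + lam)
    intercept = y_mean - slope * x_mean
    return intercept, slope


years = list(range(2006, 2024))
# Made-up flat series with one spike in 2018, NOT values from the dataset
volumes = [100.0 if y == 2018 else 10.0 for y in years]

ols_intercept, ols_slope = ridge_line(years, volumes, lam=0.0)
ridge_intercept, ridge_slope = ridge_line(years, volumes, lam=500.0)

print(f"OLS slope: {ols_slope:.3f}, ridge slope: {ridge_slope:.3f}")
```

Note that a suitable penalty depends on the scale of the x values, so in practice it would be tuned, for example with the same cross validation used for the polynomial degree.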

Damage Types

Question: Which type of damage causes the most wood harvesting?

This is solved by calculating the maximum damage for each type of wood species group, which can be done with the following command:

damaged_logg_analyzer data/DamagedLoggingWoodFixTable.csv --calculate-most-dangerous-reasons

The maximum damage for each type of wood species group is:

| Specie | Reason | Amount |
| --- | --- | --- |
| Eiche und Roteiche | Wind/ Sturm | 2048 |
| Buche und sonstiges Laubholz | Wind/ Sturm | 9124 |
| Kiefer und Lärche | Wind/ Sturm | 19806 |
| Fichte und Tanne und Douglasie und sonstiges Nadelholz | Sonstiges | 208725 |
| Insgesamt | Sonstiges | 218181 |
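
A table like the one above can be produced by keeping, for each species group, the entry with the largest volume. The following is a minimal sketch that assumes "maximum damage" means the largest single entry per species group; the function name and the input rows are made up for illustration and do not come from the package or the dataset.

```python
# (specie, reason, volume) rows with made-up volumes, for illustration only
rows = [
    ("Eiche und Roteiche", "Wind/ Sturm", 30),
    ("Eiche und Roteiche", "Sonstiges", 12),
    ("Insgesamt", "Wind/ Sturm", 80),
    ("Insgesamt", "Sonstiges", 95),
    ("Insgesamt", "Sonstiges", 40),
]


def most_dangerous_reasons(rows):
    """Return {specie: (reason, volume)} for the largest entry per specie."""
    best = {}
    for specie, reason, volume in rows:
        if specie not in best or volume > best[specie][1]:
            best[specie] = (reason, volume)
    return best


result = most_dangerous_reasons(rows)
for specie, (reason, volume) in result.items():
    print(specie, reason, volume)
```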

Question: How do different types of forests compare in their vulnerability to specific damage types?

This is also solved by the following command:

damaged_logg_analyzer data/DamagedLoggingWoodFixTable.csv --calculate-most-dangerous-reasons --plot-temporal-dependencies-all

Analyzing the plots shows that Buche und sonstiges Laubholz and Eiche und Roteiche have fewer deaths for every reason than Kiefer und Lärche and Fichte und Tanne und Douglasie und sonstiges Nadelholz. The death counts for Kiefer und Lärche and Fichte und Tanne und Douglasie und sonstiges Nadelholz are up to 10 times higher than those for Buche und sonstiges Laubholz and Eiche und Roteiche.

Forest Management

Question: Are there noticeable differences in wood harvesting due to damage across different forest ownership types (e.g., state-owned vs. privately-owned forests)? This could reflect different management practices and their effectiveness.

This is solved by the following command:

damaged_logg_analyzer data/DamagedLoggingWoodFixTable.csv --plot-owner-dependencies

Analyzing the plots shows that the deaths due to different reasons are similar across the different owners, so the owner does not seem to have a big impact on the death count. Here are some examples; the other plots can be found under plots/specie/reason/all_owners/plot.png:

Deaths of Oak and Red Oak caused by insects and owned by Insgesamt over the years in Germany:

Deaths of Pine caused by insects and owned by Insgesamt over the years in Germany:

Contact Information

If you have any questions, suggestions, or concerns about this project, feel free to contact me:

Statistic about Damaged Logging

From: genesis.destatis

Statistic Number: 41261-0003

Citation

If you use this software, please cite it as described in the CITATION.cff file.

License

This project is licensed under the MIT License - see the LICENSE file for details.