docs: add cd diagrams for m4 forecasting benchmark (#1338)

* chore: add cd diagrams as svg
* docs: display m4 bench forecasting cd diagrams
aimclub · Sep 30, 2024 · 11a28d3
1 parent 5be9119
commit 11a28d3
Showing 7 changed files with 6,570 additions and 0 deletions.
diff --git a/docs/source/benchmarks/forecasting.rst b/docs/source/benchmarks/forecasting.rst
@@ -128,6 +128,36 @@ Here, as per usual, the best value is indicated in bold for each row (for each s
     | repeat_last | 2.008   | 5.365   | 7.796   | 7.379     | 9.066   | 5.158   |
     +-------------+---------+---------+---------+-----------+---------+---------+
 
+The custom critical difference (CD) diagrams below visualize pairwise statistical significance, detected with the Wilcoxon-Holm method, for each level of seasonality:
+
+
+Daily M4 (SMAPE):
+
+.. image:: ./img_benchmarks/cd-daily-m4-forecasting.svg
+
+Weekly M4 (SMAPE):
+
+.. image:: ./img_benchmarks/cd-weekly-m4-forecasting.svg
+
+Monthly M4 (SMAPE):
+
+.. image:: ./img_benchmarks/cd-monthly-m4-forecasting.svg
+
+Quarterly M4 (SMAPE):
+
+.. image:: ./img_benchmarks/cd-quarterly-m4-forecasting.svg
+
+Yearly M4 (SMAPE):
+
+.. image:: ./img_benchmarks/cd-yearly-m4-forecasting.svg
+
+All seasons M4 (SMAPE):
+
+.. image:: ./img_benchmarks/cd-overall-m4-forecasting.svg
+
+
+These diagrams indicate that FEDOT's results are statistically better than those of TimeGPT and LAGLLAMA and statistically indistinguishable from those of NBEATS and AutoGluon.
+
 
 The statistical analysis on SMAPE metrics was conducted using the Friedman test.
 The results confirm that FEDOT's time series forecasting ability is statistically indistinguishable from