How to reproduce the results as shown in the paper? #102
Hi Chronos team,
Thanks for the great work! I would like to know how we can reproduce the results shown in the paper, e.g., Figure 4. Could we also have some evaluation scripts/code to facilitate model evaluation?
I am aware that some code snippets are provided in #75. But, as mentioned there, "While many datasets in GluonTS have the same name as the ones used in the paper, they may be different from the evaluation in the paper in crucial aspects such as prediction length and number of rolls." I therefore wonder if we can have scripts to help us reproduce the results.
Comments
@liu-jc We are working towards releasing the evaluation datasets. Once we have that, I will inform you. Please keep an eye out for the update.
Hi @abdulfatir, thanks for the reply! I also noticed that the README mentions "Fixed an off-by-one error in bin indices in the output_transform". Does that mean that if we are using the checkpoint on Huggingface, it is the version from before this bug was fixed?
@liu-jc The issue was not in the model checkpoints themselves but in the inference code. The decoded values were shifted by one, which led to some avoidable discrepancy.
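To make the nature of such a bug concrete, here is a minimal, hypothetical sketch. The names (`bin_centers`, `n_special_tokens`) and sizes are illustrative assumptions for a tokenizer that maps values to quantization bins, not the actual Chronos internals:

```python
import numpy as np

# Illustrative only: these names and sizes are assumptions, not the
# actual Chronos implementation.
bin_centers = np.linspace(-15.0, 15.0, 4094)  # value each quantization bin decodes to
n_special_tokens = 2  # e.g., special tokens (PAD/EOS) occupy the first ids

def decode_buggy(token_ids: np.ndarray) -> np.ndarray:
    # Off by one: every decoded value lands in the neighboring bin.
    return bin_centers[token_ids - n_special_tokens + 1]

def decode_fixed(token_ids: np.ndarray) -> np.ndarray:
    # Correct mapping from token id back to its bin center.
    return bin_centers[token_ids - n_special_tokens]

token_ids = np.array([2, 100, 2048])
# Every decoded value is shifted by exactly one bin width (~0.0073 here).
print(decode_buggy(token_ids) - decode_fixed(token_ids))
```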
@abdulfatir thanks for the answer. So, may I confirm that if we use the latest code for inference, it should not have any problems?
@liu-jc Yes.
Update: We have just open-sourced the datasets used in the paper (thanks @shchur!). Please check the updated README. We have also released an evaluation script and backtest configs to compute the WQL and MASE numbers as reported in the paper. Please follow the instructions in this README to evaluate on the in-domain and zero-shot benchmarks.
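For orientation, here is a minimal, unofficial sketch of the two metrics; the released evaluation script is the authoritative implementation, and its aggregation across series and quantile levels may differ in detail:

```python
import numpy as np

def mase(y_true, y_pred, y_train, m=1):
    # Mean Absolute Scaled Error: forecast MAE scaled by the in-sample
    # MAE of the seasonal-naive forecast with season length m.
    scale = np.mean(np.abs(y_train[m:] - y_train[:-m]))
    return np.mean(np.abs(y_true - y_pred)) / scale

def wql(y_true, quantile_preds, quantile_levels):
    # Weighted Quantile Loss: pinball losses at each predicted quantile
    # level, normalized by the total absolute value of the target.
    total = 0.0
    for q, y_q in zip(quantile_levels, quantile_preds):
        diff = y_true - y_q
        total += 2 * np.sum(np.maximum(q * diff, (q - 1) * diff))
    return total / (len(quantile_levels) * np.sum(np.abs(y_true)))
```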
Hi @abdulfatir, thanks for the effort in releasing the datasets and evaluation scripts! These are tremendously helpful for the community. A few more questions I would like to ask:
Hi @abdulfatir, thanks for the clarification. Having the training corpus would be super helpful. I wonder if you have a plan to provide some code snippets/a guide on how to use `ArrowWriter` to prepare custom datasets for training?
@liu-jc We have no such plan at the moment due to other priorities. My guess is that you should be able to use `ArrowWriter` from GluonTS directly. Something like this should work:

```python
from pathlib import Path
from typing import List, Optional, Union

import numpy as np
from gluonts.dataset.arrow import ArrowWriter


def convert_to_arrow(
    path: Union[str, Path],
    time_series: Union[List[np.ndarray], np.ndarray],
    start_times: Optional[Union[List[np.datetime64], np.ndarray]] = None,
    compression: str = "lz4",
):
    # Convert a collection of 1-D time series into a GluonTS-compatible
    # arrow file, one record per series with "start" and "target" fields.
    if start_times is None:
        # Set an arbitrary start time
        start_times = [np.datetime64("2000-01-01 00:00", "s")] * len(time_series)
    assert len(time_series) == len(start_times)

    dataset = [
        {"start": start, "target": ts} for ts, start in zip(time_series, start_times)
    ]
    ArrowWriter(compression=compression).write_to_file(
        dataset,
        path=path,
    )


if __name__ == "__main__":
    # Generate 20 random time series of length 1024
    time_series = [np.random.randn(1024) for _ in range(20)]

    # Convert to GluonTS arrow format
    convert_to_arrow("./noise-data.arrow", time_series=time_series)
```
Hi @abdulfatir, thanks for pointing out this snippet. I asked that question because I had previously tried to directly replace the datasets without converting them first. Another question: if we want to stick with GluonTS datasets for pretraining, do you have any suggestions?
In general though, I would recommend a machine with large RAM for pretraining. |
Closing in favor of #150