ChatTS
is a Time Series Multimodal LLM focuses on Understanding and Reasoning about time series, much like what vision/video/audio-MLLMs do.
This repo provides code, datasets and model for ChatTS
(VLDB' 25): ChatTS: Aligning Time Series with LLMs via Synthetic Data for Enhanced Understanding and Reasoning.
The Web Demo of ChatTS-14B is available at
You can upload a .csv
file to the Web Demo to analyze your own data (Example: ts_example.csv)
ChatTS is a Multimodal LLM built natively for time series as a core modality:
- âś… Native support for multivariate time series
- âś… Flexible input: Supports multivariate time series with different lengths and flexible dimensionality
- âś… Conversational understanding + reasoning:
Enables interactive dialogue over time series to explore insights about time series - âś… Preserves raw numerical values:
Can answer statistical questions, such as "How large is the spike at timestamp t?" - âś… Easy integration with existing LLM pipelines, including support for vLLM.
Here is an example of a ChatTS application, which allows users to interact with a LLM to understand and reason about time series data:
Check out the Case Studies section for more real-world applications and sample conversations.
- Web Demo (New):
- Installation:
- Deployment:
- Training:
- Evaluation:
- 2025/07/24: A Web Demo of ChatTS is available at
- 2025/07/03: The quantized version of ChatTS:
ChatTS-14B-GPTQ-4bit
is available at HuggingFace! - 2025/04/29: The data generation code has been updated. You can generate the training datasets of
ChatTS
with the updated code now. Please refer to Training Data Generation. We have also open-sourced the implementation for all the baseline models! Please refer to Evaluation. - 2025/04/16: ChatTS has been accepted by VLDB '25! We have released the training datasets for ChatTS. You can also use the code in this repo to generate data manually and do the model training.
- 2025/01/01: We have released a new version of
ChatTS
model, with enhanced CoT and question answering capability. Checkfor more information.
- 2024/12/30: A experimental version of
vLLM
support for ChatTS is available! Check demo_vllm.py for more information. (Note: This version is still under development and may not be stable.) We have also updated the ChatTS model implementation, which supportskv_cache
andAutoProcessor
now.
Resource | Link | Description |
---|---|---|
ChatTS Paper (VLDB' 25) | arXiv:2412.03104 | Paper |
ChatTS-14B Model | Hugging Face Model | Model weights |
ChatTS-14B-GPTQ-Int4 Model | Hugging Face Model | Quantization |
Training Datasets | Hugging Face Dataset | Synthetic training set |
Evaluation Datasets | Zenodo | Real & synthetic eval data |
Training Scripts | ChatTS-Training | Training Scripts for ChatTS |
This repository provides several toolkits for generating synthetic data with the approaches introduced in ChatTS
, as well as the evaluation code and evaluation datasets for reproduction.
We have provided:
- Toolkits for generating synthetic time series data and the corresponding attribues:
chatts/ts_generator/generate.py
. - Example code for generating a training dataset with pre-defined templates:
demo/generate_template_qa.py
, which can be further used as seed QAs for TSEvol. - Example code for generating a training dataset with LLMs:
chatts/sft/generate_llm_qa
, which can be further used as seed QAs for TSEvol. - Code implementation for
TSEvol
with the generated seed QAs:chatts/sft/generate_tsevol_dataset.py
. - Code implementation for evaluation:
evaluation/
. - Simple demos for inference:
demo/demo_hf.ipynb
anddemo/demo_vllm.py
. - A trained
ChatTS
model (fine-tuned based on a modified version of QWen2.5-14B-Instruct) at.
- Training datasets:
- Evaluations datasets:
.
- Training scripts: ChatTS-Training.
- Basic requirements for model inference:
python>=3.11
,deepspeed
,vllm==0.8.5
,torch==2.6.0
,flash-attn
(refer torequirements.txt
). - Download the evaluation datasets from Zenodo and put them under
evaluation/dataset
(evaluation/dataset/dataset_a.json
andevaluation/dataset/dataset_b.json
). - Download the trained model weights from HuggingFace, extract it and put all the extracted files under
ckpt/
(ckpt/config.json
, etc). - Note:
ChatTS
is trained based on a 14B-sized base model, so you need to ensure that you have a GPU with sufficient memory for inference. Additionally, due to the model's requirements,Flash-Attention
(https://github.com/Dao-AILab/flash-attention) is essential, so you need to ensure that your GPU meets the installation requirements for Flash-Attention. Recommended GPUs: A100/A800.
- Following the steps in
Installation
to download the trainedChatTS
model and place it underckpt
. - The ChatTS model can be loaded directly using the
transformers
library. Refer todemo/demo_hf.ipynb
for more information. - About
sp
Encoding. To facilitate the input of variable-length batch time series, we adopted a method namedsp
encoding when encoding the time series. For each time series data point, an additional numerical value of 1.0 is added as a mask. For convenience, we have a Processor which can be loaded withAutoProcessor
intransformers
to normalize and convert the time series and text (Value-Preserved Time Series Encoding). Please refer todemo/demo_hf.ipynb
for more information about their usage. - An example usage of ChatTS (with
HuggingFace
):
from transformers import AutoModelForCausalLM, AutoTokenizer, AutoProcessor
import torch
import numpy as np
# Load the model, tokenizer and processor
model = AutoModelForCausalLM.from_pretrained("./ckpt", trust_remote_code=True, device_map=0, torch_dtype='float16')
tokenizer = AutoTokenizer.from_pretrained("./ckpt", trust_remote_code=True)
processor = AutoProcessor.from_pretrained("./ckpt", trust_remote_code=True, tokenizer=tokenizer)
# Create time series and prompts
timeseries = np.sin(np.arange(256) / 10) * 5.0
timeseries[100:] -= 10.0
prompt = f"I have a time series length of 256: <ts><ts/>. Please analyze the local changes in this time series."
# Apply Chat Template
prompt = f"<|im_start|>system\nYou are a helpful assistant.<|im_end|><|im_start|>user\n{prompt}<|im_end|><|im_start|>assistant\n"
# Convert to tensor
inputs = processor(text=[prompt], timeseries=[timeseries], padding=True, return_tensors="pt")
# Move to GPU
inputs = {k: v.to(0) for k, v in inputs.items()}
# Model Generate
outputs = model.generate(**inputs, max_new_tokens=300)
print(tokenizer.decode(outputs[0][len(inputs['input_ids'][0]):], skip_special_tokens=True))
- Note: Due to training data limitations, the recommended time series length for ChatTS ranges from 64 to 1024, with a maximum of 30 series in one input. Currently, time series that are too short (i.e., shorter than 64) may not be correctly recognized by ChatTS. We will improve this in future updates.
Since vLLM lacks native support for the ChatTS
model, we have provided a patch to enable vLLM to support inference. Therefore, before using vLLM to load the model, please make sure that the code includes: import chatts.vllm.chatts_vllm
to register the ChatTS model in vLLM. Please refer to the following steps to use vLLM to load ChatTS:
- Install
vllm==0.8.5
(please ensure that you have installed the exact version as vLLM's multimodal APIs change frequently). - Please refer to
demo/demo_vllm.py
for detailed usage methods.
A simple example of using vLLM to load ChatTS:
import chatts.vllm.chatts_vllm
from vllm import LLM, SamplingParams
# Load the model
language_model = LLM(model="./ckpt", trust_remote_code=True, max_model_len=ctx_length, tensor_parallel_size=1, gpu_memory_utilization=0.95, limit_mm_per_prompt={"timeseries": 50})
# Create time series (np.ndarray) and prompts (chat_templated applied)
ts1, ts2 = ...
prompt = ...
# Model Inference
outputs = language_model.generate([{
"prompt": prompt,
"multi_modal_data": {"timeseries": [ts1, ts2]}
}], sampling_params=SamplingParams(max_tokens=300))
- Alignment Data Generation.
- Generate alignment data. Please first set
local_llm_path
inconfig/datagen_config.yaml
to your local LLM for the following steps. Runbash scripts/generate_align_datasets.sh
to generate the alignment datasets. - (Optional) To generate your own data. We provide an example to generate QA through Template. You can modify it according to your own task. Use
python3 -m demo.generate_template_qa
to generate a training dataset with pre-defined templates.
- SFT Data Generation
- Seed QA Generation with LLMs. Use
python3 -m chatts.sft.generate_llm_qa
to generate a seed QA dataset with LLMs. This dataset and all the previous generated alignment datasets will be used as input of TSEvol. - TSEvol. You need a downloaded LLM that can be loaded with
vLLM
to perform this step. The datasets generated in the previous step will be used as seed QAs in TSEvol, so please make sure that you have successfully generated the previous datasets before running TSEvol. Runpython3 -m chatts.sft.generate_tsevol_dataset
to generate the TSEvol dataset. - IFT Dataset. Please generate the
alignment datasets
first! The generation of theIFT (Instruction Following)
datasets takes the labels of the alignment datasets as input and output the ift dataset through a set of predefined templates. Runpython3 -m chatts.sft.generate_ift_dataset
after all the alignment datasets have been successfully generated (in step 1).
SEQ_LEN
of generated time series can be configured by setting theSEQ_LEN
parameter inconfig/datagen_config.yaml
. In ChatTS, most of the time series are with length of 256 in the training and evaluation datasets. We also mix time series with other lengths during training (by settingseq_len=null
) for ChatTS to adapt to time series with different lengths.
-
Download the evaluation datasets and the ChatTS-14B checkpoints first (refer to Installation).
-
Option 1: vLLM Inference (Experimental). After installing
vllm==0.8.5
, please set the correct path to your evaluation dataset and the ChatTS model, and the number of GPUs in thechatts/utils/inference_tsmllm_vllm.py
. Then, run the following command to do the model inference:python3 -m chatts.utils.inference_tsmllm_vllm
-
Option 2: DeepSpeed Inference (Slower). We provide a simple script for inference of ChatTS (
chatts/utils/inference_tsmllm_deepspeed.py
) withdeepspeed
. After installingdeepspeed
, please set the correct path to your evaluation dataset and the ChatTS model in the script. Then, run the following command to do the model inference:deepspeed --num_gpus [YOUR_NUM_GPUS] --master_port 12345 chatts/utils/inference_tsmllm_deepspeed.py
You should find the inference results under exp/
folder, which will be further used for evaluation.
- Generate the inference results of
ChatTS
by following the steps in Deepspeed Model Inference for Evaluation. - Install
ragas==0.1.9
(https://github.com/explodinggradients/ragas), which is used for evaluating the inductive reasoning results. - Set the
API_KEY
andOPENAI_URL
inevaluation/ragas/config/config.toml
(Refer to https://platform.openai.com/docs/api-reference). - Run
python3 -m evaluation.evaluate_tsmllm_models
to evaluateChatTS
(make sure you have done the model inference before). - We also provide code to evaluate the performance of the baseline models. After setting your
API_KEY
andOPENAI_URL
inevaluation/evaluate_gpt_text_*.py
, run the commandpython3 -m evaluation.evaluate_gpt_text_models
,python3 -m evaluation.evaluate_gpt_vision_models
andpython3 -m evaluation.evaluate_gpt_agent_models
to obtain the evaluation results of the text/vision/agent-based LLM models. (Note: For Agent-Based Models, you need to first train the rocket classifier with:python3 -m evaluation.train_rocket_tsc
first.)
- We provide a simple script for fine-tuning your own TS-MLLM models: https://github.com/xiezhe-24/ChatTS-Training (modified based on LLaMA-Factory). Refer to this repository for more details.
ChatTS-14B-GPTQ-4bit
is available at HuggingFace. You can use vLLM
to load it.
You can deploy a OpenAI API compatible server with vLLM
. Refer to #11 for details.
We have provided a user-friendly time series generator to generate attribute pools and corresponding time series. You can refer to demo/demo_ts_generator.ipynb to learn how to use it.
- Evaluation datasets (Dataset A and B) are available at https://doi.org/10.5281/zenodo.14349206. Each sample in these datasets has several parts:
timeseries
, which is the time series data itself;question
, the query related to the time series;answer
, the text-form standard answers provided for your reference only;attributes
, the structured labels used to evaluate results; andability_types
, which indicates the types of tasks the question involves. Please pay special attention to this: To cut down on evaluation costs, we've combined different questions that pertain to the same time series into onequestion
. We use numbering to tell these different questions apart. So, when you look at the evaluation dataset, the actual count of questions might be more than the number oftimeseries
entries. Another thing to note is that some tasks in inductive reasoning and alignment are grouped together in one question. This is because inductive reasoning tasks often require explaining the physical meanings of time series attributes. - The
MCQ2
dataset is sourced from a third-party and is open-source. However, due to licensing restrictions, we are unable to provide it within this repository. You can directly download it via https://github.com/behavioral-data/TSandLanguage.
In
ChatTS
, we mainly focus on Understanding and Reasoning about time series, just like what vision/video/audio-MLLMs do, rather than conducting time series prediction, anomaly detection and classification tasks.
You can try more application scenarios of ChatTS by modifying the time series and the text of questions in demo_hf.ipynb
!
- QWen (https://github.com/QwenLM/Qwen2.5)
- DeepSpeed (https://www.deepspeed.ai/)
- RAGAS (https://github.com/explodinggradients/ragas)
- vLLM (https://github.com/vllm-project/vllm)
- Flash Attention (https://github.com/Dao-AILab/flash-attention)
If you discover a potential security issue in this project, or think you may have discovered a security issue, we ask that you notify Bytedance Security via our security center or vulnerability reporting email.
Please do not create a public GitHub issue for a security vulnerability.
This project is licensed under the MIT License.
@article{xie2024chatts,
title={ChatTS: Aligning Time Series with LLMs via Synthetic Data for Enhanced Understanding and Reasoning},
author={Xie, Zhe and Li, Zeyan and He, Xiao and Xu, Longlong and Wen, Xidao and Zhang, Tieying and Chen, Jianjun and Shi, Rui and Pei, Dan},
journal={arXiv preprint arXiv:2412.03104},
year={2024}
}