DataMan: Data Manager for Pre-training Large Language Models
If you use the code in your research, please cite:
@article{peng2025dataman,
  title={{DataMan}: Data Manager for Pre-training Large Language Models},
  author={Peng, Ru and Yang, Kexin and Zeng, Yawen and Lin, Junyang and Liu, Dayiheng and Zhao, Junbo},
  journal={arXiv preprint arXiv:2502.19363},
  year={2025}
}