pengr/DataMan

DataMan: Data Manager for Pre-training Large Language Models [Paper]

Citation

If you use the code in your research, please cite:

@article{peng2025dataman,
  title={{DataMan}: Data Manager for Pre-training Large Language Models},
  author={Peng, Ru and Yang, Kexin and Zeng, Yawen and Lin, Junyang and Liu, Dayiheng and Zhao, Junbo},
  journal={arXiv preprint arXiv:2502.19363},
  year={2025}
}

About

Code for the ICLR'25 paper "DataMan: Data Manager for Pre-training Large Language Models".
