What's changed
In this version, we added more models and released more experiments and evaluation results. Several features have been expanded, including datasets and APIs. The changes are summarized as follows:
- Released evaluation results for multiple models, covering base, LoRA and QLoRA variants, in docs/eval_llm_result.md. Models include llama2-7b/13b, codellama2-7b/13b, baichuan2-7b/13b and Qwen7b/14b. Mainly completed by @wangzaistone, @zhanghy-sketchzh, @junewgl, @Jian1273, @zhoufan and @qidanrui.
- Completed fine-tuning development and training of codellama-13b in the project and released SOTA weights, mainly by @wangzaistone, with assistance from @Jian1273 and @zhanghy-sketchzh.
- Updated evaluation methods and results on the test suite dataset, mainly by @wangzaistone, @JBoRu and @junewgl.
- Refactored the log output code structure, mainly by @wangzaistone and @zhanghy-sketchzh.
- Added support for including other datasets during training, mainly by @Jian1273, assisted by @wangzaistone and @John-Saxon.
- Added and improved DeepSpeed support, by @Jian1273, @wangzaistone and @zhanghy-sketchzh.
- Added a workflow, by @qidanrui.
- Added APIs covering the entire process of data processing, training, prediction and evaluation (#144), by @qidanrui.
- Summarized contributors' model results as baselines in the API, by @junewgl and @qidanrui.
- Added Poetry-based installation and usage instructions, by @qidanrui.
- Added chatglm3 model support and released training and evaluation results, by @wangzaistone.
- Continued maintaining the Chinese and English documentation, including adding related parameters, experimental metrics and data descriptions, and running grammar checks, mainly by @wangzaistone and @qidanrui, assisted by @zhanghy-sketchzh and @Jian1273.
- Clarified future development directions, including interfaces, by @csunny.
Thanks to partners @John-Saxon and @SimonChuZZ for starting to contribute code (#83, #122): @John-Saxon contributed the initial multi-round dialog data code, and @SimonChuZZ contributed the initial code for database-assisted training data construction.
Thanks to @qidanrui for starting to submit code in this version and improving multiple parts of the project.