Releases · eosphoros-ai/DB-GPT-Hub
V0.3.0
What's Changed
In this version, we added more models and released additional experiments and evaluation results. Several features have been expanded, including datasets, APIs, and more, summarized as follows:
- Released evaluation results for multiple models under base, LoRA, and QLoRA fine-tuning, updated in docs/eval_llm_result.md. Models include llama2-7b/13b, codellama-7b/13b, baichuan2-7b/13b, and Qwen-7b/14b; mainly completed by @wangzaistone, @zhanghy-sketchzh, @junewgl, @Jian1273, @zhoufan, and @qidanrui.
- Completed fine-tuning development and training of codellama-13b within the project and released SOTA weights; mainly by @wangzaistone, with assistance from @Jian1273 and @zhanghy-sketchzh.
- Updated evaluation methods and results on the test-suite dataset; mainly by @wangzaistone, @JBoRu, and @junewgl.
- Restructured the logging output code; mainly by @wangzaistone and @zhanghy-sketchzh.
- Added support for including additional datasets during training; mainly by @Jian1273, assisted by @wangzaistone and @John-Saxon.
- Added and improved DeepSpeed support (see the configuration sketch at the end of this release's notes); by @Jian1273, @wangzaistone, and @zhanghy-sketchzh.
- Added a workflow, by @qidanrui.
- Added API interfaces covering the entire data-processing, training, prediction, and evaluation pipeline (#144); a usage sketch follows this list. By @qidanrui.
- Summarized contributors' models as baseline results in the API interface (@junewgl and @qidanrui).
- Added Poetry-based installation and usage instructions, by @qidanrui.
- Added chatglm3 model support and released training and evaluation results, by @wangzaistone.
- Continuously maintained the Chinese and English documentation, including related parameters, experimental metrics, data descriptions, grammar checks, etc.; mainly by @wangzaistone and @qidanrui, assisted by @zhanghy-sketchzh and @Jian1273.
- Clarified future development directions, including interfaces; by @csunny.
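For illustration, here is a minimal sketch of how the end-to-end API from #144 might be driven from Python. The module paths, function names, and argument keys below are assumptions inferred from the pipeline stages listed above, not verified signatures; consult the repository README for the actual interface.

```python
# Hypothetical end-to-end usage of the Python API added in #144.
# Module paths, function names, and argument keys are assumptions
# for illustration; see the repository README for the real interface.
from dbgpt_hub.data_process import preprocess_sft_data
from dbgpt_hub.train import start_sft
from dbgpt_hub.predict import start_predict
from dbgpt_hub.eval import start_evaluate

# 1. Build SFT examples from a raw Text-to-SQL dataset such as Spider.
preprocess_sft_data(
    data_folder="dbgpt_hub/data",
    data_info=[{"data_source": "spider", "train_file": ["train_spider.json"]}],
)

# 2. Fine-tune a base model (LoRA/QLoRA hyperparameters elided).
start_sft({
    "model_name_or_path": "codellama/CodeLlama-13b-Instruct-hf",
    "quantization_bit": 4,
    "output_dir": "dbgpt_hub/output/adapter",
})

# 3. Generate SQL predictions with the trained adapter.
start_predict({
    "model_name_or_path": "codellama/CodeLlama-13b-Instruct-hf",
    "checkpoint_dir": "dbgpt_hub/output/adapter",
    "predicted_out_filename": "pred_sql.sql",
})

# 4. Score predictions against gold SQL (execution accuracy).
start_evaluate({
    "input": "dbgpt_hub/output/pred/pred_sql.sql",
    "gold": "dbgpt_hub/data/eval_data/gold.txt",
    "etype": "exec",
})
```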
Thanks to @John-Saxon for starting to contribute code in #83 and #122 (initial multi-round dialogue data code), and to @SimonChuZZ (initial code for the database-assisted training-data construction functionality).
Thanks to @qidanrui for starting to contribute code in this version and improving multiple parts of the project.
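On the DeepSpeed item above: the project's fine-tuning stack is built on Hugging Face transformers, where DeepSpeed is typically enabled by handing a config to TrainingArguments. The ZeRO stage-2 settings below are a generic, illustrative sketch, not the configuration shipped with this release.

```python
# Illustrative DeepSpeed setup via Hugging Face TrainingArguments.
# The ZeRO stage-2 settings are a generic example, not the exact
# configuration shipped with this release.
from transformers import TrainingArguments

ds_config = {
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
    "bf16": {"enabled": "auto"},
    "zero_optimization": {
        "stage": 2,                              # shard optimizer state and gradients
        "offload_optimizer": {"device": "cpu"},  # push optimizer state to CPU RAM
    },
}

training_args = TrainingArguments(
    output_dir="dbgpt_hub/output",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,
    deepspeed=ds_config,  # accepts an inline dict or a path to a JSON file
)
```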
V0.2.0
What's Changed
In this version, we refactored the entire project codebase and made related optimizations, achieving execution accuracy on the Spider evaluation set that surpasses GPT-4 (compared against third-party evaluation results).
- Initial code refactoring by @csunny
- Framework for code refactoring confirmed by @wangzaistone and @csunny
- Code refinement and iteration for configs, data, data_process, eval, llm_base, predict, train, etc., primarily developed by @wangzaistone, with assistance from @Jian1273, @zhanghy-sketchzh, and @junewgl
- Relevant training experiments and results by @wangzaistone and @Jian1273
v0.0.2
What's Changed
- add LoRA fine-tuning for llama/llama2 @zhanghy-sketchzh
- support QLoRA fine-tuning for llama2 (see the sketch after this list) @zhanghy-sketchzh
- add a solution for multi-GPU training. @wangzaistone
- add prediction demo. @1ring2rta
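As referenced above, here is a minimal QLoRA setup with transformers, bitsandbytes, and peft: the base model is loaded with 4-bit NF4 quantization and only the low-rank adapters are trained. The model name and hyperparameters are illustrative, not the exact recipe behind this release.

```python
# Minimal QLoRA setup with transformers + bitsandbytes + peft.
# Model name and hyperparameters are illustrative, not the exact
# recipe behind this release.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # QLoRA: 4-bit quantized base weights
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # do compute in bf16
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],    # attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)  # only adapter weights are trainable
model.print_trainable_parameters()
```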
v0.0.1
What's Changed
- init readme.md and readme.zh.md @csunny @zhanghy-sketchzh
- Spider + QLoRA + Falcon SFT scaffold @zhanghy-sketchzh
- readme.md grammar correction @csunny