Large Model Fine-Tuning (大模型微调)
What is the LoRA that Stable Diffusion enthusiasts keep talking about?
https://zhuanlan.zhihu.com/p/610031713
Paper reading: LoRA (Low-Rank Adaptation of Large Language Models)
https://zhuanlan.zhihu.com/p/611557340
Alpaca-LoRA: train your own ChatGPT on a consumer-grade GPU!
https://zhuanlan.zhihu.com/p/614913980
https://github.com/tloen/alpaca-lora
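
A minimal sketch of the low-rank adaptation idea behind the LoRA entries above, written in plain PyTorch rather than any particular library (class and attribute names are illustrative, not taken from the linked repos):

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base linear layer plus a trainable low-rank update: W x + (alpha/r) * B A x."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                     # the pretrained weight stays frozen
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no change at step 0
        self.scaling = alpha / r

    def forward(self, x):
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling

layer = LoRALinear(nn.Linear(1024, 1024))
print(sum(p.numel() for p in layer.parameters() if p.requires_grad))  # only the A/B factors are trainable

Only the low-rank factors receive gradients, which is why the trainable parameter count (and optimizer state) drops so sharply compared with full fine-tuning.
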
LoCon's improvements over LoRA
https://zhuanlan.zhihu.com/p/612133434
QLoRA: an efficient LLM fine-tuning method that can tune a 65B model with 48 GB of memory; the tuned Guanaco model reaches 99.3% of ChatGPT's performance
https://zhuanlan.zhihu.com/p/632229856
Hands-on test notes for QLoRA
https://zhuanlan.zhihu.com/p/632398047
Open-source Guanaco and the QLoRA technique behind it: cutting the GPU memory needed to fine-tune a 65B model from over 780 GB to under 48 GB, with results approaching GPT-4 (technical deep dive)
https://zhuanlan.zhihu.com/p/632236718
QLoRA: Efficient Finetuning of Quantized LLMs
https://arxiv.org/abs/2305.14314
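
A rough sketch of the QLoRA recipe described above (a 4-bit NF4-quantized base model with LoRA adapters on top), using the Hugging Face transformers/peft/bitsandbytes stack; the checkpoint name and target module list are placeholders, and exact arguments can vary across library versions:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store base weights in 4 bits
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,  # dequantize to bf16 for the matmuls
)

model = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-65b",                 # placeholder checkpoint
    quantization_config=bnb_config,
    device_map="auto",
)

lora_config = LoraConfig(
    r=64, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # illustrative module names
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()          # only the adapters train; the 4-bit base stays frozen
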
How to automatically identify high-quality instruction data in a dataset: using the IFD metric
https://zhuanlan.zhihu.com/p/658128530
https://arxiv.org/abs/2308.12032
https://github.com/MingLiiii/Cherry_LLM
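
The IFD (Instruction-Following Difficulty) score from the entries above is, at its core, a ratio of two losses on the same answer, with and without the instruction as context; a minimal sketch of that idea (the helper names are mine, not Cherry_LLM's API):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def answer_nll(model, tok, prompt, answer):
    """Mean negative log-likelihood of the answer tokens, conditioned on the prompt (possibly empty)."""
    enc = tok(prompt + answer, return_tensors="pt")
    n_prompt = len(tok(prompt)["input_ids"])
    labels = enc["input_ids"].clone()
    labels[:, :n_prompt] = -100                     # score only the answer tokens
    with torch.no_grad():
        return model(**enc, labels=labels).loss.item()

def ifd_score(model, tok, instruction, answer):
    # higher ratio -> the instruction helps less, i.e. the sample is "harder to follow"
    return answer_nll(model, tok, instruction, answer) / answer_nll(model, tok, "", answer)

tok = AutoTokenizer.from_pretrained("gpt2")         # small stand-in model for illustration
model = AutoModelForCausalLM.from_pretrained("gpt2")
print(ifd_score(model, tok, "Translate to French: cat\n", "chat"))

The linked article and repo use this kind of ratio to rank samples and keep a high-value subset for instruction tuning.
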
Firefly (流萤): a Chinese conversational large language model
https://www.shangyexinzhi.com/article/7399473.html
[OpenLLM 007] LLM alchemy: small parameter counts moving big models; a long-form, comprehensive guide to PEFT (parameter-efficient fine-tuning) techniques
https://zhuanlan.zhihu.com/p/625502729
A getting-started tutorial for DeepSpeed
https://zhuanlan.zhihu.com/p/630734624
An introduction to DeepSpeed
https://zhuanlan.zhihu.com/p/624412809
DeepSpeed's ZeRO series: taking GPU memory optimization all the way
https://basicv8vc.github.io/posts/zero/
https://www.deepspeed.ai/training/
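
A minimal sketch of what a ZeRO stage-3 DeepSpeed configuration can look like and how it is handed to deepspeed.initialize; the numbers, the optimizer section, and the CPU-offload choice are illustrative, and the full schema is in the official docs linked above:

import torch
import deepspeed

model = torch.nn.Sequential(                 # stand-in for a real transformer
    torch.nn.Linear(1024, 4096), torch.nn.GELU(), torch.nn.Linear(4096, 1024)
)

ds_config = {
    "train_micro_batch_size_per_gpu": 4,
    "gradient_accumulation_steps": 8,
    "bf16": {"enabled": True},
    "optimizer": {"type": "AdamW", "params": {"lr": 1e-5}},
    "zero_optimization": {
        "stage": 3,                               # shard optimizer state, gradients, and parameters
        "offload_optimizer": {"device": "cpu"},   # optional offload to save GPU memory
        "overlap_comm": True,
    },
}

# launched with the deepspeed runner, e.g.: deepspeed --num_gpus 8 train.py
engine, optimizer, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config
)
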
LLM study notes: the DeepSpeed-MoE paper
https://zhuanlan.zhihu.com/p/670968683
Notes on the paper "Universal Language Model Fine-tuning for Text Classification"
https://blog.csdn.net/weixin_44815943/article/details/123870564
Microsoft joins the open-source small-model game! Trained with OpenAI's ChatGPT and GPT-4, it crushes today's strongest open-source models
https://zhuanlan.zhihu.com/p/639212768
O-LoRA: a solution to catastrophic forgetting in LLMs
https://zhuanlan.zhihu.com/p/663034986
Chinese LLaMA & Alpaca LLMs: vocabulary expansion + pre-training + instruction fine-tuning
https://zhuanlan.zhihu.com/p/631360711
Instruction Tuning with Human Curriculum
https://arxiv.org/abs/2310.09518
EMO: Earth Mover Distance Optimization for Auto-Regressive Language Modeling
https://www.semanticscholar.org/paper/EMO%3A-Earth-Mover-Distance-Optimization-for-Language-Ren-Wu/36b88e6cf9cd5fa4809602f365287cb2201f8350
InstructionGPT-4: A 200-Instruction Paradigm for Fine-Tuning MiniGPT-4
https://arxiv.org/abs/2308.12067
OpenChat: Advancing Open-source Language Models with Mixed-Quality Data
https://arxiv.org/abs/2309.11235
Microsoft releases Orca 2, teaching small language models how to reason through careful coaching!
https://zhuanlan.zhihu.com/p/670516349
Orca 2: Teaching Small Language Models How to Reason
https://arxiv.org/abs/2311.11045
[林知/术] Notes on full-parameter fine-tuning of LLaMA-2-70B
https://zhuanlan.zhihu.com/p/666613055
FSDP(Fully Sharded Data Parallel)
https://blog.csdn.net/studyeboy/article/details/133888212
Fully Sharded Data Parallel: faster AI training with fewer GPUs
https://engineering.fb.com/2021/07/15/open-source/fsdp/
Facebook introduces the FSDP data-parallel training algorithm: training models an order of magnitude larger, more efficiently and with fewer GPUs
https://aif.amtbbs.org/index.php/2021/08/23/383/
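
A small sketch of PyTorch's open-source FSDP wrapper discussed above; it assumes the script is launched with torchrun so the rank/world-size environment variables exist, and the toy model stands in for a real transformer:

import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

dist.init_process_group("nccl")                   # reads rank/world size from the torchrun environment
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

model = torch.nn.Sequential(
    torch.nn.Linear(1024, 4096), torch.nn.GELU(), torch.nn.Linear(4096, 1024)
).cuda()

model = FSDP(model)                               # parameters, gradients, and optimizer state get sharded
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
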
Reading notes on the Megatron-LM paper
https://zhuanlan.zhihu.com/p/631030756
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
https://arxiv.org/abs/1909.08053
[Close reading of classics] A detailed analysis of the Megatron paper and code (1)
https://zhuanlan.zhihu.com/p/366906920
Megatron-LM: training multi-billion-parameter language models with model parallelism
https://zhuanlan.zhihu.com/p/644493033
A summary of the third Megatron-LM paper: Sequence Parallelism & Selective Checkpointing
https://zhuanlan.zhihu.com/p/522198082
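
The core trick in Megatron-style tensor parallelism from the papers above is splitting a weight matrix across devices and reassembling the partial results; a toy single-process illustration of the column split (no real inter-GPU communication here):

import torch

x = torch.randn(2, 8)                  # a batch of activations
w = torch.randn(8, 16)                 # the full weight matrix

# Column parallelism: each "device" holds half of the output columns.
w0, w1 = w[:, :8], w[:, 8:]
y_parallel = torch.cat([x @ w0, x @ w1], dim=-1)   # the all-gather of partial outputs

assert torch.allclose(x @ w, y_parallel, atol=1e-5)
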
Paper interpretation series, part 4: Google's GPipe for training extremely large neural networks
https://zhuanlan.zhihu.com/p/113233933
GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism
https://arxiv.org/abs/1811.06965
SE-MoE: a scalable distributed MoE training and inference framework (Baidu)
https://blog.csdn.net/cold_code486/article/details/133683242
SE-MoE: A Scalable and Efficient Mixture-of-Experts Distributed Training and Inference System
https://arxiv.org/abs/2205.10034
GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding (paper translation and close reading)
https://zhuanlan.zhihu.com/p/672837901
How to go beyond data parallelism and model parallelism: starting from GShard
https://mp.weixin.qq.com/s?__biz=MzU5ODY2MTk3Nw==&mid=2247486137&idx=1&sn=fa429fd4a94a6b815199c9a294276f59&chksm=fe41848fc9360d99b46b20b8bce3e36d7ee981d5e3c64860fe925a0af97de701fde284c1f3a0&scene=21#wechat_redirect
GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding
https://arxiv.org/abs/2006.16668
Textbooks Are All You Need (close reading and translation)
https://zhuanlan.zhihu.com/p/673040548
Textbooks Are All You Need II: phi-1.5 technical report (close reading and translation)
https://zhuanlan.zhihu.com/p/673021932
How should large models construct or select high-quality data during instruction tuning? (answer by 刘聪NLP on Zhihu)
https://www.zhihu.com/question/623570103/answer/3224726082
WizardLM: giving large language models the ability to follow complex instructions
https://zhuanlan.zhihu.com/p/643162614
An exclusive interview with the WizardLM team, detailing the RLEIF algorithm behind WizardCoder/Math surpassing GPT-4/ChatGPT
https://it.sohu.com/a/715204130_121119001
[LLM] Ziya2: Data-centric Learning is All LLMs Need
https://zhuanlan.zhihu.com/p/665614074
Ziya2: Data-centric Learning is All LLMs Need
https://arxiv.org/abs/2311.03301
How to automatically select high-quality instruction data in the era of large models
https://zhuanlan.zhihu.com/p/672468811
MoDS: Model-oriented Data Selection for Instruction Tuning
https://arxiv.org/abs/2311.15653
DEITA: a data-efficient selection method for LLM instruction tuning
https://zhuanlan.zhihu.com/p/675928711
How to automatically select high-quality instruction-tuning data to feed to large models?
https://zhuanlan.zhihu.com/p/671340624
GPT-4 not working well for you? Auburn University proposes a prompt taxonomy and takes a different route to a prompt-design guide
https://zhuanlan.zhihu.com/p/644545992
Instead of evaluating models, Princeton has started evaluating prompts and proposes a prompt-evaluation framework
https://zhuanlan.zhihu.com/p/644546392
InstructEval: Systematic Evaluation of Instruction Selection Methods
https://arxiv.org/abs/2307.00259
What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning
https://arxiv.org/abs/2312.15685
Learning to Edit: Aligning LLMs with Knowledge Editing
https://arxiv.org/abs/2402.11905
I Learn Better If You Speak My Language: Enhancing Large Language Model Fine-Tuning with Style-Aligned Response Adjustments
https://arxiv.org/abs/2402.11192
[Less is more] Paper walkthrough of LIMA: Less Is More for Alignment
https://zhuanlan.zhihu.com/p/641934152
LIMA: Less Is More for Alignment
https://arxiv.org/abs/2305.11206
Learning or Self-aligning? Rethinking Instruction Fine-tuning
https://arxiv.org/abs/2402.18243
SelectIT: Selective Instruction Tuning for Large Language Models via Uncertainty-Aware Self-Reflection
https://arxiv.org/abs/2402.16705
Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning
https://arxiv.org/abs/2402.10110
With different learning rates, can LoRA squeeze out a bit more gain?
https://kexue.fm/archives/10001
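
The kexue.fm post above explores giving the two LoRA factors different learning rates; a tiny sketch using PyTorch parameter groups, assuming the adapter parameters are named lora_A / lora_B (the 16x ratio is only an example):

import torch
import torch.nn as nn

class TinyLoRA(nn.Module):             # stand-in for a LoRA-adapted layer
    def __init__(self, d=128, r=8):
        super().__init__()
        self.lora_A = nn.Parameter(torch.randn(r, d) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(d, r))

model = TinyLoRA()
a_params = [p for n, p in model.named_parameters() if "lora_A" in n]
b_params = [p for n, p in model.named_parameters() if "lora_B" in n]

optimizer = torch.optim.AdamW([
    {"params": a_params, "lr": 1e-4},   # base learning rate for the A factor
    {"params": b_params, "lr": 1.6e-3}, # a larger learning rate for the B factor, as the post discusses
])
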
Microsoft releases the open-source model WizardLM-2 8x22B: currently the best-performing open-source model, outclassing many open- and closed-source models
https://zhuanlan.zhihu.com/p/692725484
https://huggingface.co/alpindale/WizardLM-2-8x22B
https://web.archive.org/web/20240415221214/https://wizardlm.github.io/WizardLM2/
A practical look at LLM SFT in production: 10 questions, 10 answers
https://zhuanlan.zhihu.com/p/692892489
How Abilities in Large Language Models are Affected by Supervised Fine-tuning Data Composition
https://arxiv.org/abs/2310.05492
Paper: How Abilities in Large Language Models are Affected by Supervised Fine-tuning Data Composition
https://zhuanlan.zhihu.com/p/682569860
A walkthrough of LLM SFT data-selection methods: IFD, Superfiltering, MoDS, CaR, Nuggets, and LESS
https://zhuanlan.zhihu.com/p/692647330
Fine-tuning LLMs with LoRA and QLoRA: insights from hundreds of experiments
https://zhuanlan.zhihu.com/p/679172768
The 12 pain points of RAG, with solutions explained by a senior NVIDIA architect
https://zhuanlan.zhihu.com/p/706873537
Understanding language-model perplexity in depth
https://zhuanlan.zhihu.com/p/686808564
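
Perplexity, as the post above explains, is just the exponential of the mean token-level cross-entropy; a short sketch with a small Hugging Face model (gpt2 is only a convenient example):

import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

enc = tok("The quick brown fox jumps over the lazy dog.", return_tensors="pt")
with torch.no_grad():
    loss = model(**enc, labels=enc["input_ids"]).loss   # mean negative log-likelihood per token (nats)
print(math.exp(loss.item()))                             # perplexity = exp(mean NLL)
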
Token-Efficient Leverage Learning in Large Language Models
https://zhuanlan.zhihu.com/p/702811649
Token-Efficient Leverage Learning in Large Language Models
https://arxiv.org/abs/2404.00914
One line of code boosts LLM performance by 10%; developers call it a free lunch
https://www.163.com/dy/article/IHLEO52V0511DSSR.html
NEFTune: adding noise to embeddings improves instruction tuning!
https://zhuanlan.zhihu.com/p/674860732
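
The NEFTune trick from the two entries above adds scaled uniform noise to the input embeddings during fine-tuning only; a minimal sketch of the noise itself (alpha = 5 is one of the settings reported in the paper):

import torch

def neftune_noise(embeds: torch.Tensor, alpha: float = 5.0) -> torch.Tensor:
    """Add uniform noise to token embeddings, scaled by alpha / sqrt(seq_len * hidden_dim)."""
    seq_len, hidden = embeds.shape[-2], embeds.shape[-1]
    scale = alpha / (seq_len * hidden) ** 0.5
    return embeds + torch.empty_like(embeds).uniform_(-scale, scale)

embeds = torch.randn(1, 16, 32)        # (batch, seq_len, hidden)
noisy = neftune_noise(embeds)          # applied only at training time, never at inference
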
InsTag: an open-ended tag set for quantifying the diversity and complexity of SFT datasets
https://zhuanlan.zhihu.com/p/707449818
SFT Packing explained in detail
https://zhuanlan.zhihu.com/p/707329908
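
A simplified sketch of what SFT packing does, per the post above: greedily concatenating tokenized samples into fixed-length sequences so short examples stop wasting padding (real implementations also adjust attention masks and position ids so packed samples cannot attend to each other):

def pack_examples(token_lists, max_len=4096, eos_id=2):
    """Greedily concatenate tokenized samples, separated by EOS, into sequences of at most max_len."""
    packed, current = [], []
    for toks in token_lists:
        if current and len(current) + len(toks) + 1 > max_len:
            packed.append(current)
            current = []
        current = current + toks + [eos_id]
    if current:
        packed.append(current)
    return packed

print(pack_examples([[5, 6, 7], [8, 9], [10, 11, 12, 13]], max_len=6))
# -> [[5, 6, 7, 2], [8, 9, 2], [10, 11, 12, 13, 2]]
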
Data synthesis series I: DeepMind's synthetic-data survey "Best Practices and Lessons Learned on Synthetic Data for Language Models"
https://zhuanlan.zhihu.com/p/692294597
Data synthesis series II: Qwen2's SFT/post-training data synthesis methods
https://zhuanlan.zhihu.com/p/709981346
Data synthesis series III: Llama 3.1's post-training data synthesis methods
https://zhuanlan.zhihu.com/p/712416960
A brief discussion of training domain-specific models
https://zhuanlan.zhihu.com/p/711537210
Analyzing synthetic-data flaws and mitigation strategies: optimizing LLM training based on synthetic data
https://zhuanlan.zhihu.com/p/705784458
Good samples, twice the result for half the effort: using Sample Design Engineering to build better downstream fine-tuning samples for LLMs
https://zhuanlan.zhihu.com/p/693933168
Sample Design Engineering: An Empirical Study of What Makes Good Downstream Fine-Tuning Samples for LLMs
https://arxiv.org/abs/2404.13033
Exploring the interplay between LLM pre-training and fine-tuning
https://zhuanlan.zhihu.com/p/714896257
Amuro & Char: Analyzing the Relationship between Pre-Training and Fine-Tuning of Large Language Models
https://arxiv.org/abs/2408.06663
A small survey of methods for using LLMs to synthesize LLM training data
https://zhuanlan.zhihu.com/p/685432166
LLM math performance soars 168% in a work by a 14-person Microsoft team: the secret of synthetic data 2.0 revealed, with agents generating the teaching data
https://zhuanlan.zhihu.com/p/715334077
AgentInstruct: Toward Generative Teaching with Agentic Flows
https://arxiv.org/abs/2407.03502
A summary of LLM fine-tuning papers and methods
https://zhuanlan.zhihu.com/p/669645171
LLM fine-tuning (part 8): summary notes on SFT for Alignment
https://zhuanlan.zhihu.com/p/717553974
[NLP] [Large Models] SPIN: Self-Play fine-tuning converts weak models into strong models
https://zhuanlan.zhihu.com/p/683872342
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models
https://arxiv.org/abs/2401.01335
https://github.com/uclaml/SPIN/tree/main
The cross-entropy loss function
https://zhuanlan.zhihu.com/p/582071348
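
A tiny worked example of the cross-entropy loss from the last entry above: it is the negative log-probability that the softmax assigns to the true class, here computed both by hand and with the built-in:

import torch
import torch.nn.functional as F

logits = torch.tensor([[2.0, 0.5, -1.0]])            # unnormalized scores for 3 classes
target = torch.tensor([0])                            # index of the true class

manual = -F.log_softmax(logits, dim=-1)[0, target[0]] # -log p(true class)
builtin = F.cross_entropy(logits, target)
print(manual.item(), builtin.item())                  # the two values match
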