Large Model Fine-Tuning (大模型微调)
What is the LoRA that Stable Diffusion enthusiasts keep talking about?
https://zhuanlan.zhihu.com/p/610031713
Paper reading: LoRA (Low-Rank Adaptation of Large Language Models)
https://zhuanlan.zhihu.com/p/611557340
Alpaca-LoRA: train your own ChatGPT on a consumer-grade GPU!
https://zhuanlan.zhihu.com/p/614913980
https://github.com/tloen/alpaca-lora
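
A minimal sketch of the low-rank adaptation idea behind the LoRA entries above, written in plain PyTorch rather than any particular library (class and attribute names are illustrative, not taken from the linked repos):

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base linear layer plus a trainable low-rank update: W x + (alpha/r) * B A x."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                     # the pretrained weight stays frozen
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no change at step 0
        self.scaling = alpha / r

    def forward(self, x):
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling

layer = LoRALinear(nn.Linear(1024, 1024))
print(sum(p.numel() for p in layer.parameters() if p.requires_grad))  # only the A/B factors are trainable

Only the low-rank factors receive gradients, which is why the trainable parameter count (and optimizer state) drops so sharply compared with full fine-tuning.
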
LoCon's improvements over LoRA
https://zhuanlan.zhihu.com/p/612133434
QLoRA: an efficient LLM fine-tuning method that can tune a 65B model with 48 GB of memory; the tuned Guanaco model reaches 99.3% of ChatGPT's performance
https://zhuanlan.zhihu.com/p/632229856
Hands-on test notes for QLoRA
https://zhuanlan.zhihu.com/p/632398047
Open-source Guanaco and the QLoRA technique behind it: cutting the GPU memory needed to fine-tune a 65B model from over 780 GB to under 48 GB, with results approaching GPT-4 (technical deep dive)
https://zhuanlan.zhihu.com/p/632236718
QLoRA: Efficient Finetuning of Quantized LLMs
https://arxiv.org/abs/2305.14314
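
A rough sketch of the QLoRA recipe described above (a 4-bit NF4-quantized base model with LoRA adapters on top), using the Hugging Face transformers/peft/bitsandbytes stack; the checkpoint name and target module list are placeholders, and exact arguments can vary across library versions:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store base weights in 4 bits
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,  # dequantize to bf16 for the matmuls
)

model = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-65b",                 # placeholder checkpoint
    quantization_config=bnb_config,
    device_map="auto",
)

lora_config = LoraConfig(
    r=64, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # illustrative module names
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()          # only the adapters train; the 4-bit base stays frozen
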
How to automatically identify high-quality instruction data in a dataset: using the IFD metric
https://zhuanlan.zhihu.com/p/658128530
https://arxiv.org/abs/2308.12032
https://github.com/MingLiiii/Cherry_LLM
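
The IFD (Instruction-Following Difficulty) score from the entries above is, at its core, a ratio of two losses on the same answer, with and without the instruction as context; a minimal sketch of that idea (the helper names are mine, not Cherry_LLM's API):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def answer_nll(model, tok, prompt, answer):
    """Mean negative log-likelihood of the answer tokens, conditioned on the prompt (possibly empty)."""
    enc = tok(prompt + answer, return_tensors="pt")
    n_prompt = len(tok(prompt)["input_ids"])
    labels = enc["input_ids"].clone()
    labels[:, :n_prompt] = -100                     # score only the answer tokens
    with torch.no_grad():
        return model(**enc, labels=labels).loss.item()

def ifd_score(model, tok, instruction, answer):
    # higher ratio -> the instruction helps less, i.e. the sample is "harder to follow"
    return answer_nll(model, tok, instruction, answer) / answer_nll(model, tok, "", answer)

tok = AutoTokenizer.from_pretrained("gpt2")         # small stand-in model for illustration
model = AutoModelForCausalLM.from_pretrained("gpt2")
print(ifd_score(model, tok, "Translate to French: cat\n", "chat"))

The linked article and repo use this kind of ratio to rank samples and keep a high-value subset for instruction tuning.
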
Firefly (流萤): a Chinese conversational large language model
https://www.shangyexinzhi.com/article/7399473.html
[OpenLLM 007] LLM alchemy: small parameter counts moving big models; a long-form, comprehensive guide to PEFT (parameter-efficient fine-tuning) techniques
https://zhuanlan.zhihu.com/p/625502729
A getting-started tutorial for DeepSpeed
https://zhuanlan.zhihu.com/p/630734624
An introduction to DeepSpeed
https://zhuanlan.zhihu.com/p/624412809
DeepSpeed's ZeRO series: taking GPU memory optimization all the way
https://basicv8vc.github.io/posts/zero/
https://www.deepspeed.ai/training/
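
A minimal sketch of what a ZeRO stage-3 DeepSpeed configuration can look like and how it is handed to deepspeed.initialize; the numbers, the optimizer section, and the CPU-offload choice are illustrative, and the full schema is in the official docs linked above:

import torch
import deepspeed

model = torch.nn.Sequential(                 # stand-in for a real transformer
    torch.nn.Linear(1024, 4096), torch.nn.GELU(), torch.nn.Linear(4096, 1024)
)

ds_config = {
    "train_micro_batch_size_per_gpu": 4,
    "gradient_accumulation_steps": 8,
    "bf16": {"enabled": True},
    "optimizer": {"type": "AdamW", "params": {"lr": 1e-5}},
    "zero_optimization": {
        "stage": 3,                               # shard optimizer state, gradients, and parameters
        "offload_optimizer": {"device": "cpu"},   # optional offload to save GPU memory
        "overlap_comm": True,
    },
}

# launched with the deepspeed runner, e.g.: deepspeed --num_gpus 8 train.py
engine, optimizer, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config
)
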
LLM study notes: the DeepSpeed-MoE paper
https://zhuanlan.zhihu.com/p/670968683
Notes on the paper "Universal Language Model Fine-tuning for Text Classification"
https://blog.csdn.net/weixin_44815943/article/details/123870564
Microsoft joins the open-source small-model game! Trained with OpenAI's ChatGPT and GPT-4, it crushes today's strongest open-source models
https://zhuanlan.zhihu.com/p/639212768
O-LoRA: a solution to catastrophic forgetting in LLMs
https://zhuanlan.zhihu.com/p/663034986
Chinese LLaMA & Alpaca LLMs: vocabulary expansion + pre-training + instruction fine-tuning
https://zhuanlan.zhihu.com/p/631360711
Instruction Tuning with Human Curriculum
https://arxiv.org/abs/2310.09518
EMO: Earth Mover Distance Optimization for Auto-Regressive Language Modeling
https://www.semanticscholar.org/paper/EMO%3A-Earth-Mover-Distance-Optimization-for-Language-Ren-Wu/36b88e6cf9cd5fa4809602f365287cb2201f8350
InstructionGPT-4: A 200-Instruction Paradigm for Fine-Tuning MiniGPT-4
https://arxiv.org/abs/2308.12067
OpenChat: Advancing Open-source Language Models with Mixed-Quality Data
https://arxiv.org/abs/2309.11235
Microsoft releases Orca 2, teaching small language models how to reason through careful coaching!
https://zhuanlan.zhihu.com/p/670516349
Orca 2: Teaching Small Language Models How to Reason
https://arxiv.org/abs/2311.11045
[林知/术] Notes on full-parameter fine-tuning of LLaMA-2-70B
https://zhuanlan.zhihu.com/p/666613055
FSDP(Fully Sharded Data Parallel)
https://blog.csdn.net/studyeboy/article/details/133888212
Fully Sharded Data Parallel: faster AI training with fewer GPUs
https://engineering.fb.com/2021/07/15/open-source/fsdp/
Facebook introduces the FSDP data-parallel training algorithm: training models an order of magnitude larger, more efficiently and with fewer GPUs
https://aif.amtbbs.org/index.php/2021/08/23/383/
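
A small sketch of PyTorch's open-source FSDP wrapper discussed above; it assumes the script is launched with torchrun so the rank/world-size environment variables exist, and the toy model stands in for a real transformer:

import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

dist.init_process_group("nccl")                   # reads rank/world size from the torchrun environment
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

model = torch.nn.Sequential(
    torch.nn.Linear(1024, 4096), torch.nn.GELU(), torch.nn.Linear(4096, 1024)
).cuda()

model = FSDP(model)                               # parameters, gradients, and optimizer state get sharded
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
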
Reading notes on the Megatron-LM paper
https://zhuanlan.zhihu.com/p/631030756
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
https://arxiv.org/abs/1909.08053
[Close reading of classics] A detailed analysis of the Megatron paper and code (1)
https://zhuanlan.zhihu.com/p/366906920
Megatron-LM: training multi-billion-parameter language models with model parallelism
https://zhuanlan.zhihu.com/p/644493033
A summary of the third Megatron-LM paper: Sequence Parallelism & Selective Checkpointing
https://zhuanlan.zhihu.com/p/522198082
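
The core trick in Megatron-style tensor parallelism from the papers above is splitting a weight matrix across devices and reassembling the partial results; a toy single-process illustration of the column split (no real inter-GPU communication here):

import torch

x = torch.randn(2, 8)                  # a batch of activations
w = torch.randn(8, 16)                 # the full weight matrix

# Column parallelism: each "device" holds half of the output columns.
w0, w1 = w[:, :8], w[:, 8:]
y_parallel = torch.cat([x @ w0, x @ w1], dim=-1)   # the all-gather of partial outputs

assert torch.allclose(x @ w, y_parallel, atol=1e-5)
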
Paper interpretation series, part 4: Google's GPipe for training extremely large neural networks
https://zhuanlan.zhihu.com/p/113233933
GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism
https://arxiv.org/abs/1811.06965
SE-MoE: a scalable distributed MoE training and inference framework (Baidu)
https://blog.csdn.net/cold_code486/article/details/133683242
SE-MoE: A Scalable and Efficient Mixture-of-Experts Distributed Training and Inference System
https://arxiv.org/abs/2205.10034
GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding (paper translation and close reading)
https://zhuanlan.zhihu.com/p/672837901
How to go beyond data parallelism and model parallelism: starting from GShard
https://mp.weixin.qq.com/s?__biz=MzU5ODY2MTk3Nw==&mid=2247486137&idx=1&sn=fa429fd4a94a6b815199c9a294276f59&chksm=fe41848fc9360d99b46b20b8bce3e36d7ee981d5e3c64860fe925a0af97de701fde284c1f3a0&scene=21#wechat_redirect
GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding
https://arxiv.org/abs/2006.16668
Textbooks Are All You Need (close reading and translation)
https://zhuanlan.zhihu.com/p/673040548
Textbooks Are All You Need II: phi-1.5 technical report (close reading and translation)
https://zhuanlan.zhihu.com/p/673021932
How should large models construct or select high-quality data during instruction tuning? (answer by 刘聪NLP on Zhihu)
https://www.zhihu.com/question/623570103/answer/3224726082
WizardLM: giving large language models the ability to follow complex instructions
https://zhuanlan.zhihu.com/p/643162614
An exclusive interview with the WizardLM team, detailing the RLEIF algorithm behind WizardCoder/Math surpassing GPT-4/ChatGPT
https://it.sohu.com/a/715204130_121119001
[LLM] Ziya2: Data-centric Learning is All LLMs Need
https://zhuanlan.zhihu.com/p/665614074
Ziya2: Data-centric Learning is All LLMs Need
https://arxiv.org/abs/2311.03301
How to automatically select high-quality instruction data in the era of large models
https://zhuanlan.zhihu.com/p/672468811
MoDS: Model-oriented Data Selection for Instruction Tuning
https://arxiv.org/abs/2311.15653
DEITA: a data-efficient selection method for LLM instruction tuning
https://zhuanlan.zhihu.com/p/675928711
How to automatically select high-quality instruction-tuning data to feed to large models?
https://zhuanlan.zhihu.com/p/671340624
GPT-4 not working well for you? Auburn University proposes a prompt taxonomy and takes a different route to a prompt-design guide
https://zhuanlan.zhihu.com/p/644545992
Instead of evaluating models, Princeton has started evaluating prompts and proposes a prompt-evaluation framework
https://zhuanlan.zhihu.com/p/644546392
InstructEval: Systematic Evaluation of Instruction Selection Methods
https://arxiv.org/abs/2307.00259
What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning
https://arxiv.org/abs/2312.15685
Learning to Edit: Aligning LLMs with Knowledge Editing
https://arxiv.org/abs/2402.11905
I Learn Better If You Speak My Language: Enhancing Large Language Model Fine-Tuning with Style-Aligned Response Adjustments
https://arxiv.org/abs/2402.11192
[Less is more] Paper walkthrough of LIMA: Less Is More for Alignment
https://zhuanlan.zhihu.com/p/641934152
LIMA: Less Is More for Alignment
https://arxiv.org/abs/2305.11206
Learning or Self-aligning? Rethinking Instruction Fine-tuning
https://arxiv.org/abs/2402.18243
SelectIT: Selective Instruction Tuning for Large Language Models via Uncertainty-Aware Self-Reflection
https://arxiv.org/abs/2402.16705
Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning
https://arxiv.org/abs/2402.10110
With different learning rates, can LoRA squeeze out a bit more gain?
https://kexue.fm/archives/10001
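
The kexue.fm post above explores giving the two LoRA factors different learning rates; a tiny sketch using PyTorch parameter groups, assuming the adapter parameters are named lora_A / lora_B (the 16x ratio is only an example):

import torch
import torch.nn as nn

class TinyLoRA(nn.Module):             # stand-in for a LoRA-adapted layer
    def __init__(self, d=128, r=8):
        super().__init__()
        self.lora_A = nn.Parameter(torch.randn(r, d) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(d, r))

model = TinyLoRA()
a_params = [p for n, p in model.named_parameters() if "lora_A" in n]
b_params = [p for n, p in model.named_parameters() if "lora_B" in n]

optimizer = torch.optim.AdamW([
    {"params": a_params, "lr": 1e-4},   # base learning rate for the A factor
    {"params": b_params, "lr": 1.6e-3}, # a larger learning rate for the B factor, as the post discusses
])
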
Microsoft releases the open-source model WizardLM-2 8x22B: currently the best-performing open-source model, outclassing many open- and closed-source models
https://zhuanlan.zhihu.com/p/692725484
https://huggingface.co/alpindale/WizardLM-2-8x22B
https://web.archive.org/web/20240415221214/https://wizardlm.github.io/WizardLM2/
A practical look at LLM SFT in production: 10 questions, 10 answers
https://zhuanlan.zhihu.com/p/692892489
How Abilities in Large Language Models are Affected by Supervised Fine-tuning Data Composition
https://arxiv.org/abs/2310.05492
Paper: How Abilities in Large Language Models are Affected by Supervised Fine-tuning Data Composition
https://zhuanlan.zhihu.com/p/682569860
A walkthrough of LLM SFT data-selection methods: IFD, Superfiltering, MoDS, CaR, Nuggets, and LESS
https://zhuanlan.zhihu.com/p/692647330
Fine-tuning LLMs with LoRA and QLoRA: insights from hundreds of experiments
https://zhuanlan.zhihu.com/p/679172768
The 12 pain points of RAG, with solutions explained by a senior NVIDIA architect
https://zhuanlan.zhihu.com/p/706873537
Understanding language-model perplexity in depth
https://zhuanlan.zhihu.com/p/686808564
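
Perplexity, as the post above explains, is just the exponential of the mean token-level cross-entropy; a short sketch with a small Hugging Face model (gpt2 is only a convenient example):

import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

enc = tok("The quick brown fox jumps over the lazy dog.", return_tensors="pt")
with torch.no_grad():
    loss = model(**enc, labels=enc["input_ids"]).loss   # mean negative log-likelihood per token (nats)
print(math.exp(loss.item()))                             # perplexity = exp(mean NLL)
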
Token-Efficient Leverage Learning in Large Language Models
https://zhuanlan.zhihu.com/p/702811649
Token-Efficient Leverage Learning in Large Language Models
https://arxiv.org/abs/2404.00914
One line of code boosts LLM performance by 10%; developers call it a free lunch
https://www.163.com/dy/article/IHLEO52V0511DSSR.html
NEFTune: adding noise to embeddings improves instruction tuning!
https://zhuanlan.zhihu.com/p/674860732
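
The NEFTune trick from the two entries above adds scaled uniform noise to the input embeddings during fine-tuning only; a minimal sketch of the noise itself (alpha = 5 is one of the settings reported in the paper):

import torch

def neftune_noise(embeds: torch.Tensor, alpha: float = 5.0) -> torch.Tensor:
    """Add uniform noise to token embeddings, scaled by alpha / sqrt(seq_len * hidden_dim)."""
    seq_len, hidden = embeds.shape[-2], embeds.shape[-1]
    scale = alpha / (seq_len * hidden) ** 0.5
    return embeds + torch.empty_like(embeds).uniform_(-scale, scale)

embeds = torch.randn(1, 16, 32)        # (batch, seq_len, hidden)
noisy = neftune_noise(embeds)          # applied only at training time, never at inference
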
InsTag: an open-ended tag set for quantifying the diversity and complexity of SFT datasets
https://zhuanlan.zhihu.com/p/707449818
SFT Packing explained in detail
https://zhuanlan.zhihu.com/p/707329908
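
A simplified sketch of what SFT packing does, per the post above: greedily concatenating tokenized samples into fixed-length sequences so short examples stop wasting padding (real implementations also adjust attention masks and position ids so packed samples cannot attend to each other):

def pack_examples(token_lists, max_len=4096, eos_id=2):
    """Greedily concatenate tokenized samples, separated by EOS, into sequences of at most max_len."""
    packed, current = [], []
    for toks in token_lists:
        if current and len(current) + len(toks) + 1 > max_len:
            packed.append(current)
            current = []
        current = current + toks + [eos_id]
    if current:
        packed.append(current)
    return packed

print(pack_examples([[5, 6, 7], [8, 9], [10, 11, 12, 13]], max_len=6))
# -> [[5, 6, 7, 2], [8, 9, 2], [10, 11, 12, 13, 2]]
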
Data synthesis series I: DeepMind's synthetic-data survey "Best Practices and Lessons Learned on Synthetic Data for Language Models"
https://zhuanlan.zhihu.com/p/692294597
Data synthesis series II: Qwen2's SFT/post-training data synthesis methods
https://zhuanlan.zhihu.com/p/709981346
Data synthesis series III: Llama 3.1's post-training data synthesis methods
https://zhuanlan.zhihu.com/p/712416960
A brief discussion of training domain-specific models
https://zhuanlan.zhihu.com/p/711537210
Analyzing synthetic-data flaws and mitigation strategies: optimizing LLM training based on synthetic data
https://zhuanlan.zhihu.com/p/705784458
Good samples, twice the result for half the effort: using Sample Design Engineering to build better downstream fine-tuning samples for LLMs
https://zhuanlan.zhihu.com/p/693933168
Sample Design Engineering: An Empirical Study of What Makes Good Downstream Fine-Tuning Samples for LLMs
https://arxiv.org/abs/2404.13033
Exploring the interplay between LLM pre-training and fine-tuning
https://zhuanlan.zhihu.com/p/714896257
Amuro & Char: Analyzing the Relationship between Pre-Training and Fine-Tuning of Large Language Models
https://arxiv.org/abs/2408.06663
A small survey of methods for using LLMs to synthesize LLM training data
https://zhuanlan.zhihu.com/p/685432166
LLM math performance soars 168% in a work by a 14-person Microsoft team: the secret of synthetic data 2.0 revealed, with agents generating the teaching data
https://zhuanlan.zhihu.com/p/715334077
AgentInstruct: Toward Generative Teaching with Agentic Flows
https://arxiv.org/abs/2407.03502
A summary of LLM fine-tuning papers and methods
https://zhuanlan.zhihu.com/p/669645171
LLM fine-tuning (part 8): summary notes on SFT for Alignment
https://zhuanlan.zhihu.com/p/717553974
[NLP] [Large Models] SPIN: Self-Play fine-tuning converts weak models into strong models
https://zhuanlan.zhihu.com/p/683872342
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models
https://arxiv.org/abs/2401.01335
https://github.com/uclaml/SPIN/tree/main
The cross-entropy loss function
https://zhuanlan.zhihu.com/p/582071348
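
A tiny worked example of the cross-entropy loss from the last entry above: it is the negative log-probability that the softmax assigns to the true class, here computed both by hand and with the built-in:

import torch
import torch.nn.functional as F

logits = torch.tensor([[2.0, 0.5, -1.0]])            # unnormalized scores for 3 classes
target = torch.tensor([0])                            # index of the true class

manual = -F.log_softmax(logits, dim=-1)[0, target[0]] # -log p(true class)
builtin = F.cross_entropy(logits, target)
print(manual.item(), builtin.item())                  # the two values match
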