Upstream branch main (revision 5f38bcab) #10

apolo-developer · 2025-02-08T12:03:35Z

Integrating latest changes from hiyouga/LLaMA-Factory branch main

5f38bca [deps] upgrade vllm (hiyouga#6857)
40048ab fix qwen2vl plugin (hiyouga#6855)
74ade3a [misc] allow extra args (hiyouga#6831)
6dad536 [assets] update wechat (hiyouga#6830)
24c7842 [model] support audio (hiyouga#6701)
a5e943f [data] allow thought in function call (hiyouga#6797)
e2dc5b9 [misc] update license year & fix llama pro (hiyouga#6814)
dd6b7d2 [data] fix qwen tool template (hiyouga#6796)
ab9bd06 [data] fix minicpmv plugin (hiyouga#6801)
069a477 [assets] update wechat (hiyouga#6810)
a417bcf [readme] update flash attention installation instruction on win platform (hiyouga#6788)
b5fda21 [misc] update workflows (hiyouga#6787)
94803d8 [model] add mistral small models (hiyouga#6786)
999c7c8 [model] add qwen2.5 vl models (hiyouga#6779)
15357cd [breaking] support transformers 4.48 (hiyouga#6628)
45e68b9 [webui] improve webui & reasoning mode (hiyouga#6778)
4fb6059 [assets] update wechat (hiyouga#6771)
28417f8 [model] add deepseek-R1 & show think process (hiyouga#6767)
0f45982 fix: avoid redundant normalization in DPO's SFT loss calculation (hiyouga#6722)
de9bc3f [webui] support ja (hiyouga#6698)
3962645 [assets] update wechat (hiyouga#6710)
1f47b61 [model] support yarn (hiyouga#6693)
17b4706 [assets] update wechat (hiyouga#6692)
c0caa7a [misc] update mm plugin (hiyouga#6691)
77bbf65 disable valset by default (hiyouga#6690)
4d0f662 [webui] upgrade to gradio 5 (hiyouga#6688)
7bf09ab fix qwen2 moe (hiyouga#6684)
0279427 [data] Fix minicpmv/o dpo training (hiyouga#6657)
76675b6 Update val_size english description (hiyouga#6653)
563be22 update readme (hiyouga#6648)
7a04021 [optim] clean apollo (hiyouga#6645)
d9189f9 [optim] add support to APOLLO (hiyouga#6617)
9b7ba09 update readme of MiniCPM-o (hiyouga#6642)
1278c3e lint (hiyouga#6641)
deacc00 Support InternLM3 Dense 8B Model (hiyouga#6640)
58d029f Fix tokenizer max length (hiyouga#6632)
158a127 Support Inference of MiniCPM-V-2.6 and MiniCPM-o-2.6 (hiyouga#6631)
98189c8 [model] fix mllama any image (hiyouga#6637)
1c7663d pin vllm version to 0.6.5 (hiyouga#6629)
c3fda50 Support new features of MiniCPM-V (hiyouga#6626)
e3e2c8c [inference] fix stop token for object detection (hiyouga#6624)
03de5ac add nf4 qlora support on Ascend NPU (hiyouga#6601)
3077f20 Fix template name of MiniCPM-V (hiyouga#6620)
6eec50c Merge pull request hiyouga#6598 from BUAADreamer/minicpmv
a019cec remove tests
c2fa4cc fix tests
0cc7260 fix style
cfaa8e4 fix system prompt and tests
01e9cfd add some
1007331 add cpm_o test
c506f76 add cpm_o test
7b44f31 fix format
a650e11 add some
291384d adapt to new mllm_param
ed0895a Merge branch 'main' into minicpmv
382e932 Merge pull request hiyouga#6600 from hiyouga/hiyouga/refactor_mllm_param
f6f630a refactor mllm param logic
e45329e add minicpmv2.6
771cc80 add some
ae1f528 add some
15bbcdf fix some
d090320 fix version
2ee8ba2 fix some
84026be tiny fix
fc045d7 Merge branch 'main' into minicpmv
096a6cb add some
b308ddf Merge pull request hiyouga#6597 from hiyouga/hiyouga/upd_wechat
70ed03b update wechat
5ffd8ad Merge pull request hiyouga#6588 from hiyouga/hiyouga/upd_issue_temp
aa8d0a2 update issue template
8b209cb Merge pull request hiyouga#6585 from hiyouga/hiyouga/add_phi4
ae16ea7 improve template, add phi4 model
6b34b69 Merge pull request hiyouga#6564 from stephen-nju/fix_ray
1843152 Merge pull request hiyouga#6565 from hiyouga/hiyouga/improve_log
9c4c848 fix get ray args when args not a dict
47e17dd imporve log
d23a988 Merge pull request hiyouga#6542 from erictang000/et/ray-integration
c46675d fix llamaboard with ray
d8cac6f refactor ray integration, support save ckpt
1e8e7be run style check
163ddb6 drafting ray integration
c973f32 Merge pull request hiyouga#6547 from hiyouga/hiyouga/fix_pixtral_dpo
870f23d fix hiyouga#6546
785cc70 add some
b832ed9 Merge pull request hiyouga#6528 from hiyouga/hiyouga/upd_wechat
cd14336 update wechat
ab87bd6 Merge branch 'hiyouga:main' into minicpmv
79c2d70 add some
e6d603a Merge pull request hiyouga#6524 from hiyouga/hiyouga/upd_scripts
dd44c65 update scripts
51ef90c Merge pull request hiyouga#6515 from hiyouga/hiyouga/misc
4b8add7 update model name
a766cad Merge pull request hiyouga#6514 from hiyouga/hiyouga/add_project
29ddc6b Merge pull request hiyouga#6513 from hiyouga/hiyouga/add_gpt2
b3e1137 add project
67442bd add gpt2 model
72d86ec Merge pull request hiyouga#6512 from hiyouga/hiyouga/fix_gen_logic
8741e5b Merge pull request hiyouga#6462 from shibingli/main
1800f8c fix hiyouga#6499
f8e80d5 Merge pull request hiyouga#6493 from hiyouga/hiyouga/upd_wechat
a400d89 update wechat
2382a5f Merge pull request hiyouga#6492 from hiyouga/hiyouga/add_deepseek3
e67b9dc add deepseek3 model
91467ed Merge pull request hiyouga#5507 from piamo/main
40805b0 Merge pull request hiyouga#6483 from hiyouga/hiyouga/fix_paligemma_infer
6f5bb3b fix hiyouga#6482
b558902 Merge pull request hiyouga#6465 from hiyouga/hiyouga/fix_eval_loss
2719867 fix hiyouga#6448
f1d7678 Add ARG HTTP_PROXY in Dockerfile to support HTTP proxy during image building.
a3a49b1 Add ARG HTTP_PROXY in Dockerfile to support HTTP proxy during image building.This commit introduces an ARG parameter named HTTP_PROXY in the Dockerfile. This addition allows for the configuration of an HTTP proxy, facilitating image building in environments with network restrictions.
f68074d Merge pull request hiyouga#6457 from youkaichao/module-run
c39d81c Update cli.py
cd56f88 Merge pull request hiyouga#6443 from hiyouga/hiyouga/add_qvq
ee0e400 add qvq hiyouga#6439
cbd494d Merge pull request hiyouga#6430 from hiyouga/hiyouga/upd_wechat
83202c9 update wechat
b9f73fc Merge pull request hiyouga#6426 from hiyouga/hiyouga/update_readme
8fd38d2 update readme
c23a4d0 Merge pull request hiyouga#5922 from Tuyohai/main
d58746e Merge pull request hiyouga#6418 from hiyouga/hiyouga/add_report
5111cac support report custom args
84cd118 fix paligemma infer
a2ad073 Merge pull request hiyouga#6416 from Zeyi-Lin/main
744ef8c docs: use swanlab
947e22a Merge pull request hiyouga#6401 from Zeyi-Lin/hiyouga/swanlab
82e5d75 fix: project blank
3a7ea20 fix: by hiyouga suggestion
5f6dafd feat: ui improve
0a52962 fix: text
d0eb64d fix: bugs
c6e3c14 Merge pull request hiyouga#6395 from hiyouga/hiyouga/fix_genkwargs
7eb49e5 docs: config framework
3306919 fix: string
d4c1fda fix hiyouga#6391
8c2df41 feat: optimize frontend
d5cf879 feat: swanlab params
ffbb4db Merge pull request hiyouga#6388 from hiyouga/hiyouga/shuffle_control
c7cedc7 support disable shuffling
96f8f10 add swanlab
6ccd64e Merge pull request hiyouga#6384 from hiyouga/hiyouga/fix_webui
369cca8 fix webui
933647e Merge pull request hiyouga#6379 from hiyouga/hiyouga/add_paligemma2
d350905 add paligemma2
015f213 Merge pull request hiyouga#6313 from ge-xing/main
af33627 Merge pull request hiyouga#6369 from hiyouga/hiyouga/template
9879585 support qwen tool format
bcc413c change default replace jinja to false
2fad379 Merge pull request hiyouga#5473 from AlongWY/mistral
115924a Support Mistral format tools
8974a0a Merge pull request hiyouga#6368 from hiyouga/hiyouga/fix_llama_template
df5655f fix llama3 tool template
e12c80a Merge pull request hiyouga#6367 from hiyouga/hiyouga/add_model
b24ae55 support llama3 tool prompt
2a832e4 Merge pull request hiyouga#5819 from yafshar/remote_code
1c8ad22 Add missing key to init_kwargs
0943776 Add trust_remote_code parameter and remove True
04f19ed support telechat2 model
a665ad6 Merge pull request hiyouga#6364 from hiyouga/hiyouga/control_reenterent_gc
f319da6 support non-reenterent-gc & fix hiyouga#6358
6973828 Merge pull request hiyouga#6363 from hiyouga/hiyouga/control_skip_eos
eda76de support control eos, fix hiyouga#6345
9708a39 Merge pull request hiyouga#6362 from hiyouga/hiyouga/mllm_packing
2d107d3 generalized packing & fix hiyouga#6343
81815f0 Merge pull request hiyouga#6359 from hiyouga/hiyouga/fix_qwen2vl_infer
142191e fix hiyouga#6348
e2fbd07 Merge pull request hiyouga#6334 from hiyouga/hiyouga/add_examples
7059055 update assets
2811814 fix mrope
bcb4fb3 Merge pull request hiyouga#6253 from hiyouga/hiyouga/qwen2vl_mm_proj
99c6266 support qwen2vl train proj only
561a8e5 Merge pull request hiyouga#6251 from hiyouga/hiyouga/vllm_qwen2vl_infer
207f8b0 support qwen2vl vllm infer
967a6c1 Merge pull request hiyouga#6246 from hiyouga/hiyouga/update_examples
e5584dc update examples
c42890b Merge pull request hiyouga#6242 from hiyouga/hiyouga/fix_script
eb3e147 fix scripts
cf29846 Merge pull request hiyouga#6160 from village-way/pr_dataloader
6a5074e lint
8328bd8 Merge pull request hiyouga#6238 from hiyouga/hiyouga/vllm_batchinfer
1324d15 support batch infer in vllm
dc78355 Merge pull request hiyouga#6190 from JieShenAI/main
263cb82 Merge pull request hiyouga#6170 from hykilpikonna/main
1874022 Merge pull request hiyouga#6233 from hiyouga/hiyouga/vlm_zero3
dbb9e5b fix vlm zero3 training
7965e98 Merge pull request hiyouga#6224 from hiyouga/hiyouga-patch-1
722a396 update wechat
4c61368 add async call api
961e8c2 add vllm_infer script
6554cde [U] Compute hostname differently
f2b2a37 Merge pull request hiyouga#6175 from hiyouga/hiyouga/add_qwq
68a6121 add qwq
dfb953b [+] Show the hostname
4424d4d fix:tokenized_path not None and load_from_disk return Dataset Trigger stuck
86f4151 Merge pull request hiyouga#6156 from hiyouga/hiyouga/add_o1
046b6fb fix dataset
ec9ff8c add skywork o1
b7c7f30 Merge remote-tracking branch 'origin/main' into hiyouga/add_o1
14d0d92 Merge pull request hiyouga#6157 from hiyouga/hiyouga/fix_ci
b7d4cf2 pin tokenizers version
17afb7d add marco-o1 and openo1 dataset
b26c490 Merge pull request hiyouga#6152 from hiyouga/hiyouga/add_num_proc_in_data_load
88f087c Merge pull request hiyouga#6151 from hiyouga/hiyouga/fix_mllama
362d579 fix hiyouga#6149
598c22e fix mllama cross_mask
6eefb4d support granite3 models
ee059c3 Add deepseek-v2.5 template

huangpan.foo and others added 30 commits September 21, 2024 19:33

Add deepseek-v2.5 template

ee059c3

support granite3 models

6eefb4d

fix mllama cross_mask

598c22e

fix hiyouga#6149

362d579

Merge pull request hiyouga#6151 from hiyouga/hiyouga/fix_mllama

88f087c

[model] fix mllama cross mask

Merge pull request hiyouga#6152 from hiyouga/hiyouga/add_num_proc_in_…

b26c490

…data_load [data] add num_proc in load_dataset

add marco-o1 and openo1 dataset

17afb7d

pin tokenizers version

b7d4cf2

Merge pull request hiyouga#6157 from hiyouga/hiyouga/fix_ci

14d0d92

[ci] pin tokenizers version

Merge remote-tracking branch 'origin/main' into hiyouga/add_o1

b7c7f30

add skywork o1

ec9ff8c

fix dataset

046b6fb

Merge pull request hiyouga#6156 from hiyouga/hiyouga/add_o1

86f4151

[data&model] add marco-o1, skywork-o1 and openo1

fix:tokenized_path not None and load_from_disk return Dataset Trigger…

4424d4d

… stuck

[+] Show the hostname

dfb953b

add qwq

68a6121

Merge pull request hiyouga#6175 from hiyouga/hiyouga/add_qwq

f2b2a37

[model] add QwQ

[U] Compute hostname differently

6554cde

add vllm_infer script

961e8c2

add async call api

4c61368

update wechat

722a396

Merge pull request hiyouga#6224 from hiyouga/hiyouga-patch-1

7965e98

[assets] chore: update wechat

fix vlm zero3 training

dbb9e5b

Merge pull request hiyouga#6233 from hiyouga/hiyouga/vlm_zero3

1874022

[data] fix vlm zero3 training

Merge pull request hiyouga#6170 from hykilpikonna/main

263cb82

[+] Show the hostname in webui title

Merge pull request hiyouga#6190 from JieShenAI/main

dc78355

add vllm_infer script

support batch infer in vllm

1324d15

Merge pull request hiyouga#6238 from hiyouga/hiyouga/vllm_batchinfer

8328bd8

[infer] feat: support batch infer in vllm

lint

6a5074e

Merge pull request hiyouga#6160 from village-way/pr_dataloader

cf29846

fix:tokenized_path not None and load_from_disk return Dataset Trigger…

hiyouga and others added 30 commits January 15, 2025 11:06

update readme (hiyouga#6648)

563be22

Update val_size english description (hiyouga#6653)

76675b6

* Update `val_size` Description in locales.py * Update `val_size` Description in data_args.py * Remove extra space in data_args.py

[data] Fix minicpmv/o dpo training (hiyouga#6657)

0279427

* fix template name * tiny fix * support minicpm-o-2.6 * support inference of minicpmv * update readme * support dpo of minicpmv

fix qwen2 moe (hiyouga#6684)

7bf09ab

[webui] upgrade to gradio 5 (hiyouga#6688)

4d0f662

disable valset by default (hiyouga#6690)

77bbf65

[misc] update mm plugin (hiyouga#6691)

c0caa7a

[assets] update wechat (hiyouga#6692)

17b4706

[model] support yarn (hiyouga#6693)

1f47b61

[assets] update wechat (hiyouga#6710)

3962645

[webui] support ja (hiyouga#6698)

de9bc3f

* add support for japanese language * add support for japanese language --------- Co-authored-by: engchina <[email protected]>

fix: avoid redundant normalization in DPO's SFT loss calculation (hiy…

0f45982

…ouga#6722)

[model] add deepseek-R1 & show think process (hiyouga#6767)

28417f8

[assets] update wechat (hiyouga#6771)

4fb6059

[webui] improve webui & reasoning mode (hiyouga#6778)

45e68b9

[breaking] support transformers 4.48 (hiyouga#6628)

15357cd

[model] add qwen2.5 vl models (hiyouga#6779)

999c7c8

[model] add mistral small models (hiyouga#6786)

94803d8

[misc] update workflows (hiyouga#6787)

b5fda21

[readme] update flash attention installation instruction on win platf…

a417bcf

…orm (hiyouga#6788) * Update README_zh.md * Update README.md

[assets] update wechat (hiyouga#6810)

069a477

[data] fix minicpmv plugin (hiyouga#6801)

ab9bd06

* fix template name * tiny fix * support minicpm-o-2.6 * support inference of minicpmv * update readme * support dpo of minicpmv * update init audio * update init audio * [model]fix image process in minicpmo

[data] fix qwen tool template (hiyouga#6796)

dd6b7d2

* Update tool_utils.py * fix unittest --------- Co-authored-by: hoshi-hiyouga <[email protected]>

[misc] update license year & fix llama pro (hiyouga#6814)

e2dc5b9

* fix llamapro script * change year

[data] allow thought in function call (hiyouga#6797)

a5e943f

* Update template.py * Update template.py * use formatter * fix regex --------- Co-authored-by: hiyouga <[email protected]>

[model] support audio (hiyouga#6701)

24c7842

* support qwen2_audio * improve code * lint * fix * fix * fix --------- Co-authored-by: hiyouga <[email protected]>

[assets] update wechat (hiyouga#6830)

6dad536

[misc] allow extra args (hiyouga#6831)

74ade3a

fix qwen2vl plugin (hiyouga#6855)

40048ab

[deps] upgrade vllm (hiyouga#6857)

5f38bca
