Pull requests: huggingface/trl

Pull requests list

Adding support for different losses which are now supported by Liger
#3815 opened Jul 31, 2025 by Manan17
1 of 5 tasks
fix CI docs and grpo slow test
#3814 opened Jul 30, 2025 by kashif
Add GSPO script examples (VLM/LLM)
#3810 opened Jul 30, 2025 by sergiopaniego
5 tasks
Rloo final
#3801 opened Jul 29, 2025 by shirinyamani
5 tasks
add xpu support for mergekit
#3800 opened Jul 29, 2025 by yao-matrix
GSPO parameters update from v2
#3798 opened Jul 29, 2025 by BounharAbdelaziz
3 tasks
Add dataset mixer
#3791 opened Jul 28, 2025 by lewtun
1 of 5 tasks
[GRPO] update transformer version for CB
#3786 opened Jul 28, 2025 by kashif
Add vLLM server mode support to OnlineDPOTrainer
#3783 opened Jul 27, 2025 by vaelev
6 tasks done
Add AlphaPO Trainer
#3776 opened Jul 26, 2025 by qingquansong
3 of 5 tasks
Dynamic sampling option in GRPO trainer based on DAPO paper
#3758 opened Jul 23, 2025 by almeidava93
2 of 5 tasks
Support dLLM in GRPO reference model creation
#3743 opened Jul 18, 2025 by xijia-tao
Add basic support for FSDP/Lora when using TRL/VLLM
#3735 opened Jul 14, 2025 by ojh31
5 tasks
[WIP] Fix ppo example accelerator initialization error
#3732 opened Jul 14, 2025 by ccs96307 Draft
2 of 5 tasks
[SFT] Dry up the sft tests
#3657 opened Jun 27, 2025 by kashif
5 tasks
feat: Initial implementation of RePO trainer and components
#3655 opened Jun 26, 2025 by celsowm
5 tasks
Ensure Chat Template Safe Prompt Truncation
#3646 opened Jun 25, 2025 by pramodith
4 of 5 tasks
[WIP] vllm-server-spec-dec-support
#3643 opened Jun 24, 2025 by shirinyamani
5 tasks
GRPO: Pack Responses within the same group.
#3642 opened Jun 24, 2025 by pramodith Draft
4 of 5 tasks
Add Entropy Control to GRPOTrainer
#3628 opened Jun 22, 2025 by 1485840691
Feature: Add SGLang support for GRPO Trainer
#3627 opened Jun 21, 2025 by PrinsYin Draft
5 tasks