Pull requests: huggingface/trl

Pull requests list

Add Entropy Adaptive Fine Tuning to SFT Trainer
#4802 opened Jan 10, 2026 by electroglyph
fix(grpo-trainer): init self.args before use
#4801 opened Jan 9, 2026 by carlyou
1 of 5 tasks
Remove DbrxForCausalLM support
#4799 opened Jan 9, 2026 by qgallouedec
Update examples to new OpenEnv version
#4796 opened Jan 9, 2026 by sergiopaniego Draft
5 tasks
forward_masked_logits in SFTTrainer
#4794 opened Jan 8, 2026 by qgallouedec Draft
5 tasks
fix xpu vllm client server
#4780 opened Jan 7, 2026 by jiqing-feng
Set dtype default to float32
#4778 opened Jan 6, 2026 by albertvillanova
Add reward shaping to PPOTrainer
#4774 opened Jan 5, 2026 by derivative2002
5 tasks
make dpo compatible with qwen3vl
#4773 opened Jan 4, 2026 by flutist
Add a config to limit the number of tool calling iterations.
#4761 opened Dec 29, 2025 by pramodith
4 of 5 tasks
Extend CLI to orpo trainer
#4757 opened Dec 27, 2025 by murilo-cunha
3 of 5 tasks
fix: handle None eval_dataset in example code
#4756 opened Dec 27, 2025 by ciaoyizhen
1 of 4 tasks
perf: avoid output_hidden_states when only last_hidden_state is used
#4755 opened Dec 27, 2025 by ciaoyizhen
2 of 5 tasks
vllm parameter passthrough for stop sequences
#4754 opened Dec 26, 2025 by kdubovikov
Clarify Accelerate usage in SFTTrainer documentation
#4744 opened Dec 23, 2025 by Likhita-17
1 task done
fix minillm trainer
#4743 opened Dec 23, 2025 by t1101675
5 tasks
[GRPOTrainer]: Agent Training Supports Async Tool Calls
#4742 opened Dec 23, 2025 by pramodith
5 tasks done