feat: enable GRPO training with logprobs from offline trajectory data #812

Annotations

1 error

quality-checks: failed Dec 1, 2025 in 1m 41s