feat: enable GRPO training with logprobs from offline trajectory data #829

Annotations

1 error

quality-checks

failed Dec 5, 2025 in 1m 44s