feat: enable GRPO training with logprobs from offline trajectory data #830

Annotations

1 error

quality-checks

failed Dec 6, 2025 in 1m 47s