# Summary

Adding this parent issue tracking all the outstanding issues with the tinker API server in SkyRL

## SkyRL-Train Backend

- [ ] Support PPO loss with Tinker
  - Support critic model in `SkyRLTrainBackend` https://github.com/NovaSky-AI/SkyRL/pull/1389
- [ ] KL Penalty is not supported
  - Support ref model in `SkyRLTrainBackend`
- [ ] Support logprobs with the `sample()` API.
  - Provide an example for using truncated importance sampling with the tinker API server.
- [ ] Better validation for configuration parameters to the `SkyRLTrainBackend` https://github.com/NovaSky-AI/SkyRL/issues/1279
- [ ] Set LR outside of workers in skyrl-train backend: https://github.com/NovaSky-AI/SkyRL/issues/998
- [ ] Add `sample` API to the new inference codepath: https://github.com/NovaSky-AI/SkyRL/issues/1286
- [ ] Add `renderer` to the new inference codepath: https://github.com/NovaSky-AI/SkyRL/issues/1288

## Jax Backend

- [ ] Improve the memory handling and reporting to make it easier to see where the memory goes
- [ ] Add multimodal support to APIs and add support for a VLM model
- [ ] Support using Ray to manage tinker API server, engine and Jax workers #1393