Hacker News
Scaling pretraining affects RL sample efficiency (runrl.com)
1 point by ag8 48 days ago | past
Training Qwen to answer briefly yet intelligently using feedback control (runrl.com)
4 points by ag8 80 days ago | past
Launch HN: RunRL (YC X25) – Reinforcement learning as a service (runrl.com)
71 points by ag8 85 days ago | past | 22 comments
Generating the Funniest Joke with RL (runrl.com)
1 point by ag8 7 months ago | past
Why Run RL? How specialized models can outperform the biggest LLMs (runrl.com)
4 points by -_- 7 months ago | past
