
OpenAI also has Flex processing[1] for o3. I've spent most of my time with Gemini 2.5, but lately I've been trying o3 a ton, since it seems to work quite well and I get really cheap tokens: ~95% of my agentic tokens are cached (a 75% discount), and flex mode adds another 50% off, bringing it to about $0.25 per million input tokens.
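The discount stacking above works out as follows. A minimal sketch of the arithmetic, assuming a $2.00/M base input price for o3 (check the current pricing page):

```python
# Sketch of the pricing math in the comment above.
# BASE_INPUT is an assumption; verify against OpenAI's pricing page.
BASE_INPUT = 2.00            # $/M tokens, standard o3 input (assumed)
cached = BASE_INPUT * 0.25   # cached input tokens: 75% discount
flex = cached * 0.5          # flex processing halves the price again
print(flex)                  # → 0.25 ($ per million input tokens)
```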

[1] https://platform.openai.com/docs/guides/flex-processing?api-...



Which agents support flex mode?


I've made my own fork of Codex that always uses flex. Alternatively, you can route agents through LiteLLM and have it add the service_tier parameter. I haven't seen native support for it anywhere.
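For the LiteLLM route, something like the following proxy config could work. This is a hypothetical sketch: the model alias `o3-flex` is made up, and it assumes LiteLLM forwards `service_tier` through to the OpenAI API (check the current LiteLLM docs before relying on it):

```yaml
# Hypothetical litellm proxy config (config.yaml).
# Assumes extra litellm_params are passed through to OpenAI.
model_list:
  - model_name: o3-flex            # alias your agent would call (made up)
    litellm_params:
      model: openai/o3
      service_tier: flex           # request flex processing on every call
      api_key: os.environ/OPENAI_API_KEY
```

Pointing the agent at the proxy with `o3-flex` as the model name would then apply flex to every request without the agent itself knowing about the parameter.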



