Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I am still very early, but output quality wise, yes, there does not seem to be any noticeable improvement in my limited personal testing suite. What I have noticed though is subjectively better adherence to instructions and documentation provided outside the main prompt, though I have no way to quantify or reliably test that yet. So beyond reliably finding Needles-in-the-Haystack (which Frontier models have done well on lately), Opus 4.1 seems to do better in following those needles even if not explicitly guided to compared to Opus 4.


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: