
That’s just showing the tests are measuring specific things that LLMs can game particularly well.

Computers have been able to smash high school algebra tests since the 1970s, but that doesn’t make them as smart as a 16-year-old (or even a three-year-old).


