
> for instance "what is 2+2" or some numerical puzzles that needed algebraic thinking

there is only one algebraic approach to solving something like 2+2, and that is counting: 2+2 = (((0 + 1) + 1) + 1) + 1. but llms are infamously bad at counting, which is why 2+2 isn't an algebraic problem to an llm. it's pattern matching, or linguistic reasoning token by token.
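
the counting view above is just Peano-style successor arithmetic. a minimal sketch (the names `succ` and `add` are mine, not from the thread):

```python
def succ(n):
    # Peano successor: the only primitive operation
    return n + 1

def add(a, b):
    # addition defined as "apply succ to a, b times"
    result = a
    for _ in range(b):
        result = succ(result)
    return result

# 2 + 2 unrolls to (((0+1)+1)+1)+1 starting from 2 = (0+1)+1
print(add(2, 2))  # → 4
```

this is exactly the "counting" an llm would have to carry out step by step rather than pattern-match.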



LLMs are bad at counting because nobody counts in text; we count in our heads, and that isn't in the training material.


This result --

https://x.com/yuntiandeng/status/1889704768135905332

Is this a consequence of the fact that "multiplication tables" for kindergarteners are abundantly available online (in the training data), typically up to the 12 or 13 times table, as plain text?


i don't think it's just about the training material. it's also about keeping track of the precise number of tokens. you'd have to have a dedicated token for 1+1+1+1 and another one for 1+1+1+1+1, etc.
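
to illustrate the token-tracking problem: under a greedy subword tokenizer, strings of different lengths collapse into different token sequences, so recovering the exact count means summing over tokens rather than reading it off. a toy sketch (the vocabulary here is invented for illustration, not a real tokenizer's):

```python
# hypothetical merged vocabulary, longest-match-first
VOCAB = ["1+1+1+1", "1+1", "1"]

def tokenize(s):
    tokens = []
    while s:
        for tok in VOCAB:
            if s.startswith(tok):
                tokens.append(tok)
                s = s[len(tok):]
                break
        else:
            s = s[1:]  # skip separator chars like "+" between merges
    return tokens

def count_ones(tokens):
    # the model would have to do this bookkeeping implicitly
    return sum(tok.count("1") for tok in tokens)

print(tokenize("1+1+1+1+1"))  # → ['1+1+1+1', '1']
print(count_ones(tokenize("1+1+1+1+1")))  # → 5
```

the surface form "five ones" never appears as a single unit; the count is smeared across token boundaries.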


The internal representation is multidimensional vectors. With a typical 4096-dimensional embedding at q4, one could name every particle in the universe and still have over 4000 dimensions left over for other purposes.
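
the arithmetic behind that claim can be checked: at 4 bits per dimension, each dimension distinguishes 16 values, so indexing ~10^80 particles (a commonly cited estimate) needs only log base 16 of 10^80 dimensions. a quick sketch, with the constants as stated assumptions:

```python
import math

PARTICLES = 1e80      # commonly cited order-of-magnitude estimate
BITS_PER_DIM = 4      # "q4" quantization
TOTAL_DIMS = 4096

states_per_dim = 2 ** BITS_PER_DIM          # 16
dims_needed = math.ceil(math.log(PARTICLES, states_per_dim))
remaining = TOTAL_DIMS - dims_needed

print(dims_needed)  # → 67
print(remaining)    # → 4029
```

so roughly 67 dimensions suffice as an index, leaving over 4000 unused, consistent with the comment.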


i don't think that is a valid argument.



