LLMs are bad at counting because nobody counts in text: we count in our heads, so the intermediate steps never appear in the training material.
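(A minimal illustration of that point in Python: the tally lives in a silent loop variable, and it only shows up "in text" if you deliberately print each step, chain-of-thought style, which ordinary prose almost never does.)

    # Counting "in our heads": the count lives in a variable, not in the text.
    # Printing the running tally is the part that is missing from prose.
    word = "strawberry"
    count = 0
    for i, ch in enumerate(word):
        if ch == "r":
            count += 1
            print(f"position {i}: 'r' -> running count {count}")
    print(f"total r's: {count}")  # 3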


This result:

https://x.com/yuntiandeng/status/1889704768135905332

Is it a consequence of the fact that multiplication tables for kindergarteners are abundantly available online (i.e., in the training data) as plain text, typically up to the 12 or 13 times table?
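(Some illustrative arithmetic on that hypothesis, as a Python sketch: the facts covered by published times tables are a vanishing fraction of the k-digit multiplication space, which would explain accuracy collapsing beyond small operands.)

    # Times tables in training data cover at most 12 x 12 = 144 facts;
    # the space of k-digit multiplications grows far too fast to memorize.
    for k in range(1, 6):
        lo, hi = 10 ** (k - 1), 10 ** k - 1
        pairs = (hi - lo + 1) ** 2
        print(f"{k}-digit x {k}-digit: {pairs:,} ordered operand pairs")
    print("vs. a 12 x 12 times table: 144 facts")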


I don't think it's just about the training material. It's also about keeping track of the precise number of tokens: you'd have to have a dedicated token for 1+1+1+1 and another one for 1+1+1+1+1, and so on.
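(A quick way to inspect this, assuming OpenAI's tiktoken library and its cl100k_base encoding: printing the actual token pieces shows what the model has to count to tell the two expressions apart.)

    # Show how repeated-addition strings actually tokenize: there is no
    # single dedicated token per sum, just a run of small repeated pieces.
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")
    for expr in ("1+1+1+1", "1+1+1+1+1"):
        toks = enc.encode(expr)
        pieces = [enc.decode([t]) for t in toks]
        print(f"{expr!r}: {len(toks)} tokens -> {pieces}")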


The internal representation is high-dimensional vectors. In a typical 4096-dimensional model, even at q4 quantization, you could give every particle in the universe a unique name and still have over 4000 dimensions left for other purposes.
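(Checking that arithmetic with a short Python sketch; 10^80 is the usual rough particle count for the observable universe.)

    # Indexing ~10^80 particles needs about 266 bits; at 4 bits per
    # dimension (q4) that's ~67 dimensions, leaving most of 4096 free.
    import math

    particles = 10 ** 80
    bits = math.ceil(math.log2(particles))   # 266
    dims_at_q4 = math.ceil(bits / 4)         # 67
    print(f"dimensions used: {dims_at_q4}")
    print(f"dimensions left of 4096: {4096 - dims_at_q4}")  # 4029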


I don't think that is a valid argument.



