LLMs are bad at counting because nobody counts in text, we count in our heads wh...

albert_e · 2025-02-13T11:23:10 1739445790

This result --

https://x.com/yuntiandeng/status/1889704768135905332

Is this a consequence of the fact that "multiplication tables" for kindergarteners are abundantly available online (and therefore in the training data), typically up to the 12 or 13 times table, as plain text?

2-3-7-43-1807 · 2025-02-10T14:05:40 1739196340

i don't think it's just about the training material. it's also about keeping track of the precise number of tokens. you'd have to have dedicated tokens for 1+1+1+1 and another one for 1+1+1+1+1 etc.

lostmsu · 2025-02-10T17:11:59 1739207519

The internal representation is a multidimensional vector. With a typical 4096-dimensional vector at q4 (4 bits per dimension), one can name every particle in the universe and still have over 4000 dimensions left for other purposes.
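A rough back-of-the-envelope check of that arithmetic, as a minimal sketch. The figure of about 10^80 particles is an assumed, commonly quoted estimate and is not from the comment; `dims_needed` is a hypothetical helper name. It only counts distinct bit patterns and says nothing about whether a model actually learns to use them for counting.

```python
def dims_needed(num_items: int, bits_per_dim: int) -> int:
    """Dimensions required to give each of num_items a unique bit pattern."""
    # Bits needed to index num_items distinct values (0 .. num_items - 1).
    bits = (num_items - 1).bit_length()
    # Ceiling division: each dimension stores bits_per_dim bits.
    return -(-bits // bits_per_dim)


PARTICLES = 10**80   # assumption: rough estimate of particles in the universe
TOTAL_DIMS = 4096    # vector width mentioned in the comment
BITS_PER_DIM = 4     # "q4" read as 4-bit quantization

used = dims_needed(PARTICLES, BITS_PER_DIM)
print(f"dimensions used: {used}, dimensions left: {TOTAL_DIMS - used}")
```

Under these assumptions, 266 bits are enough to label every particle, which fits in 67 four-bit dimensions and leaves 4029 of the 4096 free.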

2-3-7-43-1807 · 2025-02-10T17:16:08 1739207768

i don't think that is a valid argument.