
It’s literally a slot machine for random text. With “services around it” to give the randomness some shape and tools.


It is literally not. Roughly 2/3 of the weights are in the multi-layer perceptrons, which act as a dynamic information encoding and retrieval machine. And the attention mechanisms allow for very complex data interrelationships.
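For anyone wondering where the 2/3 figure comes from, here is a rough back-of-the-envelope sketch. It assumes a standard transformer block with four square attention projections (Q, K, V, output) and an MLP whose hidden width is 4x the model width, and it ignores embeddings, biases, and layer norms. The function name and the default multiplier are my own illustration, not taken from any particular model.

```python
def block_weight_counts(d_model: int, ffn_multiplier: int = 4) -> dict:
    """Count weights in one transformer block under the assumptions above."""
    # Attention: Q, K, V, and output projections, each d_model x d_model.
    attention = 4 * d_model * d_model
    # MLP: one up-projection and one down-projection to/from the hidden width.
    mlp = 2 * d_model * (ffn_multiplier * d_model)
    total = attention + mlp
    return {"attention": attention, "mlp": mlp, "mlp_fraction": mlp / total}


counts = block_weight_counts(768)
print(counts["mlp_fraction"])  # about 0.667 under these assumptions
```

Architectures that use a different hidden width or gated MLPs will land on a somewhat different ratio.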

At the very end of an extremely long and sophisticated process, the final mapping is softmax transformed and the distribution sampled. That is one operation among hundreds of billions leading up to it.
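To make that final step concrete, here is a minimal sketch of it: softmax over the output logits, then one draw from the resulting distribution. The logits here are made-up numbers standing in for whatever the rest of the network computed, and the function names are hypothetical.

```python
import math
import random


def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, rng):
    """Sample one token index from the softmax distribution."""
    probs = softmax(logits)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Made-up logits: the model strongly prefers token 0.
rng = random.Random(0)
logits = [10.0, 0.0, 0.0]
draws = [sample_token(logits, rng) for _ in range(1000)]
print(draws.count(0) / len(draws))
```

When the logits are this lopsided, almost every draw picks the same token. The sampling is random, but everything that shaped the distribution is not.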

It’s like saying a Jeopardy player is a random word generating machine: they see a question and they generate “what is” followed by a random word, random because there is some uncertainty in their mind even in the final moment. That is technically true, but incomplete, and it entirely misses the point.



