Aren't they technically the same? GPT picks the next token given the state of the current context, based on probabilities and a random factor. That is mathematically equivalent to a Markov chain, isn't it?
Markov chains condition on the current state only; they don't account for the full history of how the chain got there. A GPT-style model conditions on everything in its context window, and while all LLMs do have a context length, that is more a practical limitation based on resources than anything implicit in the model.
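To make the distinction concrete, here is a minimal sketch contrasting a first-order Markov chain (next token depends only on the previous token) with a toy sampler that conditions on the whole context. The corpus, `markov_sample`, and `lm_sample` are invented for illustration; the weighting inside `lm_sample` is a stand-in for a real model's forward pass, not how GPT actually scores tokens.

```python
import random
from collections import Counter, defaultdict

random.seed(0)

corpus = "the cat sat on the mat and the dog sat on the rug".split()
vocab = sorted(set(corpus))

# First-order Markov chain: the next-token distribution is a function of the
# previous token alone.
transitions = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    transitions[prev][nxt] += 1

def markov_sample(prev_token):
    counts = transitions[prev_token] or Counter(vocab)  # fall back to uniform
    tokens, weights = zip(*counts.items())
    return random.choices(tokens, weights=weights)[0]

# Toy context-conditioned sampler: the next-token distribution is a function
# of the entire context. Here a crude repetition penalty lets tokens far back
# in the window influence the choice; a real GPT computes this dependence with
# a transformer, but the dependence on the whole window is the same in kind.
def lm_sample(context):
    history = Counter(context)
    weights = [1.0 / (1 + history[w]) for w in vocab]
    return random.choices(vocab, weights=weights)[0]

# Two contexts ending in the same token: the Markov chain must assign them
# identical next-token distributions; the context-conditioned sampler need not.
ctx_a = ["the", "cat", "sat", "on", "the"]
ctx_b = ["the", "dog", "sat", "on", "the"]
print(markov_sample(ctx_a[-1]), markov_sample(ctx_b[-1]))  # same distribution
print(lm_sample(ctx_a), lm_sample(ctx_b))                  # can differ
```

You can recover a Markov chain from the second sampler only by taking the entire context window as the "state", which is where the finite context length enters as a practical bound rather than a modelling assumption.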