You've used ChatGPT, Gemini, or Claude. But what's actually happening inside? Large Language Models are frequently described as 'predicting the next word' — which is technically true but profoundly undersells what that actually means. Understanding the mechanism explains both their extraordinary capabilities and their fundamental limitations.
The transformer architecture
In 2017, Google researchers published 'Attention is All You Need' — one of the most cited papers in machine learning history. It introduced the transformer, the architecture behind every major LLM. The key innovation was the attention mechanism: instead of processing words sequentially (as earlier RNNs did), transformers process all words in a sentence simultaneously and learn which words should 'attend' to which others. 'The bank was steep' and 'I went to the bank' need to be understood differently — attention makes this possible by weighting relationships between words across the entire context window. The eight researchers who wrote the paper have since founded or joined most of the major AI labs; the architecture they invented underlies GPT-4, Claude, Gemini, and every other frontier model.
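The core computation can be sketched in a few lines. This is an illustrative toy, not any production implementation: real transformers use learned query/key/value projections, multiple attention heads, and hundreds of dimensions per word.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Each query row asks "which keys are relevant to me?"; the
    # resulting weights decide how much of each value to blend in.
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # similarity of every word to every other word
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V, weights

# Four toy "words", each represented as a 3-dimensional vector.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))
out, w = attention(x, x, x)  # self-attention: the sentence attends to itself
print(w.round(2))            # 4x4 matrix: who attends to whom, and how strongly
```

The attention-weight matrix is the interesting output: every word gets a probability distribution over every other word, computed in one shot rather than left to right.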
What 'training' actually means
Training an LLM involves exposing it to hundreds of billions of words from the internet, books, and code, and repeatedly asking it to predict the next token. Each prediction is scored against the actual next token, and the model's billions of parameters — the numerical weights determining its outputs — are nudged slightly to make the correct token more likely next time. After trillions of such adjustments, the model has implicitly encoded vast amounts of knowledge about language, facts, and reasoning. The result isn't a database of facts — it's a compressed statistical model of human language. OpenAI's GPT-4 is estimated to have around 1.8 trillion parameters, each a small number encoding a learned relationship between concepts. The training compute for frontier models now costs hundreds of millions of dollars.
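The loop itself reduces to a simple idea: predict, measure the error, nudge the weights. This sketch uses an invented four-word vocabulary and a lookup table of logits standing in for the billions of parameters a real model computes with:

```python
import numpy as np

vocab = ["the", "cat", "sat", "mat"]
# Toy "model": a table of logits, one row per current token, scoring
# every possible next token. Real models compute these scores with
# billions of parameters rather than a lookup table.
logits = np.zeros((len(vocab), len(vocab)))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def train_step(cur, nxt, lr=1.0):
    # One prediction, one error signal, one tiny weight adjustment.
    p = softmax(logits[cur])
    loss = -np.log(p[nxt])           # cross-entropy: penalise low probability on the true next token
    grad = p.copy()
    grad[nxt] -= 1                   # gradient of the loss w.r.t. the logits
    logits[cur] -= lr * grad         # nudge weights toward the right prediction
    return loss

corpus = [(0, 1), (1, 2)]            # "the cat", "cat sat"
for _ in range(50):
    for cur, nxt in corpus:
        train_step(cur, nxt)

print(softmax(logits[0]).round(2))   # "the" now strongly predicts "cat"
```

Scale the vocabulary to ~100,000 tokens, replace the table with a transformer, and run the loop over trillions of tokens, and this is the pre-training recipe.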
Why LLMs hallucinate
LLMs don't retrieve facts from a database — they generate plausible next tokens. This means they can produce confident-sounding text that is factually wrong, because 'sounds like what a correct answer would look like' is what they were trained to produce, not 'is verifiably true'. This isn't a bug to be fixed; it's an intrinsic property of the architecture. Researchers at DeepMind and elsewhere have demonstrated that hallucination rates decrease with model scale and with retrieval-augmented generation (RAG) — connecting the model to live databases of facts it can quote directly rather than generating from memory. But zero hallucination in a generative model is not achievable with current architectures.
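The RAG idea can be sketched in miniature: retrieve a relevant passage first, then instruct the model to answer only from it. The documents and the word-overlap scoring below are invented for illustration; real systems use vector embeddings and a proper search index.

```python
# Minimal RAG sketch: retrieve, then ground the answer in retrieved text.
docs = {
    "doc1": "The Eiffel Tower is 330 metres tall.",
    "doc2": "Transformers were introduced in 2017.",
}

def retrieve(query, docs):
    # Score each document by word overlap with the query. Real systems
    # use embedding similarity, but the principle is the same.
    q = set(query.lower().split())
    def score(text):
        return len(q & set(text.lower().split()))
    return max(docs.values(), key=score)

def build_prompt(query, docs):
    context = retrieve(query, docs)
    return f"Using only this context: '{context}'\nAnswer: {query}"

prompt = build_prompt("How tall is the Eiffel Tower?", docs)
print(prompt)
```

The point is the shape of the fix: the fact arrives in the prompt as quotable text, so the model paraphrases something verifiable instead of generating from memory.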
RLHF: making models useful and safe
Raw pre-trained LLMs are good at predicting text but not at following instructions or being helpful. Reinforcement Learning from Human Feedback (RLHF), developed at OpenAI and popularised with InstructGPT in 2022, addresses this. Human raters compare model outputs and rank them, training a 'reward model' that predicts human preferences. The LLM is then fine-tuned to maximise this reward signal. This process is what transforms a raw text predictor into a helpful assistant. It's also how safety constraints are implemented: raters mark certain outputs as unacceptable, and the model learns to avoid them. Every instruction-following model — including Claude — uses some variant of this process.
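The reward-model step can be sketched with the standard Bradley-Terry preference loss: given pairs where raters preferred output A over output B, learn a scoring function so preferred outputs score higher. The 'feature vectors' here are invented stand-ins for what a real reward model computes from text:

```python
import numpy as np

rng = np.random.default_rng(1)
w = np.zeros(3)  # reward-model parameters (a real one is itself a large network)

def reward(features):
    return features @ w

def train_pair(fa, fb, lr=0.1):
    # Bradley-Terry: maximise P(A preferred) = sigmoid(reward_A - reward_B).
    margin = reward(fa) - reward(fb)
    p = 1 / (1 + np.exp(-margin))     # model's probability that A is preferred
    w[:] += lr * (1 - p) * (fa - fb)  # gradient ascent on the log-likelihood

# Invented features for (chosen, rejected) response pairs.
pairs = [(rng.normal(size=3) + 1, rng.normal(size=3) - 1) for _ in range(200)]
for fa, fb in pairs:
    train_pair(fa, fb)

acc = sum(reward(a) > reward(b) for a, b in pairs) / len(pairs)
print(f"correctly ranked: {acc:.0%}")
```

In full RLHF this learned reward then drives a second optimisation that fine-tunes the LLM itself; the sketch covers only the preference-learning half.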
What LLMs genuinely cannot do
Understanding the architecture explains the limitations. LLMs have a fixed context window — they can only process a certain amount of text at once, and have no persistent memory between conversations (unless specifically engineered). They cannot access real-time information without tool use. They are susceptible to adversarial inputs ('prompt injection') that cause them to ignore their instructions. They cannot reliably perform multi-step mathematical reasoning without assistance — research from MIT and Stanford has consistently found that even frontier models make errors on arithmetic that primary school children can solve. Knowing this isn't a counsel of despair — LLMs are genuinely useful — but matching tasks to the architecture's genuine strengths requires understanding what those strengths are.
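The fixed context window has a concrete consequence: once the budget fills, the oldest turns simply fall out. A minimal sketch, approximating tokens by word counts (real systems count subword tokens and often summarise rather than drop):

```python
# Why "no persistent memory" bites in practice: a fixed context window
# means old conversation turns are discarded once the budget is full.

def fit_to_window(turns, max_tokens):
    # Keep the most recent turns that fit; older ones are simply gone.
    kept, used = [], 0
    for turn in reversed(turns):
        cost = len(turn.split())          # crude stand-in for a tokeniser
        if used + cost > max_tokens:
            break
        kept.append(turn)
        used += cost
    return list(reversed(kept))

history = [
    "user: my name is Ada",
    "assistant: nice to meet you Ada",
    "user: tell me about transformers",
    "assistant: transformers process all words simultaneously",
]
window = fit_to_window(history, max_tokens=12)
print(window)  # the earliest turns, including the user's name, have fallen out
```

This is why a model can 'forget' your name mid-conversation: the information isn't stored anywhere except the text it can still see.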
“An LLM doesn't know anything — it has learned the patterns of language well enough that it can produce text indistinguishable from knowing. That distinction matters enormously.”
Pro tip
LLMs perform best when you give them clear context and constraints. They're prediction machines — better context means better predictions. Tell them who you are, what you want, and in what format.
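In practice that can be as simple as templating the three elements explicitly. The helper and field names below are invented for illustration; any structure that states who, what, and in which format will do:

```python
# Sketch of the tip in practice: role, task, and output format spelled
# out explicitly rather than left for the model to infer.

def build_prompt(role, task, fmt, context=""):
    parts = [
        f"You are advising {role}.",
        f"Task: {task}",
        f"Respond as: {fmt}",
    ]
    if context:
        parts.insert(1, f"Context: {context}")
    return "\n".join(parts)

prompt = build_prompt(
    role="a first-time technical founder",
    task="review this pricing page copy for clarity",
    fmt="three bullet points, most important first",
    context="B2B SaaS, developer audience",
)
print(prompt)
```

Every line narrows the space of plausible next tokens, which is exactly what a prediction machine needs.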
LLMs are simultaneously more limited and more impressive than the hype suggests. They don't think — but their ability to manipulate language and knowledge in useful ways is genuinely unprecedented. Understanding the mechanism helps you use them better and be appropriately sceptical when they fail.