ChatGPT Under the Hood: GPT-4o Architecture, RLHF Training, and Why It Hallucinates

WhatAI Editorial Team·May 18, 2026·12 min read

We dissect ChatGPT's architecture — transformer layers, RLHF training pipeline, and the real reasons it confidently makes things up.

What ChatGPT Actually Is

ChatGPT is a fine-tuned version of OpenAI's GPT-4o model, itself a multimodal transformer trained on text, code, and image-text pairs. OpenAI has not published the size of the training set; the widely repeated figure of roughly 13 trillion tokens is an unconfirmed estimate that originally circulated for GPT-4. When you type a message, you're not talking to a search engine or a database: you're sampling from a probability distribution over tokens. Understanding this changes how you use it.

The Transformer Architecture

GPT-4o uses the transformer architecture introduced by Google researchers in 2017. Its parameter count is not public; the roughly 200 billion figure is a third-party estimate that OpenAI hasn't confirmed. The key mechanism is self-attention: each token computes relevance scores against other tokens in the context window (in a decoder-style model like GPT, against itself and the tokens before it). This is why longer contexts are more expensive: the number of token pairs, and therefore the attention cost, scales quadratically with sequence length.
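To make the quadratic cost concrete, here is a minimal single-head sketch of scaled dot-product attention with a causal mask, in plain Python. It is an illustration only, not OpenAI's implementation: the function name causal_self_attention is hypothetical, and the token vectors are used directly as queries, keys, and values, whereas a real transformer applies learned projections first.

```python
import math


def softmax(xs):
    """Turn a list of scores into probabilities that sum to 1."""
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def causal_self_attention(queries, keys, values):
    """Single-head scaled dot-product attention with a causal mask.

    queries, keys, values: lists of n vectors (lists of floats), one per token.
    Returns (outputs, score_count), where score_count is the number of
    query-key dot products that were computed.
    """
    d = len(queries[0])
    dim_v = len(values[0])
    outputs, score_count = [], 0
    for i, q in enumerate(queries):
        # Token i may only look at positions 0..i (itself and earlier tokens).
        scores = []
        for k in keys[: i + 1]:
            scores.append(sum(a * b for a, b in zip(q, k)) / math.sqrt(d))
            score_count += 1
        weights = softmax(scores)
        # The output is the attention-weighted average of the visible values.
        out = [
            sum(w * v[j] for w, v in zip(weights, values[: i + 1]))
            for j in range(dim_v)
        ]
        outputs.append(out)
    return outputs, score_count


# Hypothetical 2-dimensional vectors standing in for four tokens.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
outputs, n_scores = causal_self_attention(tokens, tokens, tokens)
print(n_scores)  # 10 for 4 tokens; 8 tokens would need 36
```

With the causal mask, n tokens need n(n+1)/2 dot products, so doubling the context length roughly quadruples the attention work.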

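The model processes your input through a deep stack of transformer layers, each applying attention and feed-forward transformations. (The often-quoted figure of 96 layers is GPT-3's; OpenAI has not published GPT-4o's depth.) The final layer produces logits: raw, unnormalized scores over the token vocabulary, which a softmax turns into probabilities. GPT-4o's tokenizer, o200k_base, has a vocabulary of roughly 200,000 tokens; the roughly 100,000-token cl100k_base vocabulary belongs to GPT-4 and GPT-3.5. Temperature controls how peaked the distribution is before sampling: the logits are divided by the temperature, so values below 1 sharpen it and values above 1 flatten it.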
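The effect of temperature is easy to see on a toy example. The sketch below divides a list of logits by the temperature, applies a softmax, and samples a token index. The three logits are made-up values for a hypothetical 3-token vocabulary, and the function names are illustrative rather than part of any OpenAI API.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw logits to probabilities after temperature scaling."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature=1.0, rng=random):
    """Draw one token index according to the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


logits = [2.0, 1.0, 0.1]  # hypothetical scores for a 3-token vocabulary
for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At a low temperature almost all of the probability mass lands on the highest-scoring token, so output is close to deterministic; at a high temperature the distribution flattens and less likely tokens are sampled more often.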

RLHF: Why ChatGPT Feels Different from Base GPT

A raw GPT model trained only on next-token prediction would often complete your prompt in unexpected ways: continuing a recipe when you asked a question, or writing hate speech when prompted with certain patterns. RLHF (Reinforcement Learning from Human Feedback) is the main technique used to address this.

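The classic RLHF pipeline, as OpenAI described it for InstructGPT, has three stages: supervised fine-tuning on human-written demonstrations; reward model training, where humans rank outputs and a model learns to predict these rankings; and PPO (Proximal Policy Optimization), where the language model is trained to maximize reward model scores while a KL penalty keeps it from drifting too far from the supervised fine-tuned model. The result is a model that "wants" to be helpful, in the sense that helpful-looking responses are the ones it has been optimized to produce.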
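Two pieces of that pipeline reduce to short formulas, sketched below in plain Python. The first is the pairwise ranking loss commonly used to train a reward model from human comparisons; the second is the KL-penalized reward that the PPO stage maximizes. This follows the InstructGPT-style formulation and is not confirmed as the exact setup behind GPT-4o. The function names and the beta value of 0.1 are assumptions for illustration, and the PPO update itself is not shown.

```python
import math


def pairwise_ranking_loss(reward_chosen, reward_rejected):
    """Reward model loss for one human comparison.

    Computes -log(sigmoid(r_chosen - r_rejected)): small when the reward
    model scores the human-preferred answer higher, large otherwise.
    """
    diff = reward_chosen - reward_rejected
    # softplus(-diff) equals -log(sigmoid(diff)); this form avoids overflow.
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))


def kl_penalized_reward(reward, logprob_policy, logprob_reference, beta=0.1):
    """Reward seen by the PPO stage for one sampled response.

    The penalty grows when the tuned policy assigns the response a much
    higher log-probability than the reference (supervised fine-tuned) model.
    beta is a hypothetical penalty coefficient.
    """
    return reward - beta * (logprob_policy - logprob_reference)


print(round(pairwise_ranking_loss(0.0, 0.0), 3))  # 0.693, i.e. log 2
print(pairwise_ranking_loss(2.0, 0.0) < pairwise_ranking_loss(0.0, 2.0))  # True
print(kl_penalized_reward(1.0, logprob_policy=-0.5, logprob_reference=-1.0))
```

The KL term is what stops the policy from exploiting quirks of the reward model by drifting into text the reference model would consider very unlikely.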

Why It Hallucinates

Hallucination is a fundamental property of autoregressive language models, not a bug. The model samples each next token from what is statistically likely given the context; it has no reliable mechanism to distinguish "things I was trained on" from "things that sound right given the pattern." When asked about obscure facts, the model interpolates from similar patterns in training data, producing plausible-sounding falsehoods.
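A deliberately tiny analogy shows the effect. The sketch below builds a bigram model (far simpler than a transformer, and used here only as an illustration) from a four-sentence made-up corpus, then asks it to complete a prompt about a place it has never seen. The corpus and the complete function are hypothetical.

```python
from collections import Counter, defaultdict

# Made-up training corpus; "atlantis" never appears in it.
corpus = [
    "the capital of france is paris",
    "the capital of france is paris",
    "the capital of italy is rome",
    "the capital of spain is madrid",
]

# Count which word follows each word in the toy corpus.
next_word_counts = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for current, following in zip(words, words[1:]):
        next_word_counts[current][following] += 1


def complete(prompt):
    """Greedily append the most frequent follower of the prompt's last word."""
    last = prompt.split()[-1]
    followers = next_word_counts.get(last)
    if not followers:
        return prompt
    return prompt + " " + followers.most_common(1)[0][0]


print(complete("the capital of atlantis is"))  # the capital of atlantis is paris
```

The model has no entry for "atlantis" and no way to say so. It simply continues the familiar pattern with the most common follower of "is", producing a fluent, confident, and false answer.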

The practical solution: treat ChatGPT's outputs as drafts requiring verification, not facts requiring acceptance. Use it for reasoning and synthesis, and verify specific claims independently.

GPT-4o vs GPT-4: What Actually Changed

GPT-4o (the "o" stands for "omni") natively processes text, images, audio, and video in a single unified model rather than chaining separate specialized components. This reduces latency dramatically: OpenAI reports audio responses in about 320 ms on average, versus 5.4 seconds for the earlier GPT-4 Voice Mode (2.8 seconds with GPT-3.5), which piped speech through separate transcription, language, and text-to-speech models. The architecture change also enables better cross-modal reasoning: describing what's in an image while incorporating text context simultaneously.

Should You Pay for ChatGPT Plus?

The free tier uses GPT-4o mini, which is genuinely capable for most tasks. Plus ($20/month) gives GPT-4o access, higher rate limits, image generation, and advanced data analysis. For professionals using it daily, it's worth it. For casual use, the free tier is sufficient. API access is billed separately per token, regardless of subscription.

The Bottom Line

ChatGPT is the most capable general-purpose AI assistant available to consumers, backed by the most sophisticated post-training pipeline. Its limitations (hallucinations, knowledge cutoffs, and the inability to learn from conversation) are intrinsic to its architecture. Understanding these makes you a better user.