Show HN: How LLMs Work – Interactive visual guide based on Karpathy's lecture


TLDR

  • Step-by-step interactive walkthrough of LLM construction from raw web crawl through BPE tokenization, Transformer training, SFT, and RLHF.

Key Takeaways

  • FineWeb pipeline: Common Crawl’s 2.7B pages, filtered via URL blocklists, text extraction, language detection, MinHash deduplication, and PII removal, yield 44TB of text (~15T tokens).
  • BPE tokenization starts from 256 byte symbols and merges most-frequent adjacent pairs iteratively; GPT-4 uses a 100,277-token vocabulary.
  • Pre-training loss measures next-token prediction error across billions of steps; Llama 3 trained a 405B-parameter model on 15T tokens.
  • Temperature at inference controls how broadly the model samples from the next-token probability distribution; values of 0.7–1.0 balance coherence and creativity.
  • Post-training has two stages: SFT on human-labeled ideal conversations, then RLHF, which trains a reward model on human-ranked responses and RL-tunes the LLM toward higher reward.
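The BPE merge loop described above can be sketched in a few lines of Python. This is a toy illustration, not the guide's code: the function name and the tiny input string are invented here, and production tokenizers such as GPT-4's cl100k_base operate on raw bytes with a pre-trained table of 100,277 merges rather than merging on the fly.

```python
from collections import Counter


def bpe_merges(text, num_merges):
    """Toy BPE: start from single characters (standing in for the 256
    byte symbols) and repeatedly merge the most frequent adjacent pair
    into a new symbol."""
    tokens = list(text)
    merges = []
    for _ in range(num_merges):
        pairs = Counter(zip(tokens, tokens[1:]))
        if not pairs:
            break
        (a, b), count = pairs.most_common(1)[0]
        if count < 2:  # nothing left worth merging
            break
        merges.append((a, b))
        # Rewrite the token stream, replacing each (a, b) with "ab".
        merged, i = [], 0
        while i < len(tokens):
            if i + 1 < len(tokens) and tokens[i] == a and tokens[i + 1] == b:
                merged.append(a + b)
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
    return tokens, merges


tokens, merges = bpe_merges("aaabdaaabac", 3)
# Three merges learned: ("a","a"), ("aa","a"), ("aaa","b")
```

Each merge shrinks the sequence while growing the vocabulary by one symbol; a real tokenizer simply runs this until the vocabulary reaches its target size.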
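The temperature knob can likewise be shown concretely. A minimal sketch, assuming a model has already produced raw logits for the next token (the logit values below are made up for illustration): dividing logits by T before the softmax sharpens the distribution when T < 1 and flattens it when T > 1.

```python
import math
import random


def sample_with_temperature(logits, temperature=0.7):
    """Scale logits by 1/temperature, softmax, then sample an index.
    Low T -> near-greedy; high T -> near-uniform."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)                              # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    r = random.random()
    cum = 0.0
    for i, p in enumerate(probs):
        cum += p
        if r < cum:
            return i, probs
    return len(probs) - 1, probs


# Hypothetical next-token logits; lower temperature piles mass on index 0.
idx, probs = sample_with_temperature([2.0, 1.0, 0.1], temperature=0.7)
```

The 0.7–1.0 range cited above corresponds to a mild sharpening of the raw distribution: enough to suppress low-probability tokens without collapsing to a single answer.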
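The reward-model step in RLHF can be made concrete with the pairwise ranking loss commonly used for this purpose (a Bradley–Terry formulation; the source summary does not specify which loss the guide uses, so this is a representative sketch with invented scores):

```python
import math


def reward_ranking_loss(r_chosen, r_rejected):
    """Pairwise ranking loss for a reward model:
    -log(sigmoid(r_chosen - r_rejected)).
    r_chosen / r_rejected are the scalar rewards the model assigns to
    the human-preferred and the rejected response."""
    margin = r_chosen - r_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))


# Hypothetical scores: the wider the margin in favor of the preferred
# response, the smaller the loss.
loss = reward_ranking_loss(1.5, 0.2)
```

Minimizing this loss over many human-ranked pairs teaches the reward model to score responses the way labelers do; the RL stage then tunes the LLM to maximize that learned score.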

Hacker News Comment Review

  • No substantive HN discussion yet.

Notable Comments

  • @learningToFly33: Suggests expanding the guide to cover how embedded data is fed at the final inference step and how it affects prediction results.
