AI Engineering 101 with Chip Huyen (Nvidia, Stanford, Netflix)
Chip Huyen argues most AI app failures are UX and data problems, not model-selection problems, and that base model scaling gains are plateauing.
- Talking to users and writing better prompts improve AI apps more than chasing new models, frameworks, or vector database choices.
- Fine-tuning should be a last resort; most gains come from prompt optimization and better data preparation before touching model weights.
- Data-labeling startups (Mercor, Scale, Handshake) have massive ARR but dangerously few customers — frontier labs have strong pricing leverage over them.
- Post-training (RLHF, verifiable rewards, distillation) is now where frontier labs differentiate, since pre-training data is largely saturated.
- Test-time compute — generating multiple candidate answers or longer reasoning chains at inference — boosts perceived performance without changing the base model.
- High performers gain the most from AI coding tools; managers often prefer a new headcount to paying for expensive coding-agent subscriptions for their teams.
- Voice AI incurs latency from multiple sequential hops (speech-to-text, LLM, text-to-speech), making interruption detection and naturalness engineering challenges more than AI ones.
- Chip predicts base-model step-change improvements will slow; gains will shift to post-training, multimodal (especially audio/video), and application-layer optimization.
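The test-time compute idea above can be sketched as best-of-N sampling: spend several inference passes on the same prompt, then keep the candidate a scorer likes best. Everything here is a toy stand-in (the `generate_candidates` sampler and `score` verifier are hypothetical placeholders, not any real model API); it only illustrates the control flow.

```python
import random

def generate_candidates(prompt, n, seed=0):
    # Toy stand-in for sampling n answers from a model at temperature > 0.
    # A real system would call an LLM API n times (or with n completions).
    rng = random.Random(seed)
    return [f"{prompt} -> answer #{rng.randint(0, 99)}" for _ in range(n)]

def score(answer):
    # Toy stand-in for a verifier or reward model; here it just reads the
    # numeric suffix so the example is deterministic and checkable.
    return int(answer.rsplit("#", 1)[1])

def best_of_n(prompt, n=8):
    # Test-time compute: n forward passes instead of one, same base model.
    # Quality improves because we keep only the highest-scoring candidate.
    candidates = generate_candidates(prompt, n)
    return max(candidates, key=score)
```

The same pattern underlies longer reasoning chains: more tokens or more samples at inference time, with a selection step, rather than any change to the weights.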
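The voice-pipeline point can be made concrete with arithmetic: sequential hops add, so time-to-first-audio easily blows past the gap humans tolerate in conversation. The per-hop latencies and the 250 ms budget below are illustrative assumptions, not measured figures.

```python
# Hypothetical per-hop latencies in seconds; real numbers vary by vendor,
# model size, and whether stages stream into each other.
HOPS = {
    "speech_to_text": 0.30,   # transcribe the user's utterance
    "llm_first_token": 0.50,  # time to first token from the language model
    "text_to_speech": 0.20,   # synthesize the first chunk of reply audio
}

# Assumed budget: people expect a reply within roughly 200-300 ms.
CONVERSATIONAL_GAP = 0.25

def time_to_first_audio(hops):
    # Sequential hops add up: the caller hears nothing until every stage
    # has produced its first output.
    return sum(hops.values())

def feels_natural(hops, budget=CONVERSATIONAL_GAP):
    return time_to_first_audio(hops) <= budget
```

With these numbers the pipeline takes a full second before any audio plays, four times the assumed budget, which is why streaming between stages and fast interruption detection are the hard engineering problems.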
2025-10-23 · Watch on YouTube