Andrew Feldman, Cerebras Co-Founder and CEO: The AI Chip Wars & The Plan to Break Nvidia's Dominance
Summary based on the YouTube transcript and episode description.
Cerebras CEO Andrew Feldman argues Nvidia’s off-chip HBM memory architecture is its core inference weakness, and that inference scaling laws are far from exhausted.
- GPUs run at only 5–7% utilization during inference, meaning 93–95% of compute is wasted — massive algorithmic efficiency gains remain.
- Nvidia’s HBM memory, originally a graphics strength, is now an architectural liability for inference; wafer-scale SRAM is faster for token generation.
- Feldman predicts Nvidia's market share falls from near-100% today to 50–60% within 5 years, and that chip makers will exceed model providers in enterprise value over the long term.
- G42 deal accounts for ~87% of Cerebras revenue (estimated at over $1B); Feldman frames handling large, concentrated deals as a strategic-partnership muscle that can be learned.
- DeepSeek impressed Feldman not for breakthroughs but for disciplined, focused engineering with ~200 people and less compute than widely assumed.
- Cerebras is cash-flow positive; Feldman argues that positive gross margins reflect real technical differentiation, while negative gross margins signal commodity status.
- Synthetic data will constitute almost all training data within 5 years, filling rare-event gaps the way flight simulators train pilots on edge cases.
- Sub-milliwatt edge inference chips sitting next to sensors are a massively underinvested area Feldman flags as important for robotics and IoT scale.
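The memory-bandwidth argument above can be sketched with rough arithmetic: autoregressive decoding must stream the model weights through the memory system for every generated token, so single-stream token speed is bounded by memory bandwidth divided by model size. The numbers below are illustrative assumptions for the sake of the sketch, not figures from the episode.

```python
# Back-of-envelope sketch of why token generation is memory-bandwidth bound.
# All concrete numbers here are assumed for illustration.

def tokens_per_second(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Upper bound on single-stream decode speed when each token
    requires reading all model weights from memory."""
    return bandwidth_gb_s / model_size_gb

# Hypothetical 70 GB model (e.g. ~70B parameters at 8-bit weights):
hbm_bound = tokens_per_second(3350, 70)    # assumed HBM3-class off-chip bandwidth
sram_bound = tokens_per_second(21000, 70)  # assumed much-higher on-chip SRAM bandwidth

print(round(hbm_bound, 1), round(sram_bound, 1))  # → 47.9 300.0
```

The point of the sketch is the ratio, not the absolute values: at fixed model size, token throughput per user scales linearly with memory bandwidth, which is why on-chip SRAM favors fast token generation.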
2025-03-24 · Watch on YouTube