The Rise of Generative Media: fal's Bet on Video, Infrastructure, and Speed


Summary based on the YouTube transcript and episode description.

fal founders Gorkem Yurtseven, Burkay Gur, and Batuhan Taskaya explain why running 600 video models simultaneously is a harder problem than LLM inference — and why top models turn over every 30 days.

  • A 5-second 24fps video takes ~10,000x the compute of a 200-token LLM prompt; 4K adds another 10x on top.
  • The top 5 video models on fal have a half-life of roughly 30 days; the leaderboard largely turns over within a month.
  • fal’s top 100 customers use 14 different models simultaneously, often chained in multi-step workflows.
  • Video inference is compute-bound (saturating GPU flops), while LLM inference is memory-bandwidth-bound — requiring entirely different kernel optimization strategies.
  • fal runs across 35 heterogeneous data centers with a custom orchestrator and CDN; hyperscalers charge 2–3x more and lack video inference expertise.
  • Batuhan Taskaya became one of the youngest Python core maintainers at 14; fal’s tracing compiler finds common execution patterns and swaps in templated semi-generic kernels at runtime.
  • Jeffrey Katzenberg (ex-DreamWorks CEO) told fal’s generative media conference that AI video is following the same arc as early CGI — initial revolt, then inevitable adoption.
  • Individual creators are spending up to $500K on fal; customers include Canva, Adobe, and Adaptive Security (Brian Long), which generates personalized security training videos on the fly.
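The ~10,000x compute gap cited above is an order-of-magnitude claim; a back-of-envelope sketch makes the arithmetic concrete. All model sizes, token counts, and step counts below are illustrative assumptions, not fal's actual numbers:

```python
# Back-of-envelope FLOPs comparison: short LLM prompt vs. 5-second video.
# Every parameter here is an assumption chosen for illustration only.

def llm_flops(params: float, tokens: int) -> float:
    """~2 FLOPs per parameter per token for transformer inference."""
    return 2 * params * tokens

def video_diffusion_flops(params: float, latent_tokens: int, steps: int) -> float:
    """A diffusion transformer runs the full model over the whole latent
    sequence once per denoising step."""
    return 2 * params * latent_tokens * steps

llm = llm_flops(params=7e9, tokens=200)                # ~2.8e12 FLOPs
video = video_diffusion_flops(
    params=7e9,            # assumed video model size
    latent_tokens=100_000, # 5 s @ 24 fps after latent compression (assumed)
    steps=30,              # assumed denoising steps
)
print(f"ratio = {video / llm:,.0f}x")  # prints "ratio = 15,000x"
```

Under these assumptions the ratio lands around 10^4, consistent with the figure quoted in the episode; a 4K output multiplies the latent token count again, which is where the extra 10x comes from.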
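The compute-bound vs. memory-bandwidth-bound distinction can be made concrete with a roofline-style arithmetic-intensity estimate. The GPU figures below are published H100 SXM specs; the per-workload token counts are illustrative assumptions:

```python
# Roofline-style check: a kernel is compute-bound when its arithmetic
# intensity (FLOPs per byte moved from memory) exceeds the GPU's
# machine balance (peak FLOPs divided by memory bandwidth).

PEAK_FLOPS = 989e12    # H100 SXM dense BF16 tensor-core peak, FLOPs/s
MEM_BW = 3.35e12       # H100 SXM HBM3 bandwidth, bytes/s
BALANCE = PEAK_FLOPS / MEM_BW  # ~295 FLOPs/byte

def intensity(tokens_per_weight_load: int) -> float:
    """2 FLOPs per bf16 weight (2 bytes) for each token that reuses it."""
    return 2 * tokens_per_weight_load / 2

# Batch-1 LLM decode: each generated token streams all weights once,
# so every weight byte supports very little arithmetic.
llm_decode = intensity(tokens_per_weight_load=1)        # 1 FLOP/byte

# Video diffusion step: one weight load is amortized across an assumed
# ~100k latent tokens processed in parallel.
video_step = intensity(tokens_per_weight_load=100_000)  # 1e5 FLOPs/byte

print(llm_decode < BALANCE)  # True  -> memory-bandwidth-bound
print(video_step > BALANCE)  # True  -> compute-bound, saturates the flops
```

This is why the two workloads reward different kernel work: LLM decode kernels chase bytes (weight quantization, KV-cache layout), while video kernels chase flops (tensor-core utilization, fused attention).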

2025-12-10 · Watch on YouTube