The Powerful Alternative To Fine-Tuning
Poetiq CEO Ian Fischer explains how a 7-person ex-DeepMind team beat Claude Opus 4.6 on Humanity’s Last Exam using recursive self-improvement harnesses instead of fine-tuning.
- Poetiq’s system beat Anthropic’s Claude Opus 4.6 on Humanity’s Last Exam: 55% vs. 53.1%, with optimization costs under $100k.
- On ARC-AGI v2, Poetiq scored 54% vs. Gemini 3 Deep Think’s 45%, at less than half the cost per problem ($32 vs. ~$70+).
- Adding a reasoning harness on top of plain prompting took one benchmark task from 5% to 95% accuracy with Gemini 1.5 Flash (a minimal harness sketch follows this list).
- Fine-tuning is a trap for startups: it costs millions, and the next frontier model renders the work obsolete; harnesses stay model-agnostic.
- The Poetiq meta-system auto-generates reasoning strategies as code, not just better prompts: DSPy-style, but recursively self-improving (sketched after this list).
- The generated prompts for ARC-AGI included a factually wrong example that nonetheless improved performance; the system found strategies no human would have written.
- The entire company is 7 people (research scientists and engineers); no retraining is needed when a new base model is released, since the harness simply wraps it.
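Poetiq has not published its harness code, so the following is only a minimal sketch of the general pattern described in the interview: wrap a black-box model call in a sample-and-verify loop instead of touching its weights. Every name here (`harness_solve`, the `llm` callable standing in for any model API, the prompt wording) is a hypothetical illustration, not Poetiq's implementation.

```python
from typing import Callable

# A "harness" treats the model as a black box: it wraps one underlying
# call (llm) in a sampling + verification loop instead of changing weights.
# All names and prompts here are hypothetical.

def harness_solve(
    llm: Callable[[str], str],   # any model call: prompt in, text out
    task: str,
    n_candidates: int = 8,
) -> str:
    """Best-of-n with model-graded verification: sample several candidate
    solutions, ask the same model to score each, return the top-scored one."""
    candidates = [
        llm(f"Solve step by step, then state FINAL ANSWER on its own line:\n{task}")
        for _ in range(n_candidates)
    ]

    def score(candidate: str) -> int:
        # Re-use the model as a verifier; a cheaper model often suffices here.
        verdict = llm(
            "Rate this solution 0-10 for correctness. Reply with only the number.\n"
            f"Task: {task}\nSolution: {candidate}"
        )
        try:
            return int(verdict.strip().split()[0])
        except (ValueError, IndexError):
            return 0  # an unparseable verdict counts as a failing score

    return max(candidates, key=score)
```

Because the harness only consumes prompt-in/text-out calls, swapping in a new frontier model means changing the `llm` callable and nothing else, which is the model-agnosticism the bullets above refer to.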
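The recursive self-improvement loop can be sketched under similar assumptions: the model proposes strategies as executable code, each strategy is scored on a held-out dev set, and the scores feed back into the next round of generation. `improve_strategies`, the prompt wording, and the use of `exec` are illustrative assumptions only; a real system would sandbox model-written code.

```python
import statistics
from typing import Callable

def improve_strategies(
    llm: Callable[[str], str],
    dev_set: list[tuple[str, str]],   # (task, expected_answer) pairs
    generations: int = 3,
) -> str:
    """Hypothetical recursive loop: generate strategy code, evaluate it,
    and feed the results back so later generations improve."""
    best_code, best_acc, feedback = "", 0.0, "none yet"
    for _ in range(generations):
        # 1. The model proposes a new reasoning strategy as code.
        code = llm(
            "Write a Python function solve(llm, task) -> str implementing a "
            f"novel reasoning strategy. Prior feedback: {feedback}"
        )
        # 2. Evaluate the generated strategy on held-out tasks.
        namespace: dict = {}
        try:
            exec(code, namespace)  # UNSAFE outside a sandbox; shown for brevity
            solve = namespace["solve"]
            acc = statistics.mean(
                float(solve(llm, task).strip() == answer)
                for task, answer in dev_set
            )
        except Exception as err:
            feedback = f"crashed: {err}"
            continue
        # 3. Keep the winner; the score becomes feedback for the next round.
        feedback = f"last strategy scored {acc:.0%}"
        if acc > best_acc:
            best_code, best_acc = code, acc
    return best_code
```

Generating strategies as code rather than as prompt text is what distinguishes this from plain prompt optimization: a strategy can branch, verify, and call the model multiple times, which is how a factually wrong in-prompt example can still end up kept if it scores well.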
2026-02-27 · Watch on YouTube