The Shape of the Thing

TLDR

  • AI has shifted from co-intelligence prompting to autonomous agent work, and recursive self-improvement is now an explicit roadmap item at every major lab.

Key Takeaways

  • Benchmark scores show near-vertical improvement curves: top AIs now score 94% on GPQA (Google-Proof Q&A), a benchmark where graduate students score 34-70%, and match or exceed human experts 82% of the time on GDPval.
  • StrongDM’s three-person team built a Software Factory where AI agents write, test, and ship production code under two rules: no human code, no human review; each engineer spends ~$1,000/day on AI tokens.
  • A single week in February 2026 illustrated compounding instability: a fictional AI disruption scenario moved Wall Street, Block announced 40% layoffs citing AI, and a public conflict erupted between the Pentagon and Anthropic over Claude’s use in government.
  • Anthropic’s Dario Amodei stated at Davos that engineers inside Anthropic barely write code themselves; OpenAI said its latest Codex model was “instrumental in creating itself.”
  • Google DeepMind’s Demis Hassabis confirmed all major labs are actively working to close the recursive self-improvement loop, while flagging missing capabilities and real risks.

Why It Matters

  • If recursive self-improvement compounds, the exponential benchmark curves already visible would steepen further, and no clear ceiling has yet been established.
  • Organizations experimenting with radical AI-native workflows right now are setting precedents before norms, regulations, or role models exist.
  • Market reactions, job impacts, and government entanglement are already colliding; Mollick argues this instability will spread, not stabilize.

Ethan Mollick, One Useful Thing · 2026-03-12