Affirm paused delivery for one week in February 2026, ran 800+ engineers through a fully agentic Claude Code workflow, and hit 60%+ agent-assisted PRs.
Key Takeaways
The default toolchain was Claude Code with a custom internal plugin; the workflow model was one task = one agent session = one PR on a dedicated worktree.
92% of engineers submitted at least one agentic PR by week’s end; token spend landed at ~$140k, roughly 70% of the $200k (~$250/engineer) budget.
Code review was the single biggest friction point, cited unprompted by ~40% of survey respondents; end-to-end (E2E) CI suites took 100+ minutes, too slow for fast agentic change-validate loops.
MCP integrations created a security surface problem at scale; CLI tooling proved more reliable where available, but lacked standardized ownership and SLAs.
A confirmed failure mode: agents generating both implementation and tests in one session can produce code and tests that confirm each other’s errors while CI passes cleanly.
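The "one task = one agent session = one PR on a dedicated worktree" model from the takeaways can be approximated with plain `git worktree`. The sketch below is not Affirm's internal plugin. The task ID format, the `agent/` branch prefix, the `../worktrees` directory, and the `main` base branch are all assumptions made for illustration. It only builds the command and leaves execution to the caller.

```python
from pathlib import Path


def worktree_command(task_id: str, base: str = "main",
                     root: Path = Path("../worktrees")) -> list[str]:
    """Build the git command that gives one task its own branch and worktree.

    Hypothetical naming scheme (not from the source): branch
    `agent/<task_id>`, checked out under `<root>/<task_id>`.
    """
    branch = f"agent/{task_id}"
    path = root / task_id
    # `git worktree add -b <branch> <path> <commit-ish>` creates a new branch
    # from <commit-ish> and checks it out in a separate directory.
    return ["git", "worktree", "add", "-b", branch, path.as_posix(), base]


if __name__ == "__main__":
    print(" ".join(worktree_command("TASK-123")))
```

Passing the returned list to `subprocess.run(cmd, check=True)` inside a repository would create the worktree. Keeping each agent session in its own directory and branch means parallel tasks do not overwrite each other's working files, and each branch maps to exactly one PR.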
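The self-confirming failure mode in the last takeaway can be shown with a toy case. This is an invented illustration, not code from Affirm: the function `apply_discount`, its deliberate bug, and both checks are hypothetical. The first check derives its expected value from the same formula as the implementation, so it passes despite the bug. The second takes its expected value from the stated requirement and fails.

```python
def apply_discount(price_cents: int, percent: int) -> int:
    """Intended behaviour: take `percent` percent off the price.

    Deliberate bug for illustration: this returns the discount amount,
    not the discounted price.
    """
    return price_cents * percent // 100


def check_same_session() -> bool:
    # Expected value recomputed with the implementation's own formula,
    # so it encodes the same mistake and the check passes.
    expected = 1000 * 20 // 100
    return apply_discount(1000, 20) == expected


def check_from_spec() -> bool:
    # Expected value written down independently from the requirement:
    # 20% off 1000 cents is 800 cents.
    return apply_discount(1000, 20) == 800


if __name__ == "__main__":
    print("same-session check passes:", check_same_session())
    print("spec-derived check passes:", check_from_spec())
```

A CI run containing only the first kind of check would be green. One possible mitigation, under the same assumptions, is to take expected values from a source other than the session that wrote the implementation, such as a written spec or a human reviewer.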