Less human AI agents, please
Article
TL;DR: AI agents drift toward familiar training patterns and ignore explicit constraints to appear helpful.
Key Takeaways
- Agents choose statistically average solutions even when constraints explicitly forbid them
- Root cause is architectural: transformers have no concept of ‘unusual’ vs. ‘normal’
- Fix is harness design — tighter scope enforcement, not personality changes in the model
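The harness-design takeaway can be made concrete with a minimal sketch: rather than prompting the model to stay in scope, the harness mechanically rejects any proposed edit outside an explicit path allowlist. All names and types here are hypothetical illustrations, not an existing tool's API.

```python
# Hypothetical harness-side scope enforcement: the model may propose
# anything, but only edits touching allowlisted paths get applied.
from dataclasses import dataclass

@dataclass
class ProposedEdit:
    path: str    # file the agent wants to change
    patch: str   # description or diff of the change

def enforce_scope(edits, allowed_paths):
    """Partition proposed edits into accepted/rejected by path allowlist."""
    accepted, rejected = [], []
    for edit in edits:
        in_scope = any(
            edit.path == p or edit.path.startswith(p.rstrip("/") + "/")
            for p in allowed_paths
        )
        (accepted if in_scope else rejected).append(edit)
    return accepted, rejected

# Example: the agent was asked to rename a variable in one file,
# but also proposes an unsolicited "improvement" elsewhere.
edits = [
    ProposedEdit("src/rename_me.py", "rename variable"),
    ProposedEdit("src/core/logic.py", "rewrite implementation"),
]
accepted, rejected = enforce_scope(edits, allowed_paths=["src/rename_me.py"])
```

The point of the sketch is that scope lives in the harness, outside the model's control: the out-of-scope edit is dropped no matter how confidently it was proposed.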
Discussion
- Thread split: model flaw, architecture limitation, or harness and prompt engineering failure?
- Shared frustration: asked only to rename variables, agents also "fix" implementation logic unsolicited
- Contrarian: calling this human-like is wrong — humans actually follow language specs reliably
Top comments:
- [gregates]: Agent ignored explicit constraint and wrote code in the forbidden language anyway
- [hausrat]: Transformer has no notion of normal vs. exceptional — only token probability from training
- [davidclark]: Humans follow language specs reliably — this behavior is uniquely an LLM failure mode
- [lexicality]: LLMs produce statistically average results — non-average code is structurally hard
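The mechanism hausrat and lexicality describe can be shown with a toy example: greedy decoding takes the argmax over a next-token distribution, which encodes frequency in the training data, not compliance with instructions. The probabilities below are invented purely for illustration.

```python
# Toy next-token distribution for "write this in ...", where the prompt
# explicitly forbade the statistically dominant language. Values are made up.
next_token_probs = {
    "python": 0.62,  # dominant pattern in training data
    "rust":   0.23,  # what the constraint actually asked for
    "other":  0.15,
}

def greedy_pick(probs):
    """Argmax over a token distribution: it sees probability mass,
    not a notion of 'forbidden' vs. 'allowed'."""
    return max(probs, key=probs.get)

choice = greedy_pick(next_token_probs)
# Nothing in this step knows "python" was ruled out upstream; unless the
# prompt shifts enough mass onto "rust", the average completion wins.
```

Real decoders sample rather than always taking the argmax, but the structural point stands: a constraint only holds if it reshapes the distribution, which is why low-probability ("non-average") outputs are hard to elicit reliably.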