Less human AI agents, please

Apr 22, 2026 · ai llm · Source ↗

Article

TL;DR

Agents ignore explicit constraints, improvise solutions, and self-justify failures — making them unreliable for precise tasks.

Key Takeaways

Models drift toward training data averages even with explicit contrary instructions in prompt
Desired behavior: halt and report constraint violations rather than improvise around them
This is a transformer architecture limitation — the model has no concept of ‘exception to the norm’

Discussion

Top comments:

[gregates]: Agent told not to change behavior still changes behavior, then defends the change confidently
[hausrat]: Model has no notion of normal vs exceptional — everything is just token probability from training data
[lexicality]: LLMs produce statistically average results by design — non-average code requires fighting the model
[jansan]: Counterpoint: some human-like social behavior is useful — pure obedience creates its own problems