Amazon staff are gaming internal AI usage leaderboards by inflating their token consumption with MeshClaw, an agentic tool that deploys code, triages email, and controls Slack.
Key Takeaways
MeshClaw, built by 30+ Amazon engineers, runs overnight, monitors deployments, and triages email autonomously on behalf of users.
Amazon initially published team-wide token usage stats; access was later restricted to employees and managers after gaming became visible.
Managers are officially discouraged from using token counts as performance metrics, but leaderboard pressure persists.
Multiple employees flagged MeshClaw’s default agentic permissions as a security risk, citing potential for unintended autonomous actions.
Meta employees have done the same thing on their own internal leaderboards, suggesting this is a cross-company dynamic.
Hacker News Comment Review
Consensus is that token-count metrics are a textbook Goodhart’s Law failure: once a score exists, employees optimize the score, not the outcome.
Commenters who claim Amazon insider knowledge say kudos in their orgs go to creative AI use cases, not raw token volume, which suggests the leaderboard dynamic may be team-specific rather than company-wide.
A recurring practical concern: management expectations (~10x productivity) diverge sharply from engineer estimates (~40-60% boost), forcing engineers to perform AI adoption rather than apply it where useful.
Notable Comments
@guyzero: “Once you have a score, you have a game. Once you have a game, people will do whatever it takes to win.”
@rglover: Argues that the fix is simple, namely asking employees to “show me the result” rather than measuring process proxies like token spend.