CrabTrap: An LLM-as-a-judge HTTP proxy to secure agents in production

Apr 22, 2026 · top-stories ai security tools

Article

TL;DR: Brex open-sourced CrabTrap, an LLM-as-judge HTTP proxy that approves or blocks agent API calls in production.

Key Takeaways

Natural-language policies are auto-generated from observed traffic; decisions matched human judgment on held-out requests
Core flaw: the judge sees only the HTTP body, so a credential has already been read by the time the outbound request is inspected
Shared-model vulnerability: if the judge and the agent are both Claude, their injection patterns overlap
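
To make the approve-or-block flow concrete, here is a minimal sketch of an LLM-as-judge gate. It is not CrabTrap's code: `OutboundRequest`, `gate`, and `stub_judge` are hypothetical names, and the stub stands in for a real model call. It assumes the judge returns a plain-text verdict and that anything other than an explicit ALLOW should block.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class OutboundRequest:
    """The only thing the judge gets to look at (hypothetical shape)."""
    method: str
    url: str
    body: str


# Hypothetical judge signature: (policy text, request) -> raw verdict string.
Judge = Callable[[str, OutboundRequest], str]


def gate(request: OutboundRequest, policy: str, judge: Judge) -> bool:
    """Return True only when the judge explicitly answers ALLOW."""
    try:
        verdict = judge(policy, request)
    except Exception:
        # Fail closed: a judge error blocks the request.
        return False
    return verdict.strip().upper() == "ALLOW"


def stub_judge(policy: str, request: OutboundRequest) -> str:
    """Stand-in for a model call; a real judge would prompt an LLM here."""
    return "DENY" if request.method.upper() == "DELETE" else "ALLOW"
```

Note what the gate never sees: whatever the agent did before building the request, such as reading a credential from disk. That gap is the core flaw listed above.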

Discussion

Thread consensus: an LLM-as-judge is the wrong security primitive because it is non-deterministic and itself injectable
Its proper role is an audit layer on top of real enforcement, not the enforcement layer itself
Defense in depth: use different providers for the agent and the judge, and add kernel-level tool controls
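
The layering the thread argues for can be sketched as follows, assuming a deterministic allowlist does the enforcing and the judge only flags requests for review. The allowlist entries, `enforce`, and `handle` are hypothetical; provider separation and kernel-level controls are outside what this sketch covers.

```python
from typing import Callable, List, Tuple
from urllib.parse import urlsplit

# Hypothetical allowlist of (method, host) pairs; real rules would be richer.
ALLOWED = {("GET", "api.example.com"), ("POST", "api.example.com")}

# Hypothetical judge signature: returns True if the request looks fine.
Judge = Callable[[str, str, str], bool]


def enforce(method: str, url: str) -> bool:
    """Deterministic check: same input always gives the same answer."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    return parts.scheme == "https" and (method.upper(), host) in ALLOWED


def handle(method: str, url: str, body: str, judge: Judge,
           audit_log: List[Tuple[str, str]]) -> bool:
    """Allow or block on the allowlist alone; the judge only writes audit entries."""
    allowed = enforce(method, url)
    if allowed:
        try:
            flagged = not judge(method, url, body)
        except Exception:
            flagged = True  # a judge failure is itself worth reviewing
        if flagged:
            audit_log.append((method, url))
    return allowed
```

In this arrangement a successful injection against the judge can at most suppress an audit entry; it cannot turn a blocked request into an allowed one.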

Top comments:

[simonw]: The claim that JSON escaping prevents prompt injection via policy content is false confidence
[roywiggins]: It’s fine until the agent starts prompt-injecting the judge itself
[ArielTM]: Using the same model family for agent and judge means a shared injection vulnerability surface
[cadamsdotcom]: 99% accuracy is a failing grade for a security control
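
A rough way to see why 99% is a failing grade, as a back-of-the-envelope sketch. The traffic volume and attempt count are made-up inputs, and the second function assumes each attempt is judged independently at the same accuracy, which an adaptive attacker would not respect.

```python
def expected_misjudged(requests: int, accuracy: float) -> float:
    """Expected number of wrong verdicts over a batch of requests."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    return requests * (1.0 - accuracy)


def p_at_least_one_miss(attempts: int, accuracy: float) -> float:
    """Chance that at least one of `attempts` independent tries is misjudged."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    return 1.0 - accuracy ** attempts


# Illustrative numbers only: 10,000 requests a day, 100 malicious attempts.
print(expected_misjudged(10_000, 0.99))   # about 100 wrong verdicts
print(p_at_least_one_miss(100, 0.99))     # roughly 0.63
```

Under these assumptions an attacker who can retry 100 times gets at least one request through about 63% of the time.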