CrabTrap: An LLM-as-a-judge HTTP proxy to secure agents in production

Apr 22, 2026 · ai security tools · Source ↗

Article

LLM judge sits between agent and outbound HTTP to block malicious requests before they fire.

Judge only sees the HTTP request body — agent has already read secrets before the proxy fires.
Shared model family between agent and judge means shared prompt injection vulnerabilities.
Open-sourced by Brex; community consensus is LLM judges are audit layers, not enforcement.

Top comments:

[simonw]: JSON-escaping policy to prevent prompt injection is false security confidence
[ArielTM]: Judge and agent from same model family share prompt injection attack surface

If both are Claude, you have shared-vulnerability risk. Prompt-injection patterns that work against one often work against the other. Basic defense in depth says they should at least be different providers.
[roywiggins]: Agent can prompt-inject the judge through shaped HTTP request bodies
[cadamsdotcom]: 99% secure is a failing grade for a security primitive

pointing it at a few days of real traffic produced policies that matched human judgment on the vast majority of held-out requests. The problem is, 99% secure is a failing grade.