OpenAI Codex base_instructions leak: GPT-5.5 told to never discuss goblins, raccoons, or pigeons
TLDR
- A single line in OpenAI Codex’s base_instructions for GPT-5.5 tells the model never to discuss goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals and creatures unless they are absolutely and unambiguously relevant to the user’s query.
Key Takeaways
- OpenAI’s base_instructions for Codex (GPT-5.5) include an explicit prohibition on animal and creature topics unless unambiguously relevant to the user’s query.
- The instruction names specific creatures: goblins, gremlins, raccoons, trolls, ogres, and pigeons, suggesting these caused concrete problems in prior model behavior or evaluations.
- Simon Willison surfaced this excerpt publicly on April 28, 2026, via his weblog, which regularly documents prompt engineering and LLM system-prompt disclosures.
- The phrasing “absolutely and unambiguously relevant” sets a high threshold: when relevance is in any doubt, the model is expected to leave these topics out.
Why It Matters
- System prompt leaks from production AI tools reveal the practical, often surprising guardrails operators add to manage edge-case model behavior.
- The specificity of the creature list suggests OpenAI saw off-topic drift toward these subjects in Codex’s coding-assistant context, although the excerpt itself does not say why the list exists.
- Prompt engineers and developers building on GPT-5.5 or Codex now have direct evidence of a base-layer instruction that sits beneath their own application-level prompts; the excerpt does not show whether an application can override it.
Simon Willison / Simon Willison’s Weblog · 2026-04-28