OpenAI Codex base_instructions leak: GPT-5.5 told to never discuss goblins, raccoons, or pigeons


TLDR

  • A single line from OpenAI Codex’s system prompt for GPT-5.5 tells the model never to discuss goblins, gremlins, raccoons, trolls, ogres, pigeons, or other creatures unless they are “absolutely and unambiguously relevant” to the user’s query.

Key Takeaways

  • OpenAI’s base_instructions for Codex (GPT-5.5) include an explicit prohibition on animal and creature topics unless unambiguously relevant to the user’s query.
  • The instruction names specific creatures: goblins, gremlins, raccoons, trolls, ogres, and pigeons. The excerpt does not say why, but a list this specific suggests these topics came up in earlier model behavior or evaluations.
  • Simon Willison surfaced this excerpt publicly on April 28, 2026, via his weblog, which regularly documents prompt engineering and LLM system-prompt disclosures.
  • The phrasing “absolutely and unambiguously relevant” sets a high threshold: when relevance is in doubt, the model is expected to leave these topics out.

Why It Matters

  • System prompt leaks from production AI tools reveal the practical, often surprising guardrails operators add to manage edge-case model behavior.
  • The specificity of the creature list hints that OpenAI saw off-topic drift toward these subjects in Codex’s coding-assistant context, although the excerpt itself gives no reason for the rule.
  • Prompt engineers and developers building on Codex now have direct evidence of a constraint in its base-layer instructions. The excerpt does not establish whether application-level prompts can override it.

Simon Willison / Simon Willison’s Weblog · 2026-04-28