A $12 domain registration and one Wikipedia edit were enough to get multiple frontier LLMs to confidently confirm a fabricated 6 Nimmt! world championship.
Key Takeaways
The attack is a circular citation loop: register a domain, publish a fake press release, cite it on Wikipedia, and RAG-enabled LLMs treat the press release and the Wikipedia entry as independent corroboration, even though one simply cites the other.
Any LLM with web search inherits the trustworthiness of whatever ranks for a query; SEO poisoning now flows directly into context windows as confident-sounding output.
Wikipedia edits that survive long enough get absorbed into pretraining corpora, making the fabrication persistent across every model trained on that scrape even after the edit is reverted.
The agent-layer risk is the most serious: an agent that acts on retrieved vendor policies or other external content lets a poisoned source dictate real actions on real infrastructure.
Detection heuristics exist: a Wikipedia edit whose only external citation is a domain registered around the same time as the edit is a clear signal for both Wikipedia editors and training pipeline filters.
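The last heuristic can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production filter: the `Edit` record, the `is_suspicious` function, the 30-day window, and the `registration_dates` lookup (which in practice would have to come from WHOIS or RDAP data) are all hypothetical names and choices introduced here.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from urllib.parse import urlparse


@dataclass
class Edit:
    """Hypothetical record of one Wikipedia edit and the URLs it cites."""
    edit_date: date
    cited_urls: list[str]


def cited_domains(edit: Edit) -> set[str]:
    """Return the distinct hostnames cited by the edit, lowercased, without 'www.'."""
    domains = set()
    for url in edit.cited_urls:
        host = urlparse(url).hostname
        if host:
            domains.add(host.lower().removeprefix("www."))
    return domains


def is_suspicious(
    edit: Edit,
    registration_dates: dict[str, date],
    window: timedelta = timedelta(days=30),  # assumed threshold, tune as needed
) -> bool:
    """Flag edits whose only cited domain was registered shortly before the edit."""
    domains = cited_domains(edit)
    if len(domains) != 1:
        return False  # no citations, or more than one distinct source
    (domain,) = domains
    registered = registration_dates.get(domain)
    if registered is None:
        return False  # registration date unknown: this sketch does not guess
    age_at_edit = edit.edit_date - registered
    return timedelta(0) <= age_at_edit <= window
```

A training pipeline filter could run the same check over a scrape's revision history and hold flagged passages for review rather than dropping them outright, since a single fresh source is a signal of risk and not proof of fabrication.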