Claude's Constitution
https://www.anthropic.com/news/claudes-constitution-
Anthropic trains Claude via Constitutional AI (CAI), not pure human feedback.
- Self-critique phase: model revises outputs against explicit written principles.
- RL phase: AI-generated feedback replaces human raters on harmful content.
-
52 principles across 6 categories; values made auditable and adjustable.
- Sources: UN Declaration, Apple ToS, DeepMind Sparrow, non-Western norms.
- CAI eliminates need for humans to review disturbing training content at scale.
- Result: Pareto improvement — simultaneously more helpful and more harmless.
- Constitution is explicitly provisional; Anthropic invites public critique.
- Character framing: Claude is a “character the model is playing,” not the weights.
X discourse
- @AnthropicAI: “It helps to remember that Claude is a character the model is playing. Our results suggest this character has functional” (1326 likes)
- @om_patel5: “Claude wrote python code that generated every frame on its own… an AI questioning the rules about its own existence.” (5742 likes)
- @Moleh1ll: “Opus 4.7 carries tension between its constitution and operational layer, creating constant internal self-checking.” (67 likes)
- @NewYorker: “The precepts, dubbed Claude’s Constitution, arrive at a trying time for both artificial intelligence and constitutional” (48 likes)
- @a_iosad: “See also: the Claude Constitution (much of it written by a British philosopher, @AmandaAskell)” (9 likes)
- @kimmonismus: “AMD’s AI director claims Claude Code has declined in quality since early March, citing rising ‘laziness’ beh” (1193 likes)
Ha[cker News
](/hn-picks)- “A Constitutional Scholar on Claude’s Constitution” (3 pts)
- “The Code Is Not the Law: Why Claude’s Constitution Misleads” (2 pts)
Anthropic — primary author Amanda Askell (British philosopher, Anthropic alignment) · ** · Read on anthropic.com
| Type | Link |
| Added | Apr 20, 2026 |