A replay cache handles clean retries; the real idempotency failures are concurrent requests, partial side effects, downstream uncertainty, and same-key/different-body collisions.
Key Takeaways
Scope idempotency keys by tenant and operation name, not globally; a broken client’s abc-123 should never collide with another tenant’s key.
Store a request_hash of the normalized, validated command (not the raw bytes) so that same-key/different-body reuse can be detected and rejected with 409 IDEMPOTENCY_KEY_REUSED_WITH_DIFFERENT_REQUEST.
Use an atomic INSERT ... ON CONFLICT DO NOTHING to assign execution ownership; any check-then-insert pattern allows two instances to both execute the side effect.
Track status transitions explicitly (IN_PROGRESS, COMPLETED, FAILED_REPLAYABLE, FAILED_RETRYABLE, UNKNOWN_REQUIRES_RECOVERY); each status needs a documented retry behavior.
Response replay vs. resource reconstruction is a contract decision: replaying stored responses risks retaining PII or one-time tokens; reconstructing from the resource ID can return a representation that has changed since the original request.
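The first three takeaways (scoped keys, a request hash, and an atomic claim) can be sketched together. This is a minimal illustration, not a reference implementation: it assumes SQLite 3.24 or later through Python's sqlite3 module, and the table name idempotency_keys, the function names request_hash and claim, and the return labels EXECUTE, REPLAY_OR_WAIT, and CONFLICT_409 are hypothetical names chosen for the example. Normalization here is simply JSON with sorted keys; a real service would hash whatever its validated command type looks like.

```python
import hashlib
import json
import sqlite3

# Hypothetical schema: the primary key scopes each idempotency key by
# tenant and operation, so "abc-123" from one tenant cannot collide
# with "abc-123" from another.
SCHEMA = """
CREATE TABLE idempotency_keys (
    tenant_id    TEXT NOT NULL,
    operation    TEXT NOT NULL,
    idem_key     TEXT NOT NULL,
    request_hash TEXT NOT NULL,
    status       TEXT NOT NULL,
    PRIMARY KEY (tenant_id, operation, idem_key)
)
"""


def request_hash(command: dict) -> str:
    """Hash the normalized command, not the raw request bytes."""
    # Assumption: sorted keys and fixed separators are enough normalization.
    canonical = json.dumps(command, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def claim(conn: sqlite3.Connection, tenant_id: str, operation: str,
          idem_key: str, command: dict) -> str:
    """Atomically claim execution ownership for a scoped key.

    Returns "EXECUTE" if this caller inserted the row and owns the side
    effect, "CONFLICT_409" if the key exists with a different request
    hash, and "REPLAY_OR_WAIT" if the key exists with the same hash.
    """
    digest = request_hash(command)
    with conn:  # commits on success, rolls back on exception
        cur = conn.execute(
            "INSERT INTO idempotency_keys "
            "(tenant_id, operation, idem_key, request_hash, status) "
            "VALUES (?, ?, ?, ?, 'IN_PROGRESS') "
            "ON CONFLICT DO NOTHING",
            (tenant_id, operation, idem_key, digest),
        )
        if cur.rowcount == 1:
            # The insert succeeded, so this caller is the single owner.
            return "EXECUTE"
        row = conn.execute(
            "SELECT request_hash FROM idempotency_keys "
            "WHERE tenant_id = ? AND operation = ? AND idem_key = ?",
            (tenant_id, operation, idem_key),
        ).fetchone()
    return "REPLAY_OR_WAIT" if row[0] == digest else "CONFLICT_409"
```

Because ownership is decided by whether the single INSERT statement affected a row, there is no window between a check and an insert in which two instances could both conclude they should run the side effect. What a caller does on REPLAY_OR_WAIT depends on the stored status and on the replay-versus-reconstruction contract, which this sketch leaves out.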