Six SQL patterns I use to catch transaction fraud


TLDR

  • Six production SQL patterns for fraud detection: velocity, impossible travel, amount anomalies, suspicious merchants, off-hours behavior, and composable window functions.

Key Takeaways

  • Velocity queries should run at 1-minute, 5-minute, and 1-hour windows in parallel; different fraud rings operate at different time scales.
  • Impossible travel computes the haversine distance between consecutive transactions on the same card and flags any pair whose implied speed exceeds 600 mph (faster than commercial jets), a sign of a cloned card.
  • Round-dollar micro-charges ($1, $5, $10) signal card testing; amounts just below $100 or $500 signal threshold-aware fraud.
  • Merchant spike detection compares each merchant against its own rolling 168-hour (7-day) baseline, so a high-volume retailer like Costco is never judged against a small bookshop's normal volume.
  • Pattern 6 materializes window-function columns (LAG, ROW_NUMBER, rolling sums) so analysts express new fraud rules as simple SQL filters, cutting iteration from weeks to hours.
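
The impossible-travel takeaway can be illustrated with a minimal Python sketch. It is not the article's SQL: the function names, the `(epoch_seconds, lat, lon)` tuple layout, and the 1-mile tolerance for same-instant transactions are assumptions made for illustration. Only the haversine formula and the 600 mph threshold come from the summary above.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius
SPEED_LIMIT_MPH = 600.0      # threshold quoted in the summary


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def is_impossible_travel(prev, curr, limit_mph=SPEED_LIMIT_MPH):
    """prev and curr are hypothetical (epoch_seconds, lat, lon) tuples for
    two consecutive transactions on the same card."""
    hours = (curr[0] - prev[0]) / 3600.0
    miles = haversine_miles(prev[1], prev[2], curr[1], curr[2])
    if hours <= 0:
        # Assumption: with no elapsed time, anything beyond ~1 mile is flagged.
        return miles > 1.0
    return miles / hours > limit_mph
```

For example, a New York swipe followed one hour later by a Los Angeles swipe implies a speed far above 600 mph and is flagged; the same pair six hours apart is not. As the comment review notes, card-not-present purchases and bad merchant location data would need to be filtered out before a check like this is trusted.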
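
The multi-window velocity idea can be sketched the same way, here with SQL run through Python's built-in `sqlite3` module. The `txns` table, its columns, and the per-window limits are hypothetical, and the correlated subqueries are a portability choice for this sketch; a production warehouse would more likely use window frames, as the article's Pattern 6 suggests.

```python
import sqlite3

# Count each card's transactions in the trailing 1-minute, 5-minute,
# and 1-hour windows ending at every transaction (ts is epoch seconds).
VELOCITY_SQL = """
SELECT t.card_id,
       t.ts,
       (SELECT COUNT(*) FROM txns p
         WHERE p.card_id = t.card_id
           AND p.ts > t.ts - 60 AND p.ts <= t.ts)   AS n_1m,
       (SELECT COUNT(*) FROM txns p
         WHERE p.card_id = t.card_id
           AND p.ts > t.ts - 300 AND p.ts <= t.ts)  AS n_5m,
       (SELECT COUNT(*) FROM txns p
         WHERE p.card_id = t.card_id
           AND p.ts > t.ts - 3600 AND p.ts <= t.ts) AS n_1h
FROM txns t
ORDER BY t.card_id, t.ts
"""


def velocity_counts(rows):
    """rows: list of (card_id, epoch_seconds) tuples.
    Returns (card_id, ts, n_1m, n_5m, n_1h) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE txns (card_id TEXT, ts INTEGER)")
    conn.executemany("INSERT INTO txns VALUES (?, ?)", rows)
    result = conn.execute(VELOCITY_SQL).fetchall()
    conn.close()
    return result


def flag_velocity(rows, max_1m=3, max_5m=6, max_1h=20):
    """Keep rows that exceed any window's limit (limits are illustrative only)."""
    return [r for r in velocity_counts(rows)
            if r[2] > max_1m or r[3] > max_5m or r[4] > max_1h]
```

Evaluating all three windows in one pass keeps the rule cheap to extend: a burst of card testing shows up in `n_1m`, while a slower ring only trips `n_1h`.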

Hacker News Comment Review

  • Commenters pushed back on round-number signals being US-centric: tax-inclusive pricing elsewhere makes round amounts common in legitimate purchases, weakening that heuristic globally.
  • The impossible travel pattern drew multiple edge cases: European cross-border commuters, card-not-present online purchases, and merchants with incorrect location metadata all produce false positives.
  • On SQL versus ML, several commenters argued that deterministic rules are blunt instruments for a probabilistic problem. The author’s framing, however, targets operational analysts without ML infrastructure rather than fintech ML teams.

Notable Comments

  • @dogscatstrees: argues fraud probability is continuous, not binary, and banks rely on data science rather than deterministic heuristics.
  • @chii: banks externalize false-positive costs onto customers, so incentives favor over-blocking regardless of analyst intent.
