DuckDB’s fts extension brings Okapi BM25 full-text search to any data source with a two-line install, covering stemming, stop words, and accent normalization.
Key Takeaways
Install is frictionless: run INSTALL fts; LOAD fts; and then PRAGMA create_fts_index('emails', 'id', 'subject', 'body'); to index the subject and body columns of an emails table keyed by id. The same works against any DuckDB table.
BM25 tuning via k₁ (term frequency weight) and b (document length penalty) is exposed directly in match_bm25() query calls.
Missing vs. Postgres/Elasticsearch: no ts_headline-style match highlighting, no phrase queries, no pluggable synonym dictionaries.
Stemming gaps (e.g. mice stems to mice, not to mous, the stem of mouse, so the two forms never match) can be debugged locally with the Python snowballstemmer library before indexing.
Author recommends DuckDB FTS for exploratory work, with a clear escape hatch: dump and import into Postgres or Elasticsearch when you need more.
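To make the k₁ and b takeaway concrete, here is a minimal pure-Python sketch of Okapi BM25 scoring. It is not DuckDB's implementation: it assumes the common formulation with a "+1" inside the IDF logarithm, and the defaults k1=1.2 and b=0.75 are the conventional values, assumed here for illustration. The function name bm25_scores and the three toy documents are hypothetical.

```python
import math
from collections import Counter


def bm25_scores(docs, query, k1=1.2, b=0.75):
    """Score each tokenized document against the query terms with Okapi BM25.

    Illustrative only: k1 controls how quickly repeated terms saturate,
    b controls how strongly long documents are penalized.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if tf[term] == 0:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Length normalization: 1.0 for an average-length document when b > 0.
            norm = 1.0 - b + b * len(d) / avg_len
            score += idf * tf[term] * (k1 + 1.0) / (tf[term] + k1 * norm)
        scores.append(score)
    return scores


# Hypothetical toy corpus: a short and a long document both mention "invoice".
docs = [
    "invoice attached".split(),
    "invoice invoice reminder please pay the overdue invoice this week".split(),
    "lunch on friday".split(),
]
query = ["invoice"]

default = bm25_scores(docs, query)                   # length penalty on
no_length_penalty = bm25_scores(docs, query, b=0.0)  # length penalty off
flat_tf = bm25_scores(docs, query, k1=0.0)           # term frequency ignored
```

With the default b, the short document edges out the long one even though the long one repeats the term three times. Setting b to 0 removes the length penalty and the long document wins. Setting k1 to 0 makes any number of occurrences count the same, so both matching documents score equally.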
Hacker News Comment Review
The one comment shifts focus to an adjacent gap: no turnkey open-source tool to publish and search a mailbox for non-technical users, with DuckDB or otherwise.
Notable Comments
@rahimnathwani: Built a Claude-assisted email browser on Vercel; notes that search quality was poor and preprocessing was heavier than expected, even for a small corpus.