Show HN: Semble – Code search for agents that uses 98% fewer tokens than grep


TLDR

  • Semble is a CPU-only code search library for AI agents that combines Model2Vec embeddings with BM25 via reciprocal rank fusion (RRF), indexing an average repo in ~250ms and answering queries in ~1.5ms.
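
To make the fusion step concrete, here is a minimal sketch of reciprocal rank fusion over a lexical and a semantic result list. It is not Semble's code: the function name `rrf_fuse`, the constant `k=60` (a commonly used default), and the chunk ids are all assumptions for illustration.

```python
from collections import defaultdict

def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of ids with reciprocal rank fusion.

    Each list contributes 1 / (k + rank) for every id it contains,
    where rank starts at 1. k=60 is an assumed, commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda c: (-scores[c], c))

# Hypothetical chunk ids returned by a BM25 ranker and an embedding ranker.
bm25_ranking = ["auth.py:login", "db.py:connect", "utils.py:hash_pw"]
semantic_ranking = ["utils.py:hash_pw", "auth.py:login", "session.py:refresh"]

fused = rrf_fuse([bm25_ranking, semantic_ranking])
print(fused)
```

Because RRF only looks at rank positions, the BM25 and embedding scores never need to be put on a common scale, which is one reason it is a popular way to build hybrid search.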

Key Takeaways

  • Achieves NDCG@10 of 0.854 on ~1,250 queries across 63 repos in 19 languages, about 99% of the quality of CodeRankEmbed Hybrid while indexing 218x faster.
  • Returns only matched chunks, using ~98% fewer tokens than grep+read; reaches 94% recall within a 2k-token budget, whereas grep+read needs 100k tokens to reach 85% recall.
  • Runs entirely on CPU with no API keys or GPU; indexes average repos in ~250ms, queries in ~1.5ms using static Model2Vec potion-code-16M embeddings.
  • Deploys as an MCP server (Claude Code, Cursor, Codex, OpenCode) or via bash/AGENTS.md for sub-agents that cannot call MCP tools directly.
  • Ranking layer adds adaptive lexical/semantic weighting, definition boosts, identifier stemming, file coherence bonuses, and noise penalties for test/legacy files.
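
For readers unfamiliar with the NDCG@10 figure quoted above, the sketch below shows how the metric is typically computed for a single query. The summary does not say which NDCG variant the benchmark uses, so the linear-gain form and the helper names `dcg_at_k` and `ndcg_at_k` are assumptions, and the ideal ranking is derived only from the relevance labels passed in.

```python
import math

def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the top-k results (linear gain)."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k=10):
    """DCG of the returned order divided by the DCG of the ideal order.

    `relevances` lists the relevance label of each result in the order
    the search tool returned it. Returns 0.0 when nothing is relevant.
    """
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# One relevant chunk, returned in second place instead of first.
score = ndcg_at_k([0, 1, 0])
print(round(score, 4))
```

A benchmark score such as the one reported would then be the mean of this per-query value over all queries.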

Hacker News Comment Review

  • Core skepticism: the 98% token reduction claim compares against reading entire matched files, which is a real agent behavior but not how an experienced developer would use grep, making the baseline somewhat inflated.
  • Commenters with direct eval experience report that models heavily RL-trained on grep often distrust novel tool output, retrying or re-reading files and erasing token savings; one workaround is a global CLAUDE.md instruction to prefer LSP or semantic tools over grep.
  • Commenters are split on whether semantic search tools make agents “dumber” by replacing reasoning with retrieval: some see a regression, while others note that Claude Code defaults to slow line-range sed reads anyway, which would make pre-indexed semantic search a strict improvement.

Notable Comments

  • @jerezzprime: No agent benchmarks (e.g. Claude Code or Copilot CLI with grep replaced) are provided; RL-trained tool preference may silently negate all savings.
  • @aadishv: Live test on the browser-use/browsercode repo showed Semble returning useful results in practice.
