Show HN: I built a Web-Scraper API that is 6-7x more efficient than current ones

TLDR

  • Runo is a scraping API that returns schema-defined typed JSON directly, with auto JS rendering, bot bypass, and batch/crawl modes built in.

Key Takeaways

  • Schema is plain JSON (field name, type, example, optional hint); no CSS selectors or XPath required.
  • Three endpoints: /extract (single URL), /batch (fan-out), /crawl (recursive; each job is capped at 25% of the monthly balance, and unused pages are refunded).
  • Requests auto-escalate from plain HTTP to headless Playwright when a JS gate or a 403 is hit; the render_mode field in the response shows which path ran.
  • Flat per-request pricing rather than a credit-multiplier model; the Scale tier is quoted at $0.90 per 1K requests vs. ~$6 for the Firecrawl equivalent, a ratio of about 6.7x, in line with the 6-7x figure in the title.
  • Unresolvable fields return explicit null; type coercion enforced across string, float, integer, boolean, date, and array types.
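
The schema and coercion behavior described above can be sketched locally as follows. This is a minimal illustration, not Runo's implementation or client API: the field names, the SCHEMA layout, the coerce helper, and the ISO 8601 date format are all assumptions. Only the schema attributes (field name, type, example, optional hint), the six types, and the explicit-null rule come from the summary.

```python
from datetime import date

# Hypothetical schema: each field has a name, a type, an example,
# and an optional hint, as the summary describes. Field names are made up.
SCHEMA = [
    {"name": "title", "type": "string", "example": "Blue Widget"},
    {"name": "price", "type": "float", "example": 19.99,
     "hint": "current price, not list price"},
    {"name": "stock", "type": "integer", "example": 12},
    {"name": "in_stock", "type": "boolean", "example": True},
    {"name": "released", "type": "date", "example": "2024-01-31"},
    {"name": "tags", "type": "array", "example": ["tools"]},
]


def _to_bool(value):
    """Accept real booleans and a few common textual spellings."""
    if isinstance(value, bool):
        return value
    text = str(value).strip().lower()
    if text in {"true", "yes", "1"}:
        return True
    if text in {"false", "no", "0"}:
        return False
    raise ValueError(value)


def _to_array(value):
    """Accept lists and tuples only; anything else is unresolvable."""
    if isinstance(value, (list, tuple)):
        return list(value)
    raise ValueError(value)


# One coercion function per type named in the summary.
# Assumption: dates are ISO 8601 strings (YYYY-MM-DD).
COERCERS = {
    "string": str,
    "float": float,
    "integer": int,
    "boolean": _to_bool,
    "date": lambda v: date.fromisoformat(str(v)).isoformat(),
    "array": _to_array,
}


def coerce(raw, schema=SCHEMA):
    """Return one key per schema field; missing or uncoercible values become None."""
    result = {}
    for field in schema:
        value = raw.get(field["name"])
        if value is None:
            result[field["name"]] = None
            continue
        try:
            result[field["name"]] = COERCERS[field["type"]](value)
        except (ValueError, TypeError):
            result[field["name"]] = None
    return result
```

Calling coerce on a raw record with string values such as "19.99" for price yields a float for that field, while a value like "n/a" or a missing key yields an explicit None, so every schema field is always present in the output.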

Hacker News Comment Review

  • Commenters with production scraping experience want a live demo endpoint to test real URLs, see actual latency, and confirm bot detection tier before committing.
  • Open questions focus on gaps in the current API: no mention of cookie injection or UI interaction simulation (clicks, form fills), which matter for authenticated or multi-step scraping.
  • Hardware-attested CAPTCHAs (Google next-gen reCAPTCHA) were flagged as a likely ceiling for the bypass claims; no response from the builder yet.

Notable Comments

  • @drewrbaker: Asks whether cookie supply and UI interaction simulation are supported, two common requirements for authenticated scraping workflows.
  • @rvz: Raises hardware-attestation reCAPTCHA as a potential hard limit for the bot bypass tier.
