Scraping 241 UK council planning portals – 2.6M decisions so far

TLDR

  • One builder scraped 241 UK council planning portals and assembled 2.6M planning decisions into a single searchable dataset.

Key Takeaways

  • 241 separate council portals means 241 different data schemas, auth flows, and anti-scraping postures – the engineering surface is the whole project.
  • 2.6M decisions is a substantial archive; UK planning data is public in principle but spread across individual council portals, so it is hard to use without this kind of aggregation.
  • The product monetizes via a postcode checker with a £19 paywall, implying a B2C angle targeting homeowners and developers researching local approval rates.
  • Refusal rates vary significantly by council area, which is the core insight buyers actually want before committing to a planning application.
  • Appeal data is a natural extension: appeals go through a separate public gateway run by the Planning Inspectorate (PINS), and adding them would make the dataset materially more complete.
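
To make the first and fourth takeaways concrete, here is a minimal sketch of how per-portal adapters could map differently shaped records onto one schema, and how a refusal rate per council could then be computed. It is not the builder's actual code: the portal names, field names (`ref`, `decision`, `CaseNo`, `Status`), sample rows, and the keyword rules in `normalise_outcome` are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Decision:
    council: str
    reference: str
    outcome: str  # normalised to "approved", "refused", or "other"


def normalise_outcome(raw: str) -> str:
    """Collapse free-text decision labels into three buckets (assumed rules)."""
    text = raw.strip().lower()
    if "refus" in text:
        return "refused"
    if "approv" in text or "grant" in text:
        return "approved"
    return "other"  # withdrawn, pending, unrecognised, ...


# One adapter per portal; each knows only that portal's (hypothetical) field names.
def adapt_portal_a(row: dict) -> Decision:
    return Decision("Council A", row["ref"], normalise_outcome(row["decision"]))


def adapt_portal_b(row: dict) -> Decision:
    return Decision("Council B", row["CaseNo"], normalise_outcome(row["Status"]))


ADAPTERS: Dict[str, Callable[[dict], Decision]] = {
    "portal_a": adapt_portal_a,
    "portal_b": adapt_portal_b,
}


def refusal_rates(decisions: Iterable[Decision]) -> Dict[str, float]:
    """Share of refused decisions per council, ignoring 'other' outcomes."""
    counts = defaultdict(lambda: [0, 0])  # council -> [refused, decided]
    for d in decisions:
        if d.outcome == "other":
            continue
        counts[d.council][1] += 1
        if d.outcome == "refused":
            counts[d.council][0] += 1
    return {c: refused / decided for c, (refused, decided) in counts.items()}


# Made-up sample rows standing in for scraped portal output.
raw_rows = {
    "portal_a": [
        {"ref": "A/1", "decision": "Refused"},
        {"ref": "A/2", "decision": "Approved"},
    ],
    "portal_b": [
        {"CaseNo": "B1", "Status": "Permission Granted"},
        {"CaseNo": "B2", "Status": "GRANTED"},
        {"CaseNo": "B3", "Status": "Withdrawn"},
    ],
}

decisions: List[Decision] = [
    ADAPTERS[portal](row) for portal, rows in raw_rows.items() for row in rows
]
rates = refusal_rates(decisions)
```

Keeping each portal's quirks inside its own adapter means adding another council should only require one new function and one new `ADAPTERS` entry, while the aggregation code stays unchanged.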

Hacker News Comment Review

  • The postcode checker’s “Mixed results” output drew direct criticism – commenters with real planning experience said the free tier gives too little signal to justify handing over a credit card, which is a conversion problem.
  • Multiple builders suggested agentic scraping approaches: Browserless for tricky hosts, Claude/Gemini for paywall bypass, and orchestrator/evaluator loops with multimodal LLMs for general scraping at scale.
  • The site’s own Terms of Service ban automated scraping of its data – commenters flagged the irony immediately given the entire product is built on scraping public portals.
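
The orchestrator/evaluator loop that commenters described can be sketched roughly as follows. This is an illustration under stated assumptions, not any commenter's implementation: `extract` and `judge` are plain callables standing in for a multimodal LLM extractor and an LLM judge, and the stub functions, required fields, and sample reference number are hypothetical.

```python
from typing import Callable, Optional, Tuple

Extractor = Callable[[str, str], dict]               # (page, feedback) -> candidate record
Judge = Callable[[str, dict], Tuple[bool, str]]      # (page, candidate) -> (ok, feedback)


def scrape_with_judge(
    page: str, extract: Extractor, judge: Judge, max_attempts: int = 3
) -> Optional[dict]:
    """Retry extraction, feeding the judge's feedback into the next attempt."""
    feedback = ""
    for _ in range(max_attempts):
        candidate = extract(page, feedback)
        ok, feedback = judge(page, candidate)
        if ok:
            return candidate
    return None  # give up after max_attempts; caller can queue for manual review


# --- Stubs standing in for model calls (hypothetical behaviour) ---
REQUIRED = {"reference", "outcome"}


def fake_extract(page: str, feedback: str) -> dict:
    # The first attempt misses a field; a later attempt uses the feedback to add it.
    record = {"reference": "24/00123/FUL"}
    if "outcome" in feedback:
        record["outcome"] = "refused"
    return record


def fake_judge(page: str, candidate: dict) -> Tuple[bool, str]:
    # Accept only records that contain every required field.
    missing = sorted(REQUIRED - set(candidate))
    if missing:
        return False, "missing fields: " + ", ".join(missing)
    return True, ""


result = scrape_with_judge("<html>...</html>", fake_extract, fake_judge)
```

Bounding the loop with `max_attempts` keeps a stubborn page from consuming unlimited model calls, which matters when the same loop runs across hundreds of portals.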

Notable Comments

  • @efaref: Personal appeal case added 2 years and tens of thousands in costs; their council shows as high-refusal in the data, validating the dataset’s signal.
  • @ferngodfather: “Pot kettle” – the ToS bans scraping of a product that is itself built on scraped public data.
  • @ashish-alex: Building similar in another domain using agentic browser automation with an orchestrator/judge loop around a multimodal LLM.
