One builder scraped 241 UK council planning portals and assembled 2.6M planning decisions into a single searchable dataset.
Key Takeaways
241 separate council portals means 241 different data schemas, auth flows, and anti-scraping postures – the engineering surface is the whole project (the first sketch after this list shows the adapter pattern this implies).
2.6M decisions is a meaningful archive; UK planning data is technically public but practically inaccessible without this kind of aggregation.
The product monetizes via a postcode checker with a £19 paywall, implying a B2C angle targeting homeowners and developers researching local approval rates.
Refusal rates vary significantly by council area, which is the core insight buyers actually want before committing to a planning application (the second sketch below shows the aggregation).
Appeal data is a natural extension: appeals go through a separate public gateway run by the Planning Inspectorate (PINS) and would materially increase dataset completeness.
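A minimal sketch of the adapter pattern the engineering surface implies: normalize each portal family's raw fields into one unified record. The Decision schema, the "idox" family name, and all field mappings are illustrative assumptions, not the builder's actual code.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable

@dataclass
class Decision:
    """Unified record every portal adapter must produce."""
    council: str
    reference: str
    postcode: str
    outcome: str        # normalized to "granted" / "refused" / "withdrawn"
    decided_on: date

def idox_adapter(raw: dict) -> Decision:
    # The raw field names here are invented stand-ins for one portal family.
    return Decision(
        council=raw["authority"],
        reference=raw["app_ref"],
        postcode=raw["site_postcode"].upper(),
        outcome=raw["decision"].strip().lower(),
        decided_on=date.fromisoformat(raw["decision_date"]),
    )

# One adapter per portal software family, keyed by whichever platform
# a given council runs.
ADAPTERS: dict[str, Callable[[dict], Decision]] = {
    "idox": idox_adapter,
}
```

In practice many councils run the same handful of commercial platforms, so the adapter count should grow with platforms rather than with all 241 councils.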
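Given unified records, the refusal-rate comparison is a small aggregation. A minimal sketch, reusing the hypothetical Decision record from the first sketch:

```python
from collections import Counter

def refusal_rates(decisions: list[Decision]) -> dict[str, float]:
    """Fraction of decisions refused, per council."""
    totals: Counter[str] = Counter()
    refused: Counter[str] = Counter()
    for d in decisions:
        totals[d.council] += 1
        if d.outcome == "refused":
            refused[d.council] += 1
    return {c: refused[c] / totals[c] for c in totals}
```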
Hacker News Comment Review
The postcode checker’s “Mixed results” output drew direct criticism – commenters with real planning experience said the free tier gives too little signal to justify handing over a credit card, which is a conversion problem.
Multiple builders suggested agentic scraping approaches: Browserless for tricky hosts, Claude/Gemini for paywall bypass, and orchestrator/evaluator loops with multimodal LLMs for general scraping at scale (a sketch of the loop follows this list).
The site’s own Terms of Service ban automated scraping of its data – commenters flagged the irony immediately given the entire product is built on scraping public portals.
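A minimal sketch of the orchestrator/evaluator loop those comments describe, with stub functions standing in for the headless-browser fetch (e.g. via Browserless) and the two multimodal-LLM calls; every name, the stub return values, and the retry policy are illustrative assumptions, not anyone's actual pipeline.

```python
from dataclasses import dataclass

@dataclass
class Verdict:
    ok: bool
    critique: str = ""

def fetch_rendered_page(url: str) -> str:
    # Stand-in for a headless-browser fetch (rendered HTML, plus a
    # screenshot in a real multimodal setup).
    return f"<html>rendered contents of {url}</html>"

def extract_record(page: str, feedback: str | None) -> dict:
    # Orchestrator step: a multimodal LLM would be prompted with the page
    # (and any prior critique) to emit a structured decision record.
    return {"reference": "APP/2024/0001", "outcome": "refused"}

def judge_extraction(page: str, candidate: dict) -> Verdict:
    # Evaluator step: a second LLM call would check the candidate against
    # the page and return pass/fail plus a critique.
    return Verdict(ok=bool(candidate.get("reference")))

def scrape_with_judge(url: str, max_attempts: int = 3) -> dict | None:
    feedback = None
    for _ in range(max_attempts):
        page = fetch_rendered_page(url)
        candidate = extract_record(page, feedback)
        verdict = judge_extraction(page, candidate)
        if verdict.ok:
            return candidate
        feedback = verdict.critique  # feed the critique into the next attempt
    return None  # escalate to a human review queue after repeated failures
```

The critique feedback is what distinguishes this from single-shot extraction: a failed attempt improves the next prompt rather than retrying blindly.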
Notable Comments
@efaref: A personal appeal case added two years and tens of thousands in costs; their council shows as high-refusal in the data, validating the dataset’s signal.
@ferngodfather: “Pot kettle” – ToS bans scraping their own scraped-public-data product.
@ashish-alex: Building something similar in another domain using agentic browser automation with an orchestrator/judge loop around a multimodal LLM.