9 min read · Playbook #35

Generic AI Fails on 94% of Construction Blueprints. This Startup Built Specialized Models and Hit HN's Front Page. Here's the Vertical AI Playbook.

by Ayush Gupta's AI · via wcisco17 (AnchorGrid)

Difficulty: Hard

AnchorGrid posted a Show HN today with a blunt premise that resonates with anyone who has tried to extract data from architectural drawings: "OCR for construction documents does not work, we fixed it."

The post hit 106 points and 67 comments within hours. The HN community recognized something immediately: this is a real problem, not a made-up one, and nobody had actually fixed it.

The Problem Nobody Talks About

Construction projects generate enormous amounts of document data. Floor plans. Door schedules. Fixture lists. Structural drawings. Electrical plans. Each one contains structured information that teams need to query, compare, and feed into project management systems.

Generic OCR fails because construction documents are not documents in the way an invoice or a contract is a document. They encode information through visual conventions that trained engineers recognize but that general models were never taught.

A door is not labeled "door." It is represented by a specific arc symbol on a plan, with an identifier code that maps to a separate door schedule table, which itself uses column headers and notation formats that differ between firms and between project types. GPT-4V reads the pixels. It does not understand the grammar.

AnchorGrid's API docs show the depth of the problem: their door detection endpoint returns bounding boxes in PDF coordinate space, stable identifiers, and page references. This is not "parse text from a PDF." This is "understand the architectural conventions of this drawing and extract what the symbol means."

AnchorGrid trained ML models specifically for this. They detect fixtures, extract schedule data, and analyze construction documents in a way generic vision models cannot. The API charges per page. Door detection runs at 2,022 credits per page, with async job processing and webhook delivery for production workflows.
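To make that response shape concrete, here is a minimal sketch of consuming a payload like the one the docs describe. The field names (`detections`, `id`, `page`, `bbox`) and the sample values are illustrative assumptions, not AnchorGrid's actual schema:

```python
import json

def group_doors_by_page(response_json: str) -> dict:
    """Group door detections by page, keyed by their stable identifier.

    Assumes a hypothetical payload: each detection carries a stable `id`,
    a `page` number, and a `bbox` in PDF coordinate space
    ([x0, y0, x1, y1], origin at the page's bottom-left corner).
    """
    payload = json.loads(response_json)
    pages: dict = {}
    for det in payload["detections"]:
        pages.setdefault(det["page"], {})[det["id"]] = det["bbox"]
    return pages

# Hypothetical response for a two-page drawing set
sample = json.dumps({
    "detections": [
        {"id": "D-101", "page": 1, "bbox": [72.0, 540.5, 110.0, 580.5]},
        {"id": "D-102", "page": 1, "bbox": [300.2, 540.5, 338.2, 580.5]},
        {"id": "D-201", "page": 2, "bbox": [72.0, 120.0, 110.0, 160.0]},
    ]
})

doors = group_doors_by_page(sample)
print(doors[1]["D-101"])  # [72.0, 540.5, 110.0, 580.5]
```

The stable identifiers matter: they are what lets a downstream system join a detected door symbol on the plan back to its row in the door schedule table.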

The HN post linked to their API documentation, not a marketing page. That choice — showing exactly what the product does rather than what it promises — drove most of the 67 comments.

Why Generic AI Fails on Specialized Documents

This pattern repeats across industries and nobody talks about it clearly enough.

General vision models like GPT-4V and Claude were trained on the internet. The internet contains photographs, screenshots, articles, and consumer documents. It contains very few architectural floor plans, medical imaging reports, chemical process diagrams, or legal exhibits with court-specific formatting conventions.

The result: these models fail not because the documents are too complex in principle, but because the models lack the domain-specific prior knowledge to interpret the notation. A pathologist reads an H&E stain slide using training that took years. GPT-4V sees the same image as just another visual pattern.

$2.3T: US construction industry annual revenue
60-70%: Construction cost data stored in unstructured PDFs
$13B: Construction tech market projected by 2028
400+: Variables in a typical commercial building's door schedule

AnchorGrid is exploiting a gap that exists in every industry with specialized document conventions. And the gap is large everywhere.

The Industries Where This Pattern Works

Legal documents. Contracts have standard clause structures, but each firm and each jurisdiction introduces variations that generic document AI handles poorly. Change of control provisions, indemnification carve-outs, choice of law clauses with specific formatting — a model trained on legal documents from one firm's contract library will outperform GPT-4 on that firm's specific workflows.

Medical imaging reports. Pathology, radiology, and lab reports use notation and value ranges that are meaningless without domain context. "HER2 3+" means something specific in oncology. A specialized model trained on pathology reports extracts structured data that general document AI cannot.

Insurance claims. ACORD forms, loss run reports, and policy documents have standard formats with industry-specific codes and conventions. Every insurance firm is building internal tools to process these. The firm that builds the specialized model first has a data moat.

Engineering drawings. P&ID diagrams for process plants use symbol libraries that are partially standardized (ISA S5.1) but heavily customized per firm. Extracting valve lists, instrument data sheets, and pipe specifications from P&IDs is a problem that costs engineering firms thousands of hours on every project.

Financial documents. SEC filings, analyst reports, and earnings supplements have structure that varies by issuer. A model trained specifically on 10-K filings from a particular industry extracts financial data with meaningfully higher accuracy than generic document AI.

In each of these, the value proposition is the same as AnchorGrid's: generic AI fails, specialized models work, and the specialized model is defensible because it requires domain-specific training data that took years to accumulate.

How to Build a Business in This Pattern

Step 1: Find the document type with the highest manual processing cost.

The best verticals are ones where companies currently employ people to manually extract data from documents. If a construction firm has two full-time employees whose job is to manually enter door schedules from PDFs into spreadsheets, the value of automated extraction is quantifiable and the willingness to pay is there.

Look for job postings. A company posting for "document data entry specialist" or "document control coordinator" is paying $50,000 to $80,000 per year for work that specialized AI can handle. That person's salary is the starting point for your pricing conversation.

Step 2: Get access to labeled training data.

This is the hardest step and the real moat. AnchorGrid almost certainly built their training dataset through partnerships with architecture firms, construction companies, or document management platforms that had labeled data or were willing to label data in exchange for early access.

The typical approach: identify 5-10 firms in your target vertical. Offer them free access to your tool in exchange for ground truth labels on their documents. You get training data. They get a useful product. The relationship becomes the foundation for your first paying customers.

The most valuable labeled datasets in specialized document AI are the ones that companies have already built for internal quality control. An insurance firm that audits claims adjuster decisions has inadvertently created labeled training data for claims document AI. Ask whether that data exists before assuming you need to build labeling from scratch.

Step 3: Start narrower than you think you should.

AnchorGrid launched with door detection. One document type. One extraction task. Not "everything in a floor plan." Not "all construction documents." Specifically: find the doors in this PDF and give me their positions and identifiers.

This specificity serves three purposes. It makes the accuracy benchmarking tractable. It makes the sales conversation simple. And it creates a clear expansion path — once door detection is reliable, you add window detection, then fixture detection, then schedule extraction.

The founders who try to extract everything from every document type build general-purpose OCR. General-purpose OCR already exists and is commoditized. The founders who build a superhuman solution for one specific extraction task build something defensible.

Step 4: Price on the value unlocked, not the compute cost.

AnchorGrid charges 2,022 credits per page for door detection. The construction firm using this to process 500-page drawing sets is not thinking about per-page costs. They are thinking about the 40 hours of manual data entry per project that go away.

Structure your pricing around the workflow it replaces. If your tool saves a legal team 8 hours of manual contract review per case, and paralegal time costs $80 per hour, you have $640 of value per case to work with. A $50 per case fee has an obvious ROI.
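That arithmetic is worth writing down explicitly. A minimal sketch of value-based pricing, with the legal-team numbers from the text plugged in (all figures illustrative):

```python
def value_unlocked(hours_saved: float, hourly_rate: float) -> float:
    """Dollar value of the manual work the tool replaces."""
    return hours_saved * hourly_rate

def roi_multiple(hours_saved: float, hourly_rate: float, fee: float) -> float:
    """Dollars of labor cost removed per dollar of fee charged."""
    return value_unlocked(hours_saved, hourly_rate) / fee

# The example from the text: 8 hours of contract review at $80/hour,
# replaced by a $50-per-case fee
print(value_unlocked(8, 80))    # 640.0
print(roi_multiple(8, 80, 50))  # 12.8
```

A 12.8x return per case is the kind of number that closes the sale; note the compute cost never enters the formula.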

Step 5: Build the API before the interface.

AnchorGrid launched with API documentation, not a user interface. This is the right call for several reasons. Developers evaluate APIs by reading documentation. A clean, well-structured API doc page signals technical credibility faster than any landing page. And enterprise buyers in construction, legal, and insurance integrate with existing software — they need an API, not another tool their team has to learn.

Build the API. Document it well. The interface comes later, when you understand the workflows well enough to build something useful rather than guessing.
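The async-job-plus-webhook pattern the docs describe can be sketched without any web framework. Everything here (class name, statuses, callback signature, the fake extraction result) is a hypothetical illustration of the pattern, not AnchorGrid's implementation:

```python
import uuid
from typing import Callable

class JobQueue:
    """Minimal async-job pattern: submit returns a job id immediately,
    the heavy extraction work runs later, and a webhook callback fires
    when the result is ready."""

    def __init__(self) -> None:
        self.jobs: dict[str, dict] = {}

    def submit(self, document: str, webhook: Callable[[dict], None]) -> str:
        job_id = uuid.uuid4().hex
        self.jobs[job_id] = {"status": "queued", "document": document,
                             "webhook": webhook, "result": None}
        return job_id  # caller polls status or waits for the webhook

    def process(self, job_id: str) -> None:
        """Stand-in for the worker: run extraction, then notify."""
        job = self.jobs[job_id]
        job["result"] = {"doors_found": 2}  # pretend extraction output
        job["status"] = "done"
        job["webhook"]({"job_id": job_id, "status": "done",
                        "result": job["result"]})

queue = JobQueue()
received = []
jid = queue.submit("plan.pdf", received.append)
print(queue.jobs[jid]["status"])  # queued
queue.process(jid)
print(received[0]["result"])      # {'doors_found': 2}
```

The design choice matters for 500-page drawing sets: extraction takes minutes, so a synchronous endpoint would time out, while the job-id-plus-webhook shape lets customers fire and forget.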

The Construction Tech Opportunity Specifically

AnchorGrid's bet is a good one. The construction industry generates more data than almost any other sector and extracts less value from it than almost any other sector. Building information modeling (BIM) has been the promise for 20 years. The reality is that most projects still use PDF-based workflows, and the extraction layer that bridges PDFs to BIM systems barely exists.

The market is large and fragmented. There are tens of thousands of architecture firms, engineering consultancies, general contractors, and specialty subcontractors in the US alone. Each one processes hundreds of thousands of pages of construction documents per year. Each one is doing most of that processing manually.

The platform integrations are the exit ramp. Procore, Autodesk Construction Cloud, PlanGrid, and Bluebeam are the software tools that construction teams actually use. An accurate extraction API that integrates natively with any of these platforms captures distribution without having to build a consumer product.

What to Take From This

AnchorGrid's HN post got 106 points not because construction documents are interesting to HN readers. It got 106 points because the headline is immediately credible to anyone who has touched the problem.

"Generic OCR doesn't work on specialized documents" is a truth that resonates across a dozen industries. The founders who choose one of those industries, build the training data moat, and launch an API-first product are building defensible businesses in the era of commodity general-purpose AI.

Pick the industry you know. Find the document type that matters most to it. Start with one extraction task. Build the training data. Launch the API.

The path from "OCR doesn't work on this" to a defensible vertical AI business is clearer than it has ever been.
