Construction AI Brief
The regulator that's been holding up higher-risk buildings for two years is no longer the excuse it was - Gateway 2 approval times are down from a 48-week peak to 13-14 weeks, and the bottleneck is moving to Gateway 3. Meanwhile Anthropic released Claude Fable 5 on Tuesday, the first model from its restricted Mythos tier the public can use, and it's strongest at exactly the document reasoning a golden-thread pack demands.
For two years the honest answer to "why is the higher-risk-building scheme stuck?" was the Building Safety Regulator. Gateway 2 - the hard stop you have to clear before construction can start on a higher-risk building - was taking up to 48 weeks in London at its worst, and roughly 43 nationally. Programmes were being written around it. Some weren't being written at all. That story has changed, and it's worth registering properly because a lot of people are still planning as if it hasn't.
The BSR's Andy Roe has put the current wait at "13, 14, 15 weeks" and "often" the statutory 12, down from the 48-week peak when he arrived last summer, with firms now told within a week of submission whether their application has even been validated. Trade coverage on 4 June (Specification Online, summarising the latest regulator figures) reported Gateway 2 approvals rising alongside a marked increase in new-build applications - the clearest sign yet that the pipeline is moving rather than just the backlog shrinking. The legacy backlog that dominated the headlines through the winter was cut to a handful of standard cases earlier in the year, following the regulator's move to a standalone body and the government's response to a fairly damning House of Lords committee report. None of that means the regime is fixed. Complex and older remediation cases are still slow - the BSR itself admits late-2025 and early-2026 applications are averaging around 18 weeks - and a remediation improvement plan is due "over the coming weeks". But the binding constraint on new work has genuinely loosened.
Here's the bit that matters for anyone reading this with an AI lens. When the regulator was the bottleneck, the quality of your evidence pack was almost a secondary concern - you were going to wait regardless. Now that approvals are turning around in three months, the variable that decides whether your scheme moves is whether your documentation is complete, consistent and navigable on first submission. That's a different discipline, and it's precisely where document AI has been quietly earning its keep - pulling a coherent golden-thread narrative out of scattered drawings, specifications, O&M data and change records, and flagging the gaps before a validator does. The honest caveat, and it's a serious one: a golden thread is a legal-evidential record, and AI-generated or AI-assembled content in it has to be auditable, attributable and human-approved. A hallucinated reference in a safety case isn't a productivity glitch; it's a liability. The trade press has started calling the golden thread a "golden burden" for good reason - the volume is real, and the temptation to let a model paper over it is exactly the wrong instinct.
And the forward warning: Gateway 3, the occupation sign-off, is now widely tipped as 2026's gridlock. Projects are reaching practical completion and then sitting, unoccupied, waiting on the same documentation rigour applied at the finish line. If you've got a higher-risk building completing this year, the Gateway 3 evidence requirements are the thing to be reading now - not in the autumn when the scheme's already built.
For your board pack: The regulator has stopped being the reason schemes stall. Make Gateway 2 evidence-pack assembly a repeatable, audited process - and brief whoever owns delivery on Gateway 3 occupation requirements before any 2026 completion lands on your desk.
On 9 June Anthropic released Claude Fable 5, the first generally available model from its "Mythos" tier - the restricted, more capable line the company had held back, partly over its strength at cybersecurity tasks. It's available through the Claude API (as claude-fable-5), the Claude apps, Amazon Bedrock and, as of the same day, generally available inside GitHub Copilot. Pro, Max, Team and Enterprise users get it free during an introductory window running 9-22 June.
Skip the leaderboard theatre and look at where the gains actually land, because for construction it's unusually relevant. Anthropic's own figures (vendor-reported, so treat them as a starting point rather than gospel) put Fable 5 ahead of Opus 4.8 specifically on document-heavy knowledge work: the top score on Hebbia's finance benchmark for senior-level reasoning, with the headline gains in document-based reasoning, chart and table interpretation, and visual-document understanding. On GDPpdf - a test of reasoning over visual documents without tools - Anthropic reports 29.8% against Opus 4.8's 22.5%. That's the capability that maps onto a Gateway evidence pack, an O&M manual, a clause-heavy contract or a dense set of structural calcs. Reading messy, mixed-format technical documents and reasoning across them is the unglamorous core of a lot of construction admin, and it's the thing this model is reportedly best at.
The catch is price and proportion. Fable 5 runs at $10 per million input tokens and $50 per million output - roughly double Opus 4.8, which is itself not cheap. This isn't the model you wire into every workflow. It's the one you reserve for the genuinely hard read where a better answer is worth the premium, while routine extraction, formatting and summarisation go to a cheaper model that already clears your quality bar. The standing discipline this brief keeps coming back to holds exactly here: route the workload to the cheapest model that's good enough, and spend the expensive tokens only where they change the outcome.
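To make the routing rule concrete, here is a minimal sketch of "cheapest model that's good enough". The Fable 5 prices are the ones quoted above. Everything else is an assumption for illustration: the cheaper model's name and prices are placeholders (set at half of Fable 5 only because the text says "roughly double"), the quality scores stand in for whatever you measure on your own documents, and the token counts are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    input_usd_per_m: float   # USD per million input tokens
    output_usd_per_m: float  # USD per million output tokens
    quality: float           # your own measured score for this task type, 0 to 1


def job_cost(model: Model, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one job on the given model."""
    return (
        input_tokens * model.input_usd_per_m
        + output_tokens * model.output_usd_per_m
    ) / 1_000_000


def cheapest_good_enough(models, quality_bar, input_tokens, output_tokens):
    """Return the lowest-cost model whose measured quality clears the bar."""
    eligible = [m for m in models if m.quality >= quality_bar]
    if not eligible:
        raise ValueError("no model clears the quality bar for this task")
    return min(eligible, key=lambda m: job_cost(m, input_tokens, output_tokens))


# Fable 5 prices as quoted in the brief; quality score is hypothetical.
FABLE = Model("claude-fable-5", 10.0, 50.0, quality=0.90)
# Placeholder cheaper model: name, prices and score are all assumptions.
CHEAPER = Model("cheaper-model", 5.0, 25.0, quality=0.80)
MODELS = [FABLE, CHEAPER]

# Hypothetical evidence pack: 400k tokens in, 20k tokens out.
fable_cost = job_cost(FABLE, 400_000, 20_000)
cheaper_cost = job_cost(CHEAPER, 400_000, 20_000)

routine = cheapest_good_enough(MODELS, 0.75, 400_000, 20_000)
hard_read = cheapest_good_enough(MODELS, 0.85, 400_000, 20_000)
print(routine.name, cheaper_cost)
print(hard_read.name, fable_cost)
```

The only part that matters is the quality bar: set it per task from your own test results, and the expensive model gets picked only when nothing cheaper clears it.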
One governance note worth flagging, because it's the interesting part of the story. Anthropic shipped Fable 5 with hard limits - requests touching cybersecurity, biology, chemistry and model-distillation fall back to Opus 4.8, with the company saying the safeguards trigger in under 5% of sessions - and did so days after publicly warning that frontier AI is getting genuinely dangerous on the security front. You don't need to care about the philosophy to take the practical point: the most capable models now arrive with built-in refusals on whole categories of work, and if your firm's use case sits anywhere near those edges, test it before you build on it rather than discovering the wall in production.
Put the two stories side by side and the shape of the year becomes clear. Eighteen months ago the conversation in higher-risk-building delivery was about a regulator nobody could get a decision out of. Today the regulator turns most things around in three months, and a model exists that's better than last year's best at reading the exact documents the regime demands. Neither of those is the limiting factor any more. What's left is the bit that was always the hard part: is your evidence complete, consistent and correct - and can you prove it?
That's why the most telling number this spring wasn't a benchmark. In Houzz's first UK State of AI in Construction and Design report, the single biggest concern professionals raised about AI was reliability and accuracy - flagged by roughly a third of users, ahead of cost, training or complexity. On a marketing email, a wrong answer is an embarrassment. On a safety case submitted to the BSR, a wrong answer assembled or smoothed over by a model is a defect in a legal record. The firms that get value from document AI on compliance work aren't the ones using the flashiest model; they're the ones who treat every AI output as a draft to be checked against source, keep a human signing off, and keep the citation trail intact. Same discipline as ever. It just matters more now that the other excuses have fallen away.
Practical bit: Pick the one compliance document workflow that costs your team the most hours - golden-thread assembly, O&M compilation, RFI-to-record reconciliation - and run a controlled trial this month with the human-approval gate built in from the start. Measure time saved and error rate caught. Both numbers belong in the decision.
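If it helps to see what "both numbers" looks like in practice, a rough sketch of the trial tally follows. The field names and the sample figures are invented for illustration, and "assisted" time is assumed to include the human review and sign-off, not just the model run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrialRun:
    baseline_minutes: float  # time the task took done the existing way
    assisted_minutes: float  # AI-assisted time, including human review
    items_checked: int       # AI outputs the reviewer checked against source
    errors_caught: int       # outputs the reviewer had to correct or reject


def summarise(runs):
    """Aggregate time saved and error rate across all trial runs."""
    baseline = sum(r.baseline_minutes for r in runs)
    assisted = sum(r.assisted_minutes for r in runs)
    items = sum(r.items_checked for r in runs)
    errors = sum(r.errors_caught for r in runs)
    return {
        "time_saved_pct": 100 * (baseline - assisted) / baseline,
        "error_rate_pct": 100 * errors / items,
    }


# Hypothetical figures from two runs of the same workflow.
runs = [
    TrialRun(baseline_minutes=240, assisted_minutes=150, items_checked=120, errors_caught=6),
    TrialRun(baseline_minutes=180, assisted_minutes=120, items_checked=80, errors_caught=2),
]
result = summarise(runs)
print(result)
```

A large time saving with a high error rate is a different decision from a modest saving with almost none, which is why the two are reported together rather than rolled into one score.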
The procurement filter: Trial Fable 5 on one real document-reasoning task - a golden-thread gap-check, a spec reconciliation, a contract clause review - during the free window before 22 June. Measure the answer quality against your current model. If it doesn't beat your cheaper option by enough to justify double the cost, you've learned something useful for nothing.
Sources:
Anthropic - Claude Fable 5 and Claude Mythos 5
CNBC - Anthropic releases Mythos-like AI model to the public, Claude Fable 5
TechCrunch - Anthropic's Claude Fable 5 is a version of Mythos the public can access today
GitHub Changelog - Claude Fable 5 is generally available for GitHub Copilot