Construction AI Brief
A genuinely quiet week, so one fresh release and the harder question underneath it. On 26 June OpenAI previewed GPT-5.6 Sol, Terra and Luna, its new general-purpose frontier family, with three published price tiers but access locked to about twenty partners at a government request OpenAI says it doesn't like. The deeper point for construction sits a layer down: even when these models reach you, the BIM and CDE platforms you'd point them at still can't safely delegate a decision to them, and the standard meant to govern that is silent on agents.
On 26 June 2026 OpenAI previewed GPT-5.6, its new general-purpose frontier family, in three tiers: Sol as the flagship, Terra as the balanced everyday model, and Luna as the cheap, fast option, with a compute-intensive Sol Ultra mode sitting above Sol. The capability story is real but not the interesting bit. The interesting bit is that OpenAI published the prices. Sol runs at US$5 input and US$30 output per million tokens, Terra at US$2.50 and US$15, and Luna at US$1 and US$6 (OpenAI's own figures). For the first time you can read the per-token cost of the new frontier off a page.
So why should a contractor care about a token price? Because it's the number that decides whether an agentic workflow is affordable on an actual job. A copilot that answers a one-line question is cheap on any tier. An agent that loops over a 400-page operation and maintenance pack, reading, checking and cross-referencing, burns output tokens at a rate that turns Sol's thirty-dollar rate into real money fast, while Luna's six does the same job for a fifth of the cost if it's good enough. When you price that work in Intelligence Units, the tier you choose is the line that moves the bill, not the brand on the box. I'd not over-read OpenAI's benchmark either. The TerminalBench 2.1 scores of 88.8% for Sol and 91.9% for Sol Ultra are the vendor's own, and they measure agentic coding, not quantity surveying. The direction is right, but the number needs a real job to test it.
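To put rough numbers on that, here is a minimal sketch of the arithmetic. The per-million-token prices are the ones quoted above; the workload (2 million input tokens and 500,000 output tokens for one pass over a large document pack) is purely an assumption for illustration, as is the helper name `run_cost`. Swap in token counts measured on your own job.

```python
# Prices in US$ per million tokens, as quoted in this brief (OpenAI's own figures).
PRICES = {
    "Sol": {"input": 5.00, "output": 30.00},
    "Terra": {"input": 2.50, "output": 15.00},
    "Luna": {"input": 1.00, "output": 6.00},
}


def run_cost(tier: str, input_tokens: int, output_tokens: int) -> float:
    """Return the US$ cost of one agent run on the given tier."""
    price = PRICES[tier]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload: an assumption for illustration, not a measured figure.
INPUT_TOKENS = 2_000_000
OUTPUT_TOKENS = 500_000

for tier in PRICES:
    cost = run_cost(tier, INPUT_TOKENS, OUTPUT_TOKENS)
    print(f"{tier}: US${cost:.2f} per run")
```

On those assumed volumes the run costs US$25.00 on Sol, US$12.50 on Terra and US$5.00 on Luna. Because both the input and output prices differ by the same factor, the Sol-to-Luna ratio stays at five whatever the token mix; only the absolute bill changes with volume.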
There's a catch you can't ignore. You can't buy it yet. During the preview the models reach only about twenty trusted partner organisations, through the API and Codex, and not in ChatGPT at all. OpenAI says it imposed that restriction at the US government's request, following a 2 June executive order. It went further than most would, saying publicly that it believes in broad access and that this kind of gating shouldn't become the norm. It's worth noting that this happened the same week Anthropic's strongest model was being rationed to named critical-infrastructure operators, a thread we covered on 29 June. Two of the biggest labs, the same gate, within days. The pattern is now the story.
The procurement filter: When you cost any AI copilot for a project, ask which model tier it runs on and what that does per Intelligence Unit at the volume you'd actually use, not the demo volume. A five-fold price gap between Sol and Luna is the difference between an agent that pays for itself and one that doesn't.
Picture the gate lifting and GPT-5.6 arriving on your desk tomorrow. Where would you point it? At your BIM model and your CDE, the places your project actually lives. And this is where the release runs into a wall the trade press has been circling for months. The clearest write-up is Martyn Day's piece in AEC Magazine on 28 April, which reads a Google DeepMind paper on AI delegation and concludes, fairly bluntly, that today's BIM platforms are architecturally incompatible with safe agentic delegation. Not incomplete. Incompatible.
The distinction he draws is the one that matters on a high-risk building. There's a difference between an AI that assists, suggesting a layout or drafting a schedule that you then validate and own, and an AI you delegate a decision to, where the agent itself takes responsibility for satisfying the fire egress rules, the structural limits and the accessibility code all at once, and proves it did. Current tools do the first. They can tell you whether a clash was detected. They can't tell you whether the agent honoured the energy target, or how it reached its answer, or sign that answer with something auditable later. The reasoning is a black box, and in structural safety or fire compliance a black box is exactly what you can't have. The comparison only goes so far, but we don't accept opaque reasoning in bridges or aircraft, and a higher-risk building sits in the same bracket.
What makes this a now problem rather than a someday one is that the vendors are already shipping. Autodesk Assistant, Bentley Copilot and Trimble Agent Studio are all agent-capable platforms out in the market. Meanwhile the draft revision of ISO 19650, the standard that governs how building information is managed and whose Part 3 is open for comment right now, says nothing at all about agents, autonomous workflows or delegated authority. It still assumes a human produces the information and a human is accountable for it. The market is moving a full revision cycle faster than the standard. That gap isn't academic. It's precisely where the liability lands when something goes wrong and nobody can say which layer made the decision.
For your board pack: Before you let any agent-capable platform near a live UK job, ask the vendor two plain questions: can you show me how the agent reached its answer, and whose name is accountable for it under the RICS standard? If the honest answer to the first is no, you've found the limit of what you can safely delegate today.
Tie the two together and you get the honest read on where most UK firms actually are. The model frontier is sprinting, gated and priced. The platform underneath it can't yet be trusted with a delegated decision. And the workforce sits behind both: RICS survey work still puts roughly 45% of construction organisations at no AI use at all, with skills shortages, poor data quality and integration problems named as the brakes. So the gap between what the technology can do and what the average firm can use it for is, if anything, widening this quarter, not closing.
That's not a counsel of despair. It's a steer on where to spend the summer. You're not falling behind because you can't get GPT-5.6: you can't, and nor can almost anyone. You fall behind by leaving your project data and your sign-off chain in a state where no agent, this year's or next year's, could be trusted with them. What you need is clean information, a clear accountable owner for every output, and an honest map of where an agent's reasoning would have to be visible before you'd rely on it. That's unglamorous work and it's the work. Get it right and you're ready the day the gate lifts. Skip it and the cleverest model in the world has nowhere safe to stand.
A practical step: Pick one workflow this month, the O&M handover or the Gateway 2 documentation, and ask of it: if an agent did this, could I prove what it did and name who's responsible? Fixing the gaps that question exposes is worth more than any model you can't buy yet.