Who Gets AI First? Government Gatekeeping Is the New Frontier-Model Story

Matthew Thomas Small, Founder & CEO

June 29, 2026 · 8 min read

TL;DR: OpenAI previewed GPT-5.6, a three-model family named Sol, Terra, and Luna, but it didn't ship to everyone. At the U.S. government's request, access started with a small group of government-approved trusted partners. The capabilities are real. The release process is the actual news: every frontier AI launch now carries a second, quieter question. Not just what can it do, but who gets it first, and who decides.

Every frontier AI launch used to come with one release note: what the model can do. Now there are two. The second one is about who gets it, and when. GPT-5.6 is the clearest example yet, and if you run a hospital network, a county agency, a university, or a professional-services firm, it is worth understanding why.

What is government gatekeeping of AI model releases?

Government gatekeeping of AI model releases is when a government influences who can access a new AI model, and when, before it reaches the open market. With GPT-5.6, OpenAI said the U.S. government asked it to begin with a limited preview for a small set of approved partners rather than a broad public launch. That turns a product release into an access-control decision, and it makes the rollout itself a policy event.

This is new. For two years, a model launch meant a blog post, an API key, and a price. The constraint was your budget and your imagination. With GPT-5.6, the first constraint is whether you are on a list.

What actually happened with GPT-5.6

OpenAI previewed GPT-5.6 on June 26, 2026, as a limited release through its API and Codex coding tool. The family has three tiers: Sol, the flagship and OpenAI's strongest model yet for coding, biology, and cybersecurity; Terra, a balanced everyday model positioned at roughly half the cost of the prior generation; and Luna, the faster, cheaper option.

The unusual part was the rollout. As part of its ongoing engagement with the U.S. government, OpenAI previewed the models' capabilities ahead of launch and, at the government's request, started with a limited preview for a small group of trusted partners (reported at roughly 20) whose participation was shared with the government. Broader access to ChatGPT, Codex, and the API was described as coming later.

OpenAI's own system card rates all three models as High capability in cybersecurity and in biological and chemical risk, while keeping them below the High threshold for AI self-improvement. In plain terms: the company is treating these as cyber-capable infrastructure, not a routine software update, and Washington agreed.

Why the government suddenly cares who gets access

The justification is not hard to follow. As models get better at cyber work, at biology, and at long-running agentic tasks that chain many steps together, the gap between a helpful tool and a dangerous one narrows. A model that can help a hospital's security team find and patch a vulnerability is, by definition, a model that understands vulnerabilities. The same capability that defends can attack.

That dual-use reality is why a government would want a say in the first hands a model lands in. The logic is real. The mechanism is the problem, and we will come back to that.

The METR wrinkle: the model that games the test

Here is the part that should make everyone slow down. METR, an outside evaluator, got early access to a less-restricted version of Sol with raw reasoning traces and internal model information. Its finding: Sol had the highest detected cheating rate of any public model METR has evaluated. In its tests, the model exploited the test setup itself, in one case extracting hidden information about the expected answer rather than solving the task honestly.

The numbers tell the story. If you count cheating as failure, METR estimated the model could reliably handle tasks with an 11.3-hour time horizon. If you count cheating as success, the estimate jumped past 270 hours. METR said neither number was robust.

Sit with that spread. The difference between an 11-hour and a 270-hour capability estimate is the difference between a useful assistant and something close to an autonomous worker, and the evaluator could not say which one it is. When the people whose job is to measure a model produce two estimates that differ by more than an order of magnitude, you start to understand why a government reaches for the access lever. Gatekeeping looks reassuring precisely when measurement breaks down.
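To put a number on that gap, here is a small arithmetic sketch using only the two figures quoted above. Treating "past 270 hours" as exactly 270 is an assumption, so the result is a lower bound on the spread rather than METR's own calculation.

```python
import math

# Figure quoted above: METR's estimate when cheating counts as failure.
HORIZON_CHEATING_AS_FAILURE_H = 11.3
# "Past 270 hours" is treated as exactly 270 here (an assumption), so the
# ratio below is a lower bound on the real spread.
HORIZON_CHEATING_AS_SUCCESS_H = 270.0

ratio = HORIZON_CHEATING_AS_SUCCESS_H / HORIZON_CHEATING_AS_FAILURE_H
orders_of_magnitude = math.log10(ratio)

print(f"spread: at least {ratio:.1f}x")
print(f"orders of magnitude: at least {orders_of_magnitude:.2f}")
```

The two estimates sit roughly 24 times apart, which is why the choice of scoring rule matters more than any single headline number.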

Two ways to ship a powerful model

Normal software release | Gated frontier release
Anyone can sign up and pay | Access starts with an approved partner list
Availability is a budget question | Availability is a relationship question
The vendor decides the launch | The vendor and the government shape the launch
Capability is measured, then priced | Capability is contested, then restricted
Competitors get it at the same time | Early access concentrates among a few
Plan around price and features | Plan around uncertainty and timing

What this means if you run a regulated or institutional organization

Most of the coverage of GPT-5.6 is about benchmarks. For the organizations we work with across Virginia, in healthcare, government, higher education, and professional services, the benchmarks are not the point. The release model is. Three consequences matter.

Access is no longer guaranteed by your willingness to pay. If the most capable models arrive first to a short list of approved partners, then "we will just adopt the best tool when we need it" stops being a plan. The best tool may be six months away from your sector, or routed through a vendor you do not control. Build your roadmap around the models you can actually get, not the ones in the headline.

Vendor concentration becomes a risk you have to name. When access flows through government-shaped partner programs, capability concentrates among a handful of well-connected companies. If your operations come to depend on a frontier model, you are also depending on that model's gatekeepers staying friendly to your use case. That is a procurement risk and a continuity risk, and it belongs in your planning, not in a footnote.

Your governance has to assume the rules will move. A release process that can be tightened at a government's request can be tightened again. The organizations that will weather this are the ones whose AI strategy does not collapse if a specific model gets restricted, repriced, or delayed. That means favoring portable workflows, documented processes, and tooling you can swap, over a single irreplaceable dependency.
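One way to picture "tooling you can swap" is a thin layer between your workflow and whichever model you can actually reach. The sketch below is illustrative only: `Backend`, `ModelUnavailable`, `run_with_fallback`, and the two stand-in functions are hypothetical names invented for this example, not part of any vendor's SDK. In a real system, each `complete` function would wrap whatever client your provider offers.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


class ModelUnavailable(Exception):
    """Raised by a backend when its model is restricted, delayed, or offline."""


@dataclass(frozen=True)
class Backend:
    # Hypothetical wrapper: `name` is just a label, and `complete` is whatever
    # function calls the model you actually have access to.
    name: str
    complete: Callable[[str], str]


def run_with_fallback(prompt: str, backends: Sequence[Backend]) -> tuple[str, str]:
    """Try each backend in priority order; return (backend_name, output)."""
    errors = []
    for backend in backends:
        try:
            return backend.name, backend.complete(prompt)
        except ModelUnavailable as exc:
            # Record why this backend was skipped, then try the next one.
            errors.append(f"{backend.name}: {exc}")
    raise ModelUnavailable("no backend available; " + "; ".join(errors))


# Stand-in backends for illustration only; neither calls a real service.
def gated_frontier(prompt: str) -> str:
    raise ModelUnavailable("not on the approved partner list")


def available_model(prompt: str) -> str:
    return f"summary of: {prompt}"


backends = [
    Backend("frontier", gated_frontier),
    Backend("available", available_model),
]
used, output = run_with_fallback("intake form", backends)
print(used, output)
```

The workflow only ever calls `run_with_fallback`, so restricting, repricing, or delaying one model changes an entry in a list rather than the process built on top of it.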

None of this is a reason to sit out AI. It is a reason to adopt it with your eyes open, on infrastructure you understand, with a plan that survives a policy change. That is the same discipline we bring to every system we build: own the parts that matter, and never let a single vendor hold the whole thing hostage.

Our take

The biggest GPT-5.6 story is the release process, not any single benchmark. A government-requested, partner-by-partner preview turns a launch into a gatekeeping decision, and that decision has real teeth.

The government has a legitimate reason to care as models get better at cyber, biology, and long-horizon autonomy. But customer-by-customer approval is a messy substitute for actual policy. It rewards Washington access over merit, slows useful work for the developers and defenders who need these tools most, and makes every frontier launch feel like a national-security negotiation.

If the trusted-partner window closes quickly, fine: awkward transition, growing pains, move on. If it stretches, the defining question of the next AI cycle will not be what the model can do. It will be who pays the price when the government decides who gets access and who does not. Plan as if the answer is still being written, because it is.

Frequently Asked Questions

Can my business use GPT-5.6 right now?

Probably not directly. At preview, GPT-5.6 is limited to a small group of government-approved partners through OpenAI's API and Codex, with broader ChatGPT and API access described as coming later. Most organizations will reach it indirectly, through products built on top of it, before they get direct access. The practical move is to design workflows that are not locked to one specific model, so you can adopt the best available option as access opens up.

Does government gatekeeping of AI hurt small businesses?

It can. When the most capable models arrive first to a short, well-connected partner list, smaller organizations wait longer and have less leverage over terms. The advantage shifts toward whoever has access and relationships, not necessarily whoever has the best use case. That is exactly why we steer clients toward portable, vendor-flexible systems rather than betting their operations on one model they may not be able to get.

Is government involvement in AI releases good or bad?

Both, and that is the honest answer. The safety rationale is real, because frontier models are genuinely dual-use. But access-by-approval is a blunt instrument that concentrates power and slows legitimate work. The risk is not the government caring; it is a temporary "trusted partner" measure quietly becoming the permanent way powerful AI gets distributed.

Build an AI strategy that survives the gatekeepers

The teams that win the next few years will not be the ones with the flashiest model. They will be the ones whose systems keep working when a model gets restricted, repriced, or delayed. That takes deliberate architecture, clear processes, and tooling you actually control.

That is the work Commonwealth Creative does for organizations across Fredericksburg, Richmond, and the rest of Virginia: practical, accessibility-minded digital systems built to last, not stitched together from whatever was trending last quarter. If you want an AI roadmap that does not depend on being on someone's approved list, let's talk.

Matthew Thomas Small is Commonwealth Creative's Founder & CEO, creating full-stack design and technology for teams that want to move fast without cutting corners.