State of Frontier AI

The frontier model landscape at a glance - release cadence, the geography of labs and the countries behind them, and how the field splits between open-weight and closed models.

Updated June 29, 2026

Release cadence by lab

Tracked frontier model releases over time, stacked by the lab that shipped them. The set is curated to frontier models, so earlier periods undercount older releases.

Release cadence by country

The same releases, stacked by the country the lab is based in - the US vs China cadence race.

Release cadence by access

The same releases, stacked by open-weight vs closed - how the open/closed mix shifts over time.

The geography of frontier AI

Models per lab, grouped by the country the lab is based in.

Open vs closed

Open-weight (downloadable) vs closed (API-only) models, overall and by country.
Open-weight share by release year
Leading open-weight models
Strongest downloadable models on SWE-bench Verified (autonomous coding).
Monthly active users as disclosed by each company (2026). Definitions vary - some are standalone apps, some are embedded across a company's products - and Google Search still operates at several times the scale of any single AI app.

Run it yourself: best open weights + local hardware

The open-weight frontier is now strong enough to self-host for real work. Below is the current community and benchmark consensus on what to run for production inference and coding agents, and the hardware needed to run it. All models listed here are included in the tracker data above.
Coding agents
  • GLM-5.2 / Kimi K2.7 Code / DeepSeek V4 - top-tier agentic coding (long-horizon tasks, tool-call accuracy). 1T-class MoE; need a server.
  • Qwen3-Coder (32B) / Devstral Small - the practical local picks; strong coding on a single GPU.
General / production inference
  • DeepSeek V4 / Qwen 3.6 / MiniMax M3 - frontier-class quality, MoE cost advantage when self-hosting at scale.
  • Gemma 4 (26B-A4B / 31B) / Mistral Small / Llama - efficient, permissive licensing, easy local deployment.
Hardware tiers (approximate, Q4 quantization)
Single GPU
24-32 GB · RTX 4090/5090

Dense models up to ~32B and small MoE at Q4: Qwen3-Coder 32B, Gemma 4 31B, DeepSeek 32B distill. A capable local coding copilot.

Workstation / Mac Studio
64-256 GB · 2-4x GPU or M-series unified

70B dense and mid-size MoE; 100B+ MoE on a high-memory Mac. Small-team self-hosting via vLLM tensor parallelism.

Server
640 GB+ · 8x H100/H200

The 1T-class MoE flagships (DeepSeek V4, Kimi K2.7, GLM-5.2) at INT4/FP8 for production batch throughput; for example, ~630 GB for a 1T model at INT4.

MoE models must fit all parameters in memory even though only the active set computes per token. Q4_K_M (GGUF) is the quality/size sweet spot; vLLM is the default production server (continuous batching, tensor parallelism). VRAM figures are estimates and vary with context length and quant. Sources: Spheron GPU cheat-sheet, MindStudio.
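
As a rough illustration of the sizing rule above, the sketch below estimates weight memory from total parameter count and bits per weight. The function names and the 1.2x overhead multiplier (standing in for KV cache and runtime buffers) are assumptions made for this example, not figures from the cited sources; real usage varies with context length and quant format.

```python
def weight_memory_gb(total_params_billion: float, bits_per_weight: float) -> float:
    """Raw weight storage in GB (1 GB = 1e9 bytes).

    For MoE models, pass the TOTAL parameter count: every expert must be
    resident in memory even though only the active subset computes per token.
    """
    total_bytes = total_params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


def fits(total_params_billion: float, bits_per_weight: float,
         memory_gb: float, overhead: float = 1.2) -> bool:
    """Check whether a model plausibly fits in a memory budget.

    `overhead` is an assumed multiplier for illustration only.
    """
    needed = weight_memory_gb(total_params_billion, bits_per_weight) * overhead
    return needed <= memory_gb


# A 32B dense model at 4 bits against a 24 GB GPU.
single_gpu_ok = fits(32, 4, 24)

# A 1T-parameter MoE at 4 bits against an 8x80 GB server (640 GB).
server_ok = fits(1000, 4, 640)
```

Under these assumptions a 1T model at 4 bits needs 500 GB for weights alone, so a budget like the ~630 GB quoted above presumably leaves room for runtime overhead.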

The agent landscape

The agent types in production in 2026 and what defines them. Every capable agent shares four traits - autonomy, reasoning, tool use, and memory - and specializes from there.
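
A minimal sketch of how those four traits can fit together in one loop, assuming a scripted stand-in for the model. Everything here (`run_agent`, `scripted_policy`, the `add` tool, the action dictionary shape) is hypothetical and chosen for illustration; production agents replace the policy with a model call and add error handling.

```python
def run_agent(goal, policy, tools, max_steps=5):
    """Loop until the policy says 'finish' or the step budget runs out.

    autonomy  -> the loop decides its own next step
    reasoning -> policy(goal, memory) picks the next action
    tool use  -> tools[name](input) executes it
    memory    -> every step is recorded and fed back to the policy
    """
    memory = []
    for _ in range(max_steps):
        action = policy(goal, memory)
        if action["tool"] == "finish":
            return action["input"], memory
        result = tools[action["tool"]](action["input"])
        memory.append((action["tool"], action["input"], result))
    return None, memory  # step budget exhausted without finishing


def scripted_policy(goal, memory):
    """Stand-in for a model: call 'add' once, then finish with its result."""
    if not memory:
        return {"tool": "add", "input": (2, 3)}
    return {"tool": "finish", "input": memory[-1][2]}


tools = {"add": lambda args: args[0] + args[1]}
answer, trace = run_agent("add 2 and 3", scripted_policy, tools)
```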
Coding agents

Autonomous multi-file edits with run / test / debug loops across a repo. Claude Code, Cursor, Devin, Codex, Copilot.

Deep research

Multi-source web research, synthesis, and cited reports over long horizons. ChatGPT, Gemini, Perplexity deep research.

Computer use / browser

Drive a real browser or desktop - click, type, fill forms, finish web tasks. ChatGPT Agent, Perplexity Computer, computer-use models.

Customer support

Resolve tickets end to end, with context from the systems where data already lives. Sierra, Ada, Intercom Fin, Agentforce.

Data & analytics

Query, join, and reason over structured and unstructured data on demand. Analytics agents, LlamaIndex, warehouse-native agents.

Voice

Handle live phone calls at scale - the fastest-moving frontier, already taking millions of contact-center calls. Genesys, PolyAI, Yellow.ai.

Sales / SDR

Prospect, qualify, personalize outreach, and book meetings against your CRM. 11x, Artisan, Clay, Agentforce SDR.

Marketing & content

Draft, optimize, and ship content and campaigns - increasingly tuned for AI search (GEO). Jasper, Writer, HubSpot agents.

Security / SOC

Triage alerts, investigate threats, and draft remediations across the security stack. OpenAI Daybreak, Anthropic Glasswing, CrowdStrike.

Personal assistant

Manage inbox, calendar, and tasks - the everyday agent embedded across consumer apps. ChatGPT, Gemini, Copilot, Meta AI.

Search & brand visibility

Track and grow a brand's presence across SERPs and AI search, then act on the gaps. DemandSphere Agents.

Workflow automation

Chain steps across apps into reliable back-office pipelines with humans in the loop. n8n, Zapier agents, sub-agent orchestration.

Agent frameworks & harnesses
The tooling layer splits the same way the models do - open-source frameworks you self-host and inspect, versus proprietary products you buy and run as a service.
Open-source
  • Nous Research - open model + agent collective; Hermes models, Forge and Atropos RL/agent tooling.
  • LangGraph - stateful multi-agent orchestration from LangChain; the most-deployed open agent layer.
  • CrewAI - role-based multi-agent "crews" with shared goals and tools.
  • Microsoft AutoGen - conversational multi-agent framework for cooperating agents.
  • OpenAI Agents SDK - lightweight open SDK for handoffs and tool use (successor to Swarm).
  • OpenHands - open autonomous software-engineering agent (formerly OpenDevin).
  • Pi - minimal MIT-licensed coding-agent harness (Earendil); 15+ model providers, plugins and skills.
  • smolagents - Hugging Face's minimal framework for code-writing agents.
Proprietary
  • Claude Code - Anthropic's terminal-native coding agent.
  • Codex - OpenAI's cloud software-engineering agent.
  • Devin - Cognition's autonomous software engineer.
  • Cursor - Anysphere's agentic IDE with background agents.
  • Gemini CLI - Google's agentic terminal client; the client is open-source, while the Gemini models behind it are closed.
  • Jules - Google's asynchronous coding agent.
  • Manus - general-purpose autonomous task agent.

Monetization & commerce

How the labs make money, and how AI is becoming a sales channel. Revenue figures are third-party estimates (annualized run-rate, mid-2026).
Estimated revenue by lab
Anthropic ARR: ~$30B
OpenAI ARR: ~$25B

Anthropic passed OpenAI in run-rate and filed to IPO at roughly $47B annualized; OpenAI targets $30B for the full year. Google's AI revenue is bundled into Cloud / Workspace and cannot be cleanly isolated; xAI starts from a smaller base.

Advertising

OpenAI's ChatGPT ad pilot launched Feb 2026 (~$100M annualized, ~$60 CPM, $200K minimum). Google now shows ads alongside ~25% of AI Overviews plus shopping ads in AI Mode. Perplexity dropped ads entirely. The AI-ad market is estimated at $15-25B, growing 35-50% a year.

Commerce / shopping

Agentic commerce lets agents discover, compare, and buy inside the chat. OpenAI's Agentic Commerce Protocol powers ChatGPT Instant Checkout (Stripe-settled); Google's Universal Commerce Protocol pairs with shopping in AI Mode. Merchants expose product feeds; the agent handles the buy.

Valuations & funding
Anthropic IPO: ~$965B
OpenAI IPO: ~$852B

Both leaders filed to go public in mid-2026 (xAI + SpaceX reportedly combined ~$1.25T). Enterprise is where the margins are - Anthropic wins roughly 70% of head-to-head enterprise deals against OpenAI - while consumer subscriptions and ads drive volume.

Sources: ValueAdd VC, The AI Corner, Opascope (agentic commerce). Figures are estimates.

Agent, skill & connector marketplaces

The ecosystem layer that makes agents useful: where you find whole agents, the reusable skills you teach them, and the connectors that plug them into real systems.
Agent marketplaces
Skill marketplaces
Connector marketplaces (MCP)
  • Anthropic MCP directory - official catalog of Model Context Protocol servers.
  • Smithery - large community registry of MCP servers.
  • Glama - curated MCP server directory and host.
  • Composio - managed tool and connector layer for agents.
MCP (Model Context Protocol) has become the de facto wiring standard - one protocol, many catalogs - so a connector built once works across Claude, ChatGPT, and a growing list of agent runtimes. First-party catalogs (ChatGPT connectors, Gemini extensions) cover the rest.
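
To make "one protocol" concrete: MCP messages are framed as JSON-RPC 2.0, and a tool invocation uses the `tools/call` method. The sketch below only builds the request payload and does not open a connection. The helper names and the `search_tickets` tool with its arguments are hypothetical; check the MCP specification for the full message schema before relying on this shape.

```python
import json
from itertools import count
from typing import Optional

_request_ids = count(1)  # JSON-RPC requests carry a unique id


def mcp_request(method: str, params: Optional[dict] = None) -> str:
    """Serialize a JSON-RPC 2.0 request as used by MCP."""
    message = {"jsonrpc": "2.0", "id": next(_request_ids), "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


def call_tool(name: str, arguments: dict) -> str:
    """Build a tools/call request for a tool exposed by an MCP server."""
    return mcp_request("tools/call", {"name": name, "arguments": arguments})


# Hypothetical tool name and arguments, for illustration only.
payload = call_tool("search_tickets", {"query": "refund", "limit": 5})
```

Because the payload does not depend on which runtime sends it, the same server can answer any client that speaks the protocol.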

Definitions

Key terms used on this page.
Open-weight model

A model whose trained weights are publicly downloadable, so anyone can run, fine-tune, or self-host it (DeepSeek, Qwen, GLM). It does not always mean the training data is open.

Closed model

A model offered only through an API or product, with weights not released (GPT, Gemini, Claude).

Mixture-of-Experts (MoE)

An architecture where only a subset of parameters ("experts") activate per token, so a model can hold trillions of total parameters but compute like a much smaller one.
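
A toy sketch of top-k routing, assuming scalar inputs and experts that are simple functions. `moe_forward` and the example experts are illustrative names, and normalizing gate weights over only the selected experts is one common variant, not the definition used by any particular model on this page.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(x, router_logits, experts, k=2):
    """Run only the k highest-scoring experts and mix their outputs.

    All experts exist in memory, but just k of them compute for this token.
    Returns (output, indices_of_active_experts).
    """
    ranked = sorted(range(len(experts)),
                    key=lambda i: router_logits[i], reverse=True)
    active = ranked[:k]
    gates = softmax([router_logits[i] for i in active])
    output = sum(g * experts[i](x) for g, i in zip(gates, active))
    return output, active


# Four toy experts that scale the input by 1, 2, 3, and 4.
experts = [lambda x, w=w: w * x for w in (1.0, 2.0, 3.0, 4.0)]
output, active = moe_forward(1.0, [0.1, 2.0, -1.0, 2.0], experts, k=2)
```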

Reasoning model

A model that spends extra compute "thinking" (chain-of-thought) before answering, trading latency for accuracy on hard problems.

Context window

The maximum amount of text, in tokens, a model can consider at once - including the prompt and its own output.

Quantization

Compressing a model's weights to fewer bits (e.g. 4-bit) to cut memory use with modest quality loss - key to running large models on limited hardware.
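
A minimal sketch of the idea, assuming symmetric round-to-nearest quantization with a single scale for the whole group. Production formats such as GGUF's Q4_K_M use block-wise scales and mixed precision, so treat this as an illustration of the principle rather than of any real format; the function names are made up for the example.

```python
def quantize_4bit(weights):
    """Map floats to integers in [-7, 7] plus one shared float scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 7 if max_abs > 0 else 1.0
    quantized = [max(-7, min(7, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the 4-bit integers."""
    return [q * scale for q in quantized]


weights = [0.7, -0.3, 0.12, 0.0]
quantized, scale = quantize_4bit(weights)
restored = dequantize(quantized, scale)

# Each value is off by at most half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each weight now needs 4 bits instead of 16 or 32, at the cost of a bounded rounding error per value.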

Agent / agentic

An AI system that plans and takes multi-step actions toward a goal using tools - combining autonomy, reasoning, tool use, and memory.

MCP (Model Context Protocol)

An open standard for connecting AI models to external tools, data, and services - the common interface for agent integrations.

Agentic commerce

Letting AI agents discover, compare, and buy products on a user's behalf inside the chat (e.g. ChatGPT Instant Checkout).

AEO / GEO

Answer Engine Optimization / Generative Engine Optimization: making content easy for AI search engines to surface and cite.

FAQ

The tracker covers more than 80 frontier models from 17 labs across the United States, China, France, and Canada, and is updated as new models ship.

The United States leads by tracked model count, with China a clear second; together they account for the large majority of frontier models. France (Mistral) and Canada (Cohere) round out the current set.

Close to half of tracked models are open-weight. Open-weight (downloadable) and closed (API-only) models are split roughly evenly; Chinese labs skew open-weight while US labs skew closed.

By disclosed monthly users, ChatGPT and Meta AI lead at around a billion each, with Google Gemini close behind - though Google Search still operates at several times that scale, and Google's AI Overviews alone reach billions of people monthly.

For top-tier agentic coding, GLM-5.2, Kimi K2.7 Code, and DeepSeek V4 lead but need server-class GPUs. For a single 24-32 GB GPU, Qwen3-Coder 32B and Devstral Small are the practical local picks.

Lab revenue is dominated by subscriptions and API usage, with Anthropic and OpenAI each at multi-billion-dollar annualized run-rates. Advertising (ChatGPT and Google) and agentic commerce with in-chat checkout are fast-growing newer channels.

Terms of citation

This page and its underlying data are free to cite and share under CC BY-NC 4.0, with attribution. Commercial use requires written permission from DemandSphere.
  • Attribution: credit "DemandSphere - State of Frontier AI" and link back to this page.
  • License: CC BY-NC 4.0 (attribution, non-commercial). No commercial use without permission.
  • Underlying data: model figures come from the AI Frontier Model Tracker, also available via the free JSON API and MCP server. Usage and traffic figures are third-party or company-disclosed estimates, cited inline.
How to cite
DemandSphere. "State of Frontier AI." DemandSphere Radar, 2026, www.demandsphere.com/research/demandsphere-radar/state-of-frontier-ai/. Accessed 29 Jun 2026.

Commercial use or questions: contact DemandSphere.
