Skip to content

competition on ai market: lesson of xai

analysis Hand on a CANCEL key beside a CRT reading "xAI PROMO PERIOD EXPIRED" in an emptied PC lab — devs leave when free ends.


the pace of model releases in 2026 has become genuinely hard to track. a few years ago, a major new language model was a multi-month event. today, labs ship on something close to a weekly cadence. you blink and there's a new grok, a new qwen, a new glm. the leaderboards refresh before most developers finish reading the announcement post.

Table of new model releases Jan–Apr 2026: Qwen 8, OpenAI 7, Z.ai 7, MiniMax 6, Google 5, xAI 3, Anthropic 3, Moonshot 2, DeepSeek 1.

the pattern that jumps out immediately is geographic. chinese labs collectively shipped more new models in four months than american labs combined – and they did it while simultaneously pushing prices down hard.


the old guard is losing ground

a year ago, openrouter's token traffic told a familiar story. google, anthropic, openai, and deepseek together commanded roughly 85% of all tokens routed through the platform. by april 2026, that same group accounts for roughly 50%. the pie grew approximately 4× larger year-over-year, so this is not a story of absolute decline – the major western labs are serving more tokens than ever. they're just serving a much smaller slice of an exploding market.

OpenRouter token share Apr 2025–Apr 2026: Google, Anthropic, DeepSeek, OpenAI combined drop from ~85% to ~50% as "others" expand.

the other 50% belongs to a wave of models that were barely audible a year ago. minimax, kimi, qwen, deepseek v3.2, glm – each found a niche, nailed it, and scaled fast. open-source releases have become almost daily events, and developers on openrouter are ruthlessly pragmatic: they route to whatever gives the best result per dollar, and right now that answer changes week to week.


deepseek and the race to zero

if there is one lab that most clearly defines the competitive logic of this era, it is deepseek. the hangzhou startup has made aggressive pricing its primary weapon. in january 2025, the release of r1 – a reasoning model that matched or exceeded openai's o1 on key benchmarks at 90-95% lower cost – triggered a full-blown price war.

nvidia suffered its largest single-day market cap drop in history as markets absorbed the implication: if inference costs collapse, the economics of the entire ai infrastructure stack come into question.

deepseek has not let up since. in late april 2026, just this week, the lab announced:

  • a 75% promotional discount on its new v4-pro model – already a model that undercuts gpt-5.5, gemini 3.1 pro, and claude opus 4.7 at full rate
  • input cache hit prices permanently cut to one-tenth of previous levels across the entire api
  • after the promo, v4-pro input tokens cost ~$0.036/m. gpt-5.5 charges $0.50/m for cached input. a conversation on gpt-5.5 can be 32× more expensive.

the logic is straightforward: lower the barrier, acquire developers, make switching painful through integration depth. it's a classic land-and-expand playbook applied to inference economics. and it is working – deepseek remains one of the most-used model families on openrouter by absolute volume.

but here's the thing about using price as your primary weapon: you have to be able to sustain it. deepseek arguably can, through its mixture-of-experts architecture and huawei ascend integration. temporary discounts, as we're about to see, are a very different story.


xai: the cautionary tale

OpenRouter token share with xAI overlaid in green: peaks above 40% Sept–Dec 2025, collapses to near zero by April 2026.

to understand why promotional pricing alone can't build durable market share, you don't need to look far. the most instructive example played out on openrouter itself, between august and december 2025.

for most of that year, xai had been a niche player on the platform. then, in a short window, the lab executed a full-court press:

the rise:

  • aug 26 – grok code fast 1 launches at $0.20/m input, $1.50/m output, explicitly targeting agentic coding – the fastest-growing use case on openrouter
  • sep 19 – grok 4 fast ships with a 2m-token context window at $0.20/m input, $0.50/m output, undercutting nearly every comparable western model
  • nov 2025 – free distribution through consumer apps begins, flooding openrouter with non-developer traffic and broadening xai's usage profile dramatically
  • late nov – grok 4.1 fast offered free on openrouter for two weeks, alongside free access to all xai agent tools

the peak:

  • xai reaches 41.1% of total openrouter token share
  • grok code fast 1 leads all reasoning model traffic on the platform
  • in programming – the dominant use case – xai models claim over 46% of the category

the collapse:

  • promotions end. free tiers expire.
  • by april 2026, xai does not appear as a named player in openrouter's top model rankings

the arc is almost textbook. xai achieved genuine, impressive dominance – for a window of weeks. but the mechanism that drove it was promotional availability, not earned preference. the moment the free access ended, developers rerouted. on a platform like openrouter, that takes a single parameter change.


the actual lesson


xai's real failure wasn't the discounts. it was what came after them.

during the promotional window, xai had something genuinely valuable: millions of developers actively using their models, building with them, getting comfortable with the api. that's incredibly hard to manufacture. but when the free tier ended, there was nothing to hold those developers in place. no feature that had become load-bearing in their stack. no next model that shipped fast enough to give them a reason to stay and pay.

look at the release cadence again. in the first four months of 2026, qwen shipped 8 models, openai 7, zhipu 7. xai shipped 3 – two of them betas. while competitors were iterating weekly and giving developers new reasons to deepen their integrations, xai was quiet.

xai cracked acquisition. they completely failed retention.

retention in the developer market isn't about loyalty — it's about switching cost. and switching cost comes from one thing: the model getting meaningfully better, fast enough that rebuilding your stack around it seems worth it. kimi, minimax, qwen kept shipping. every new release was a reason to stay. xai gave developers cheap access and then asked them to start paying, without giving them a compelling reason to do so.

xai had the attention. they just didn't earn the habit.

Stay in the loop

Get the latest AI news delivered to your inbox weekly

Thanks for subscribing!