Hello China Tech

Chinese AI Models Are Winning the Cost Layer

Chinese AI models are taking share on OpenRouter because the inference market is pricing cost faster than Washington is pricing risk.

Poe Zhao
Jun 01, 2026

According to OpenRouter usage data cited by Bloomberg in May 2026, 7 of the 10 most-used AI models on the platform were built by Chinese companies. DeepSeek, Tencent, and Moonshot AI all appeared near the top of the rankings. Anthropic’s Claude Opus 4.7 was the highest-ranked American model in that snapshot.

OpenRouter, a platform that routes developer API calls across more than 400 AI models, has become a clearinghouse for the global inference market. It now processes 25 trillion tokens per week. Six months earlier, the figure was 5 trillion. On May 26, OpenRouter announced a $113 million Series B led by CapitalG, Alphabet’s investment arm, with NVentures, Nvidia’s venture arm, among the participants. TechCrunch, citing The New York Times, reported a roughly $1.3 billion post-money valuation.

The investor list is telling. The parent company of Google and the maker of the world’s dominant AI chips are both backing a platform whose core value proposition is helping developers find cheaper and more suitable alternatives across providers. That bet implies both expect the multi-model, cost-driven inference market to grow faster than any single provider’s share of it.

The volume shift carries its own weight. At 5 trillion tokens a week, OpenRouter was a niche tool for cost-conscious developers. At 25 trillion, it is infrastructure. OpenRouter CEO Alex Atallah has framed inference at scale as a multi-model routing problem. CapitalG, its lead investor, put the infrastructure analogy more directly: OpenRouter is trying to fill for AI inference the kind of infrastructure gap that Stripe filled for digital payments.

On a platform processing that volume, the models receiving the most traffic carry weight well beyond experimentation.
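The jump from 5 trillion to 25 trillion weekly tokens can be restated as an implied growth rate. The sketch below is a back-of-the-envelope calculation, not OpenRouter data: it assumes growth was smooth and that "six months" means six equal monthly periods, neither of which the reported figures confirm. The function name `compound_growth_rate` is illustrative.

```python
def compound_growth_rate(start: float, end: float, periods: int) -> float:
    """Constant per-period growth rate that turns `start` into `end`."""
    if start <= 0 or end <= 0 or periods <= 0:
        raise ValueError("start, end, and periods must be positive")
    return (end / start) ** (1 / periods) - 1


# Weekly token volumes as reported in the article.
START_TOKENS_PER_WEEK = 5e12
END_TOKENS_PER_WEEK = 25e12
# Assumption: six equal monthly periods with smooth compounding.
MONTHS = 6

fold_increase = END_TOKENS_PER_WEEK / START_TOKENS_PER_WEEK
monthly_rate = compound_growth_rate(START_TOKENS_PER_WEEK, END_TOKENS_PER_WEEK, MONTHS)

print(f"{fold_increase:.0f}x overall, {monthly_rate:.1%} per month if growth were smooth")
```

Under those assumptions the implied rate is roughly 31 percent a month. Actual growth may have been lumpier.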

Washington has reached a different conclusion about what the data means.

On April 29, the House Select Committee on China and the House Committee on Homeland Security sent letters to two American companies: Airbnb and Anysphere, the maker of the AI coding tool Cursor. Both letters cited serious concerns about the national security and data-security implications of using AI models developed by Chinese firms.

The Airbnb inquiry was prompted in part by comments from CEO Brian Chesky, who said in an interview last October that the company preferred Alibaba’s Qwen, an open-weight large language model often described as open source, in certain situations because it was “fast and cheap.” In Anysphere’s case, the probe specifically targeted Composer 2, a model the company later disclosed was built on Kimi, developed by Beijing-based Moonshot AI.

The scope is now clear. The concern extends well beyond one company’s customer-service chatbot.

Airbnb’s customer-support AI bot, expanded over the past year, now handles about 40 percent of issues without escalating them to a human agent, up from about one-third earlier this year. In a series of Bloomberg interviews in May, Chesky pushed back on the probe. Airbnb uses “a variety of open-source models, including US open-source models,” he said. “We are not providing data to any Chinese companies. They don’t have access to any data.” The technical point, he added, is straightforward: “An open-source model does not have access to data. It doesn’t work that way. I think people need to understand how this stuff works.”

At Google I/O, Alphabet CEO Sundar Pichai shifted the frame from model origin to frontier capability. “I worry less about, ‘Are we adopting open-source models from China?’” he said, “and more, ‘Are we doing enough in the US to make sure we are staying at the frontier?’”

Two prominent technology CEOs are pushing back on the Congressional framing from different positions. The one deploying Chinese models says the security concern misreads how the technology works. The one competing against them says the strategic priority should be American frontier capability.

Their responses converge on a shared observation: the investigation risks treating model origin as a proxy for data transfer. In many open-weight deployments, that proxy does not match the technical architecture.

There is an important distinction between the cases. OpenRouter captures routed inference traffic, where prompts move through an API gateway and model providers. Airbnb’s defense rests on a different architecture: open-weight models deployed inside its own controlled infrastructure. The market signal is similar. Companies are choosing cheaper, capable, and more flexible models. The data-flow risk is not the same.
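The difference between the two architectures can be sketched as a question of who receives the prompt. The model below is deliberately simplified and illustrative: the class and party names are hypothetical, and it ignores real-world factors such as cloud hosting providers, logging policies, and contractual terms.

```python
from dataclasses import dataclass

DEPLOYER = "deploying company"


@dataclass(frozen=True)
class Deployment:
    name: str
    # Parties that receive prompt contents, in the order the data flows.
    prompt_recipients: tuple


# Routed inference: prompts pass through a gateway to a model provider.
ROUTED = Deployment(
    name="routed API inference",
    prompt_recipients=(DEPLOYER, "API gateway", "model provider"),
)

# Self-hosted open weights: the model runs inside the deployer's own infrastructure.
SELF_HOSTED = Deployment(
    name="self-hosted open-weight model",
    prompt_recipients=(DEPLOYER,),
)


def external_recipients(deployment: Deployment) -> tuple:
    """Parties other than the deploying company that see prompt contents."""
    return tuple(p for p in deployment.prompt_recipients if p != DEPLOYER)


for d in (ROUTED, SELF_HOSTED):
    print(f"{d.name}: external recipients = {external_recipients(d)}")
```

In this toy model the self-hosted case has no external recipients, which is the core of Chesky's argument. Whether a given production deployment actually matches it is a separate, empirical question.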

The competitive dynamic Congress is responding to, meanwhile, keeps accelerating.

In May, DeepSeek announced that a 75 percent discount on its flagship V4-Pro model would become permanent after the end of the month. Cached input pricing starts at RMB 0.025 per million tokens. Uncached input costs RMB 3. Output runs RMB 6. Even against Claude Sonnet 4.6’s discounted prompt-caching economics, DeepSeek’s cached-input price is still around 1 percent of Anthropic’s comparable input cost.

Making a promotional discount permanent does not prove durable profitability. It does show that DeepSeek is willing to set the market around that price. Whether the rate reflects durable unit economics, temporary subsidy, or a capacity strategy remains opaque. The pricing floor is now visible.
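To see what those rates mean for a workload, here is a minimal cost sketch using the three DeepSeek V4-Pro prices quoted above. It assumes flat per-million-token rates with no tiers or minimums (the cached rate is described as a starting price, so real bills may differ), and the token counts and cache hit rate in the example are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Prices in RMB per million tokens."""
    cached_input: float
    uncached_input: float
    output: float


# Rates quoted in the article; treated here as flat rates (an assumption).
DEEPSEEK_V4_PRO = Pricing(cached_input=0.025, uncached_input=3.0, output=6.0)


def request_cost(pricing: Pricing, input_tokens: int, output_tokens: int,
                 cache_hit_rate: float) -> float:
    """Cost in RMB when `cache_hit_rate` of the input tokens hit the cache."""
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    cached = input_tokens * cache_hit_rate
    uncached = input_tokens - cached
    total = (cached * pricing.cached_input
             + uncached * pricing.uncached_input
             + output_tokens * pricing.output)
    return total / 1_000_000


# Hypothetical workload: 1M input tokens, 200k output tokens.
for hit_rate in (0.0, 0.8):
    cost = request_cost(DEEPSEEK_V4_PRO, 1_000_000, 200_000, hit_rate)
    print(f"cache hit rate {hit_rate:.0%}: RMB {cost:.2f}")
```

With the quoted rates, the cached price is less than 1 percent of the uncached one, so the cache hit rate drives most of the input bill.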

The pressure on the other side of the equation is equally specific. Uber president and chief operating officer Andrew Macdonald recently acknowledged that the company had reportedly exhausted its annual AI budget just four months into 2026, as Claude Code usage and token consumption rose faster than the company could tie them to shipped consumer features.

As enterprises deploy agent workflows that chain dozens of inference calls per task, the cumulative cost of premium-priced tokens scales from a line item into a constraint on how aggressively a company can deploy AI at all. When budgets evaporate that fast, model origin becomes harder to prioritize. Cost becomes the first screen.
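A rough sketch of why chained calls compound: if each step in an agent workflow re-sends the accumulated context, input tokens grow with every call. Everything here is illustrative. The function name, the token counts, and the assumption that context grows by a fixed amount per call are hypothetical; the example prices reuse the uncached-input and output rates quoted earlier (RMB 3 and RMB 6 per million tokens) and ignore caching.

```python
def chain_cost(calls: int, base_context: int, tokens_added_per_call: int,
               output_tokens_per_call: int, input_price: float,
               output_price: float) -> float:
    """Total cost of a chain of calls, with prices per million tokens.

    Assumes call i (starting at 0) sends the base context plus everything
    added by the previous i calls, so input size grows linearly per call.
    """
    if calls < 0:
        raise ValueError("calls must be non-negative")
    total_input = sum(base_context + i * tokens_added_per_call for i in range(calls))
    total_output = calls * output_tokens_per_call
    return (total_input * input_price + total_output * output_price) / 1_000_000


# Hypothetical task: 2,000-token starting context, 1,500 tokens added per step,
# 500 output tokens per step, priced at RMB 3 (input) and RMB 6 (output).
single = chain_cost(1, 2_000, 1_500, 500, 3.0, 6.0)
chained = chain_cost(40, 2_000, 1_500, 500, 3.0, 6.0)
print(f"1 call: RMB {single:.4f}; 40 chained calls: RMB {chained:.4f}")
print(f"ratio: {chained / single:.0f}x")
```

Under these assumptions a 40-call chain costs far more than 40 times a single call, because total input tokens grow roughly with the square of the chain length. The same shape applies at any price level; a higher per-token price simply scales the whole curve.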

Three months ago, we documented the moment Chinese models first surpassed American ones in weekly token consumption on OpenRouter. The pattern was emerging. The data now shows something more consolidated: dominant market share, 5-fold volume growth, and permanent pricing set far below American competitors.

The question Congress is investigating, whether American companies should be using Chinese AI models, may already trail the question the market is answering every day.

The open-weight distinction Chesky drew has specific consequences for what policy can and cannot do. Behind it lies a cost structure that permanent pricing has now made visible, and a set of enforcement constraints that neither hearing nor letter can easily resolve.
