Companies turn to model routers to stop AI bills spiralling

by TechDefused Newsroom
Image: a miniature figurine of a person reading a book while sitting on top of oversized coins, a symbolic take on finance and investment. Credit: Photo by Mathieu Stern on Unsplash

Routers are having a moment. They let AI agents send each request to the model best suited to it, and they are becoming a favoured way to trim runaway AI bills.

Not every task needs the biggest model

The logic is simple. Using a frontier model for a trivial query is like driving a Lamborghini to fetch the groceries. A router reads the task and picks accordingly, saving the powerful model for work that earns it. That matters most in coding, where tools such as Claude Code have driven bills higher. Splitting a job across several models is tricky, but the payoff is real.
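To make the idea concrete, here is a minimal sketch of what a router does, not a description of any vendor's product. The model names, prices, difficulty tiers and keyword hints are all invented for illustration. A real router would likely use a trained classifier or a small model to judge the task, rather than keyword matching.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str                      # hypothetical model name
    max_difficulty: int            # hardest tier this model is trusted with
    usd_per_million_tokens: float  # made-up price, for illustration only


# Hypothetical catalogue, ordered cheapest first.
MODELS = [
    Model("small-model", 1, 0.50),
    Model("mid-model", 2, 3.00),
    Model("frontier-model", 3, 15.00),
]

# Crude keyword hints standing in for a real task classifier (assumption).
HARD_HINTS = ("refactor", "architecture", "prove", "debug")
MEDIUM_HINTS = ("summarise", "write a function", "explain")


def estimate_difficulty(task: str) -> int:
    """Return a difficulty tier from 1 (trivial) to 3 (hard)."""
    text = task.lower()
    if any(hint in text for hint in HARD_HINTS):
        return 3
    if any(hint in text for hint in MEDIUM_HINTS) or len(text.split()) > 50:
        return 2
    return 1


def route(task: str) -> Model:
    """Pick the cheapest model whose tier covers the task."""
    need = estimate_difficulty(task)
    for model in MODELS:  # cheapest first, so the first match is the cheapest fit
        if model.max_difficulty >= need:
            return model
    return MODELS[-1]  # fall back to the strongest model


if __name__ == "__main__":
    for task in ("What is the capital of France?",
                 "Explain this error message",
                 "Refactor the billing module"):
        print(f"{task!r} -> {route(task).name}")
```

Ordering the catalogue by price means the trivial question never reaches the expensive model, which is the grocery-run point in miniature.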

Palantir and Databricks move in

Startups offer routers, and larger players are piling in. Palantir's Evolve tool includes a router that edits and optimises prompts for different models, and stops runaway costs by not firing the same prompt at several models at once. It cites a 97% cost cut for one customer that moved from a higher-tier OpenAI model to a cheaper one. Another client, McCarthy Building, used 60% fewer tokens last quarter than a year earlier. Databricks and Snowflake are building similar tools, the latter inside its Cocoa coding product.

Sometimes more tokens, not fewer

The goal is not always fewer tokens. Palantir notes it can be cheaper to use more tokens on a weaker model, or fewer on a stronger one. The point is fit, matching each task to the model that does it at least cost. Firms with access to enterprise data argue they are best placed to route well, though leaner startups can still innovate.
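The arithmetic behind that trade-off can be sketched in a few lines. The token counts and per-token prices below are hypothetical, chosen only to show that the cheaper option depends on both numbers together. The sketch also assumes every option listed can actually complete the task.

```python
def cost_usd(tokens: int, usd_per_million_tokens: float) -> float:
    """Cost of a single request: tokens used times the per-token price."""
    return tokens * usd_per_million_tokens / 1_000_000


def cheapest(options: dict[str, tuple[int, float]]) -> str:
    """Return the option with the lowest cost.

    `options` maps a label to (tokens needed, price per million tokens).
    """
    return min(options, key=lambda name: cost_usd(*options[name]))


if __name__ == "__main__":
    # Hypothetical case 1: the weaker model needs 4x the tokens but is 30x cheaper.
    print(cheapest({"weak": (4_000, 0.50), "strong": (1_000, 15.00)}))

    # Hypothetical case 2: the weaker model flounders and burns 60x the tokens.
    print(cheapest({"weak": (60_000, 0.50), "strong": (1_000, 15.00)}))
```

In the first case the weaker model wins despite using more tokens. In the second the stronger model wins despite its higher price. Neither token count nor price alone settles the question.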

From token maxing to spending smart

The shift follows a year in which companies urged staff to use AI freely and worried about cost later. Executives at Uber, ServiceNow and Snowflake have since pushed for discipline. One complication stands out. ServiceNow's technology chief noted how hard it is to take a tool away once staff have it. So the aim is to keep people using AI while cutting the bill, rather than pulling access.

Routers sit within a broader idea of harnesses that make AI cheaper and more effective. Firms are starting to count the full cost, including the energy and water that data centres consume. Larger models can be slower as well as dearer. Those combined costs, more than anything, are why routers look set to spread.
