The story of any gold rush is rarely just about the miners who strike shiny veins. It is also about those who build the roads, sell the tools and lay the tracks.
After all, for companies to get to the AI-powered future, they will likely need infrastructure that takes away the complexity of model orchestration, allowing them to concentrate on their differentiators.
“Every organization in the world is either going to become AI-first or AI-enabled,” Tuhin Srivastava, Co-founder and CEO of Baseten, said during a discussion for the PYMNTS Monday Conversation series hosted by PYMNTS CEO Karen Webster.
“As companies move toward this, you really only have one competitive advantage,” Srivastava added. “That advantage is speed. And when it comes to speed, you can only move fast if you delegate away stuff that is not core to you.”
“It’s like the card rails enabling FinTechs to build their applications on top of the existing infrastructure that’s solid, secure, global, accepted,” Webster noted, drawing an analogy to the rise of FinTech.
The parallel is apt, particularly because the companies that moved fastest in FinTech were often startups unburdened by legacy systems.
“You think about payment rails — there aren’t many of them, but they’ve established durability, value and trust,” Webster said.
This is the contest to build the rails of AI, an infrastructure layer designed to make the process of inference, or running machine learning models in real-world applications, as seamless and dependable as possible.
Shift in Value From Training to Inference
The early narrative of AI revolved around training: who could afford the data centers, who could marshal the compute, who could train the biggest and most sophisticated models. That arms race continues, but training is only one chapter. Once a model is built, it must be used. Inference is where AI meets reality.
For its part, Baseten offers an “Inference Stack” consisting of tools for deployment (Model APIs, Truss packaging, Chains), flexible deployment options (cloud, hybrid, self-hosted), support for open-source models and enterprise-grade concerns (latency, cost, reliability).
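Truss, Baseten’s open-source packaging library, wraps a model behind a simple load/predict interface. The sketch below is an illustrative approximation of that convention, with a toy string-reversal “model” standing in for real weights; the class shape follows Truss’s documented pattern, but details may vary by version.

```python
# Hypothetical sketch of a Truss-style model.py: a Model class
# exposing load() (called once at deploy time) and predict()
# (called per inference request). The string-reversal logic is a
# stand-in for a real model, used only to illustrate the shape.

class Model:
    def __init__(self, **kwargs):
        # Truss passes configuration via kwargs; unused in this toy example.
        self._ready = False

    def load(self):
        # Real deployments load model weights here, once, at startup.
        self._ready = True

    def predict(self, model_input: dict) -> dict:
        # Runs on every inference request after load() has completed.
        if not self._ready:
            raise RuntimeError("load() must run before predict()")
        text = model_input["text"]
        return {"reversed": text[::-1]}


if __name__ == "__main__":
    m = Model()
    m.load()
    print(m.predict({"text": "baseten"}))  # {'reversed': 'netesab'}
```

The separation of `load` from `predict` is what lets an inference platform pay the model-loading cost once and then serve many low-latency requests, which is the economics the article describes.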
“Our biggest customers five years from now don’t exist today,” Srivastava noted. “These companies are being founded today, and they can move fast because they don’t have decades of history weighing them down.”
But what about incumbents? Can large enterprises, with their scale and inertia, keep pace?
Webster pushed the question, noting that consumers now expect businesses to “totally get with the program.”
“They want to move fast. I even think they can move fast from a technology perspective,” said Srivastava. “Where it’s harder is justifying ROI because they’re so big — and they have more to lose with the existing customer set.”
Yet the pressure is undeniable.
“Six years ago, when we used to go talk to enterprises, they would be like, oh, why should we care?” Srivastava added. “It was almost like, convince me I should care. Right now, it’s like, how do I get to the point where I can embrace this?”
Building Trust at Scale
The appeal of being the “rails” lies not only in technical indispensability but also in economics. Inference infrastructure has the potential to generate recurring, usage-based revenue. Every call to an AI model requires inference, and each inference requires compute.
Of course, infrastructure is only as good as the trust it inspires. For enterprises, trust hinges on three things: scale, security and reliability. Enterprises need guarantees: that outputs will be safe, that uptime will be reliable, that latency will be predictable.
“One problem is, how do we trust you can manage our scale?” Srivastava explained. “The great thing is, the most used products today are these really fast-growing companies, and the scale we operate with them actually dwarfs most, if not all, enterprise scale.”
Inference infrastructure, after all, is not just a technical problem; it is a business risk management problem.
“There’s a really delicate balance, which is how do we move very fast without breaking things? It’s a brand-building exercise: how do we show up and say, I know what you care about, I know what’s at risk for you, and here’s how we’re taking care of it,” said Srivastava.
“To me, speed to change is more important than speed to market,” he added, noting it’s why even dominant players like OpenAI and Google face constant pressure.
Still, with AI tools becoming more accessible than ever, how does a company like Baseten defend its position?
“Defensibility comes from workflows and user feedback loops,” Srivastava explained. “If you have something linked to proprietary data, where the way models are used generates unique value that flows back into improving the models, that’s where defensibility is. That’s where the sum becomes greater than the parts.”
Defensibility also comes from expansion. Looking ahead, Baseten has ambitions beyond inference.
“For us, inference is one part of AI infrastructure. Beyond that, there’s training, evaluation, fine-tuning. We really want to own that entire loop,” Srivastava said. “We want to build the next AWS for inference.”
Even the name “Baseten” reflects that ambition. Srivastava said it was inspired by the base-10 counting blocks of his Australian childhood: “It’s how we make sense of the world, and interestingly, the decimal system. For us, we’re helping people make sense of the world with AI.”
The company recently raised $150 million to pursue that vision, capital it has already started putting to use.
Source: https://www.pymnts.com/