AMD used its Advancing AI 2025 event to highlight updates to its data centre GPU line, including the launch of the MI350 series and a preview of what’s coming next with the MI400 series and the Helios rack platform.
The company said its core approach to AI in the data centre hasn’t changed: it still centres on improving price-to-performance, supporting open source and open hardware, and working closely with cloud and enterprise customers. That message framed a series of updates covering new silicon, updated software, and a growing customer base.
“With the MI350 series, we’re delivering the largest generational performance leap in the history of Instinct, and we’re already deep into development of MI400 for 2026 — a rack-level solution built from the ground up,” said Dr. Lisa Su, AMD chair and CEO.

MI350, AMD’s latest data centre GPU platform, is now shipping. The MI355X and MI350X GPUs include 288GB of HBM3E memory per module and can run models with up to 520 billion parameters on a single GPU. Both use the same compute architecture but are tuned for different setups: the MI355X for higher-performance, liquid-cooled systems, and the MI350X for lower-power, air- or liquid-cooled systems. AMD says they offer up to four times more AI compute than the previous generation and support standard rack configurations with both air and liquid cooling.
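That single-GPU figure is a function of precision. As a rough sanity check (our own arithmetic, assuming 4-bit FP4 weights, a datatype the MI350 series supports), 520 billion parameters fit within 288GB:

```python
# Rough sanity check: can a 520B-parameter model fit in 288GB of HBM3E?
# Assumes 4-bit (FP4) weights; real deployments also need room for the
# KV cache and activations, so this is a best-case bound.
params = 520e9          # 520 billion parameters
bytes_per_param = 0.5   # 4-bit quantisation = half a byte per weight
hbm_gb = 288            # HBM3E capacity per MI350-series GPU

weights_gb = params * bytes_per_param / 1e9
print(f"Weights alone: {weights_gb:.0f}GB of {hbm_gb}GB HBM")
# -> Weights alone: 260GB of 288GB HBM
```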
The company showed inference throughput gains of 3x to 4.2x over the previous generation when running models such as LLaMA 3.1 and DeepSeek. On training, AMD claims the MI355X can deliver 3.5x higher throughput in pretraining and up to 2.9x in fine-tuning. It also reported up to 13% faster training times than Nvidia’s B200, citing recent MLPerf 5.0 benchmarks.
“In head-to-head comparisons — such as DeepSeek R1 or LLaMA 3.1 — the MI355 delivers top-tier throughput using open-source frameworks like SGLang and vLLM,” said Dr. Lisa Su. “We’re generating up to 30% more tokens per second than Nvidia’s B200 and even matching the more complex and expensive GB200, even when they’re using their latest proprietary software stacks.”
The MI350 series also targets cheaper inference. AMD said the platform can deliver 40% more tokens per dollar than Nvidia B200 setups running proprietary software, thanks in part to support for open-source stacks like SGLang and vLLM.
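Tokens per dollar is simply serving throughput divided by compute cost. A minimal sketch of the calculation, using made-up throughput and price figures rather than AMD or Nvidia numbers:

```python
# Illustrative tokens-per-dollar comparison. The throughput and hourly-cost
# values below are hypothetical placeholders, not vendor figures.
def tokens_per_dollar(tokens_per_second: float, cost_per_hour: float) -> float:
    """Tokens generated per dollar of compute spend."""
    return tokens_per_second * 3600 / cost_per_hour

baseline = tokens_per_dollar(tokens_per_second=10_000, cost_per_hour=12.0)
contender = tokens_per_dollar(tokens_per_second=11_000, cost_per_hour=9.4)

print(f"Baseline : {baseline:,.0f} tokens/$")
print(f"Contender: {contender:,.0f} tokens/$")
print(f"Advantage: {contender / baseline - 1:.0%}")  # -> Advantage: 40%
```

As the example shows, a modest throughput edge combined with a lower system cost compounds into the kind of 40% advantage AMD is claiming.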
The MI350 is available now in standard racks and is being adopted by major OEMs and hyperscalers. AMD expects partner server launches and cloud service provider instances to go live in Q3.
Looking ahead, AMD previewed the MI400 series and its Helios rack, due in 2026. MI400 is being built to run models with hundreds of billions to trillions of parameters. Each GPU is slated to offer 432GB of HBM4 and nearly 20TB/s of memory bandwidth, with up to 300GB/s of scale-out bandwidth.
The Helios rack combines MI400 GPUs, EPYC CPUs, Pensando NICs, and AMD’s ROCm software stack. It’s meant to be a fully integrated solution for training and inference at scale, with support for open standards like Ultra Ethernet and Ultra Accelerator Link.
Andrew Dieckmann, Corporate Vice President and General Manager of Data Center GPU at AMD, said the design choice for Helios—including its double-wide rack format—came from working closely with partners to balance complexity, reliability, and performance. “Large data centres tend not to be square footage constrained—they tend to be megawatts constrained,” he said. “So we think this is the right design point for the market based on what we’re delivering.”
Dieckmann also said AMD will continue to offer both air- and liquid-cooled options with the MI400 family, just like with MI350. “Most new AI deployments are in liquid-cooled data centres, but there’s still a robust market for air-cooled setups, particularly in enterprise environments,” he said.

The roadmap isn’t just about new hardware, but also about how customers use older generations. Dieckmann noted that AMD still sees demand for MI300 and MI325, even as MI350 gains adoption. “Some customers continue to buy MI300, even though MI325 is in the market,” he said. “There will be overlap for many quarters as different users transition at different speeds.”
Asked about AMD’s approach to pricing compared to Nvidia’s, Dieckmann said that price-performance always depends on the workload, and that customers are responding to the systems’ economic value. “Performance itself is a moving target—we keep improving our software stack, and so does our competition,” he said.
“We’ve told customers they can get 30–40% more tokens per dollar compared to other setups,” he said. “When you break that down and optimise for their specific workloads, those are big savings. Serving costs matter—being able to generate tokens cheaply is key to profitability. That’s why this message is resonating.”
AMD also used the event to outline its broader strategy. The company is growing its cloud presence and now serves multiple public and regional providers. It is also working with more than 40 national and government-backed AI programs. Software support continues to ramp up, with biweekly container releases and Day Zero support for models like LLaMA 4, Gemma 3, Qwen, and DeepSeek.
The developer ecosystem is also a focus. AMD is working closely with groups like PyTorch, Hugging Face, and vLLM to improve support for its hardware. It’s also running developer contests and offering early access to tools through its online hub.
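To illustrate that open-source path, here is a minimal vLLM inference sketch; vLLM publishes ROCm builds, so the same script runs on Instinct GPUs as on Nvidia hardware (the model name is just an example of a supported Hugging Face checkpoint):

```python
# Minimal vLLM inference example. vLLM ships ROCm builds for AMD Instinct
# GPUs, so this code is hardware-agnostic; the model choice is illustrative.
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")
params = SamplingParams(temperature=0.7, max_tokens=128)

outputs = llm.generate(["Explain HBM3E memory in one paragraph."], params)
for out in outputs:
    print(out.outputs[0].text)
```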
The MI350 rollout and the MI400 preview suggest AMD is trying to close the performance and scale gap with Nvidia, while playing to its strengths in open systems and price-performance. Whether Helios can push AMD into more training-heavy workloads at scale will likely become clearer in the next year.