Generative AI applications and ML models are performance-hungry. Today’s workloads — GenAI model training and inferencing, video, image and text data pre- and post-processing, synthetic data generation, SQL and vector database processing, among others — are massive. Next-generation models, like new applications using agentic AI, will require 10 to 20 times more compute to train using significantly more data.
But these huge-scale AI deployments are only as viable as the ability to apply these technologies in an affordable, scalable and resilient manner, says Dave Salvator, director of accelerated computing products at Nvidia.
“Generative AI, and AI in general, is a full-stack problem,” Salvator says. “The chips are obviously at the heart of the platform, but the chip is just the beginning. The full AI stack includes applications and services at the top of the stack, hundreds of libraries in the middle of the stack, and then, of course, constantly optimizing for the latest, greatest models.”
New technologies and approaches, including AI platform innovations, renewable power and large-scale liquid cooling, are needed to fully unleash the possibilities of accelerated computing in the AI era and deliver more affordable, resilient and power-efficient high-performance computing, especially as organizations grapple with the growing energy challenge.
These data centers can’t be retrofitted — they need instead to be purpose-built, adds Wes Cummins, CEO and chairman of Applied Digital.
“It’s a big lift, upgrading to the type of cooling, power density, electrical, plumbing and the HVAC that needs to be retrofitted. However, the biggest issue goes back to power,” Cummins says. “Efficiency directly translates to lower costs. By maximizing energy efficiency, optimizing space usage and improving infrastructure and equipment utilization in the data center, we can lower the cost of generating the product out of the hardware.”
Applied Digital is collaborating with Nvidia to deliver the affordable, resilient and power-efficient high-performance computing required to build the AI factory of tomorrow.
How the Nvidia accelerated computing platform makes the purpose-built AI factory possible
The AI factory addresses the end-to-end workflow, helping developers bring AI products to fruition faster. Its compute-intensive processes run far more efficiently: they may draw more power, but they put it to much better use, so data preparation, building models from scratch and pre-training or fine-tuning foundation models finish in a fraction of the time with a fraction of the energy expended.
Models are built faster, more efficiently and more easily than ever with support from truly full-stack solutions. And as advanced generative AI and agentic AI applications come to market, even the inference side of deployment is becoming a multi-GPU, multi-node challenge.
Recent Nvidia accelerated computing innovations, such as the Nvidia Blackwell platform, provide the performance and efficiency these advanced workloads require. Blackwell uses a high-speed fabric technology called Nvidia NVLink, roughly seven times faster than PCIe, that connects 72 GPUs in a single domain and can scale up to 576 GPUs, unleashing accelerated performance for trillion- and multi-trillion-parameter AI models. Nvidia NVLink Switch technology fully interconnects every GPU, so any one of those 72 GPUs can talk to any other at full line rate, with no bandwidth tradeoff and at low latency. NVLink enables the fast all-to-all and all-reduce communications used extensively in AI training and inference.
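To make that communication pattern concrete, the sketch below shows an all-reduce, the collective in which every GPU contributes a tensor and receives the sum across all ranks. It assumes a Python environment with PyTorch and the NCCL backend, which uses NVLink and NVLink Switch when GPUs share a domain; the eight-GPU launch and tensor size are illustrative only, not a reference configuration for any particular system.

    # Minimal all-reduce sketch (assumed: PyTorch with NCCL, launched via
    # torchrun --nproc_per_node=8 allreduce_demo.py on a single multi-GPU node).
    # Each rank contributes a gradient-like tensor and receives the sum across
    # all ranks, the collective that dominates data-parallel training traffic.
    import os
    import torch
    import torch.distributed as dist

    def main():
        dist.init_process_group(backend="nccl")   # NCCL rides NVLink/NVLink Switch where available
        rank = dist.get_rank()
        local_rank = int(os.environ.get("LOCAL_RANK", 0))
        torch.cuda.set_device(local_rank)

        # Stand-in for a shard of gradients produced by one training step.
        grads = torch.full((1024, 1024), float(rank), device="cuda")

        dist.all_reduce(grads, op=dist.ReduceOp.SUM)   # every rank ends up with the same summed tensor
        if rank == 0:
            print(f"after all_reduce, grads[0, 0] = {grads[0, 0].item()}")

        dist.destroy_process_group()

    if __name__ == "__main__":
        main()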
As systems grow, getting server nodes communicating with one another becomes a larger part of what either gates performance or allows it to keep scaling, so fast, configurable networking is a critical component of a large system. Nvidia Quantum-2 InfiniBand networking is tailored for AI workloads, providing highly scalable performance with advanced offload engines that reduce training times for large-scale AI models.
“Our goal is to make sure that those scaling efficiencies are as high as they can be, because the more you scale, the more scaled communication becomes a critical part of your performance equation,” Salvator says.
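As a hypothetical illustration of what that means in practice, scaling efficiency can be framed as the speedup achieved on N GPUs divided by N. The throughput numbers in this sketch are invented for the example, not measured results from Nvidia or Applied Digital.

    # Scaling efficiency: achieved speedup relative to one GPU, divided by GPU count.
    # The figures below are hypothetical, chosen only to show the arithmetic.
    def scaling_efficiency(throughput_n_gpus: float, throughput_1_gpu: float, n_gpus: int) -> float:
        speedup = throughput_n_gpus / throughput_1_gpu
        return speedup / n_gpus

    # If 72 GPUs delivered 65x the single-GPU throughput, efficiency would be about 90%.
    print(f"{scaling_efficiency(65.0, 1.0, 72):.0%}")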
Keeping high-performance supercomputers running 24/7 is a challenge, and failures can be expensive. Interrupted training jobs cost time and money, and for deployed applications, if a server goes down and other servers have to take up the slack, user experience suffers.
To address the specific uptime challenges of a GPU-accelerated infrastructure, Blackwell is designed with dedicated engines for reliability, availability and serviceability (RAS). The RAS engine keeps infrastructure managers up to date on server health, and servers self-report any problems so they can be quickly located in a rack of hundreds.
Tapping into ecologically sound power sources
The amount of power necessary to meet the demand for AI infrastructure and drive AI applications is posing a mounting challenge. Applied Digital has a unique approach to solving the issue, which includes “stranded” power, or already-existing energy resources that are untapped or underutilized, and renewable energy. These existing power resources speed up time-to-market while enabling a more ecologically sound method of delivering energy, and will be central to the company’s strategy until more efficient, low-carbon power generation systems become common.
Stranded power arises in North America in two ways. The first is when an organization with power-hungry operations, such as an aluminum smelter or a steel mill, goes out of business, leaving behind the large amount of generation and distribution infrastructure originally put in place to support it.
The second is renewable generation that outstrips local demand. Applied Digital’s primary renewable energy source is wind power, from wind farms in states where land is cheap and wind is plentiful. Wind turbines there are often curtailed because there is frequently not enough local load to absorb the energy they produce, and pushing it onto the electricity grid can drive prices negative. The company co-locates data centers near these wind farms; in North Dakota, it taps into two gigawatts of wind power feeding a nearby substation.
“What’s unique about the AI workloads is they’re not as sensitive to network latency to the end user,” Cummins says. “We’re able to be more flexible and actually take the load, the application, directly to the source of power, which we’ve done in multiple locations. Not only do we get to use a large percentage of renewables, but we’re using electricity that would otherwise go unutilized. It creates a lot of local economic benefit and brings a lot of interesting tech jobs into locations in America that have been left behind for the last 20 years.”
Leveraging tech advancements in liquid cooling
Technology advancements in liquid cooling further optimize power efficiency and sustainability. Liquid cooling reduces thermal load and limits power consumption. Direct-to-chip liquid-cooled server racks also consume less water than air-cooled or evaporative cooling systems. In 2025, Applied Digital will be deploying direct-to-chip liquid cooling at scale, with the aim of driving the PUE metric, or power usage effectiveness, as close to one as possible.
At a PUE of one, 100 percent of the electricity drives the IT workload; anything above one represents additional power spent on cooling and other mechanical overhead. Historically, the PUE of a very efficient hyperscale data center, depending on location, has ranged from 1.35 to 1.5, while anything below 1.5 qualifies as a green data center.
“In our location using liquid cooling, we expect the PUE to be 1.15 on a year-round basis,” Cummins says. “Because of the efficiency, liquid cooling will significantly improve the PUE of any data center, agnostic of location.”
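As a back-of-envelope illustration of what that figure implies, PUE is the ratio of total facility power to the power reaching the IT equipment. The 100 MW IT load below is used only to make the arithmetic concrete, not a stated facility size.

    # PUE = total facility power / IT equipment power.
    # At a PUE of 1.15, roughly 15% extra power is spent on cooling and other overhead.
    it_load_mw = 100.0          # illustrative IT load, not an actual facility figure
    pue = 1.15

    total_facility_mw = it_load_mw * pue           # 115 MW drawn in total
    overhead_mw = total_facility_mw - it_load_mw   # 15 MW for cooling, power conversion, etc.

    print(f"total draw: {total_facility_mw:.0f} MW, overhead: {overhead_mw:.0f} MW")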
Liquid cooling offers a few other wins. It dramatically drops the ambient sound level in data centers by removing the noisy fans used for air cooling in the chassis, and eliminating those fans further improves energy efficiency. Direct liquid cooling also removes the need for chillers, which reduces power use yet again, and it significantly cuts the amount of HVAC needed to cool the data center.
“If you compare our facility in North Dakota to one in a southern state, considering the power price, the low PUE and the efficiency at which a 100-megawatt facility can run,” Cummins says, “we estimate that we save our customers approximately $50 million a year in operating costs.”
Dig deeper: To learn more about building the AI factory of the future, contact Applied Digital. Discover how our innovative solutions drive energy efficiency, enhance performance and support sustainability for next-generation computing.