Taking the Leap into Discrete GPUs

‘Behind the Builders:’ Intel Fellow Aditya Navale explains how the graphics processor transitioned from drawing better pixels to handling humanity’s most complex computational challenges.

If you guessed Intel employees are as eager as the rest of the world for Intel to enter the discrete GPU market, you’d be correct.

“I always wanted Intel to get into discrete graphics,” says Aditya Navale, an Intel Fellow who’s been working on graphics technology for more than 20 of his 30 years at the company.

The recent launch of Intel® Arc™ discrete graphics for laptops answered the first wave of anticipation (the desktop versions arrive later this year) and marked a major step in Intel’s long history in graphics. Navale’s team in the Accelerated Computing Systems and Graphics Group sits at the literal center of it, developing the core IP architecture that underlies multiple generations of Intel GPUs, including the first Intel Arc A-Series graphics.

Intel is already the market segment leader by volume in PC graphics when counting integrated graphics — which commonly reside on the same die as the CPU. “Going from integrated to discrete is a huge leap,” Navale says. “It’s an extremely complex task and it’s very challenging.”

Since 2019, gaming performance on Intel’s integrated graphics has quadrupled, but Intel Arc GPUs take that base technology and multiply it again. Current Intel integrated graphics top out at 96 execution units, while Intel Arc graphics will go up to 512 Xe vector engines. “When you increase a machine size like that, over 5x,” Navale says, “the challenge is to get the maximum uplift in performance at a given power envelope.”
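The “over 5x” figure Navale cites falls directly out of the engine counts quoted above, as a quick check shows (the variable names here are illustrative, not Intel terminology):

```python
# Quick check of the scaling figure quoted in the article.
integrated_engines = 96   # execution units in current Intel integrated graphics
arc_engines = 512         # Xe vector engines in top Intel Arc graphics

scale = arc_engines / integrated_engines
print(round(scale, 2))  # → 5.33, i.e. "over 5x"
```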

“One of our goals on Intel Arc graphics, in addition to becoming a meaningful player in the marketplace, is to learn how to architect and design and build software for big GPUs,” he explains.

To offer a competitive alternative as a new entrant means not only offering compelling features and performance but also supporting an array of games and applications. “It’s always software first that drives our architecture,” he says.

From Drawing Pixels to Deep Learning

What’s driving the need for graphics capability several times beyond what’s already in your common workhorse laptop? The answer is a colorful study in contrasts.

The main job of the GPU is to accelerate graphics rendering: to create 2D and 3D images on the 2D screen you’re looking at. Put even more simply, the GPU helps draw the pixels on your screen. Where a CPU is designed to handle one or two sophisticated tasks at a time, a GPU is designed to do many small tasks — drawing all those pixels — in parallel.
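The contrast can be sketched in a few lines. The snippet below is an illustrative sketch, not Intel code: each pixel’s color is computed by a small, independent function (the `shade_pixel` name and the gradient rule are hypothetical), and it is exactly that independence that lets a GPU run thousands of these tasks in parallel while a CPU would step through them one at a time.

```python
# Illustrative sketch: a GPU-style workload expressed as one small,
# independent task per pixel. On a real GPU each call would run as a
# parallel thread; here we simply map it over a tiny "screen".

WIDTH, HEIGHT = 4, 3  # a miniature framebuffer for illustration

def shade_pixel(x, y):
    """Compute one pixel's brightness independently of every other pixel.
    That independence is what makes the work easy to parallelize."""
    return (x + y) % 256  # hypothetical gradient shading rule

# A CPU loops over pixels; a GPU launches them all at once.
frame = [[shade_pixel(x, y) for x in range(WIDTH)] for y in range(HEIGHT)]

print(frame[0])  # → [0, 1, 2, 3]
```

Because no pixel depends on its neighbors, the loop order does not matter, which is the property GPU hardware exploits.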

About Aditya Navale: Builder in brief

Home site: Folsom, California

Title: Intel Fellow, director of GPU core IP architecture

Team: XPU architecture and engineering, Accelerated Computing Systems and Graphics Group

Years at Intel: 30

Path to Fellowship: “A combination of opportunity, accomplishment, importance to the corporation and a little bit of popularity contest.”

Perfect workday: “A series of small strategy and problem-solving sessions. The best ideas always come out of or evolve from a discussion.”

Favored off-the-clock reset: “I usually run. That’s my way of recharging and getting my exercise. When you get stressed, if you start running, over time, the stress goes away.”

When you’re doing something like reading this article, the pixels on your screen aren’t changing very much, so there isn’t much for the GPU to do. But switch over to a photo-realistic 3D game and things are changing constantly. “The more realism the game wants to bring in, the more work the GPU has to do,” Navale explains. Finer details like fur waving in the breeze or multiple light sources and shadows mean more work to get to each pixel displayed, and for those details to render smoothly on your screen, all of that work has to happen quickly.

Games are just the start.

As people apply the GPU as a highly parallel data processor, Navale says, “the use cases for GPUs are exploding.” Beyond pixels, the GPU is now helping with humanity’s most sophisticated computational challenges thanks to its use across artificial intelligence, deep learning and high-performance computing.

If developing chips for such seemingly different jobs sounds complicated, Navale says the world of software built on GPUs helps bring “a method to the madness.”

“We have a software ecosystem that has to incorporate all these new requirements,” he says. “Since it has to account for HPC, AI, gaming and who-knows-what, it has to evolve in a very synergistic way. It requires a lot of thought and deliberate forward motion of the architecture.”

The GPU’s Multiplying Demands Continue — into Zettascale

The multiplying demands on the GPU are just getting underway, requiring flexibility and new design approaches to push the GPU into new heights of performance. “The way we architect and implement the IPs, we account for the fact that the IP can go into an integrated segment or all the way to a huge discrete GPU,” Navale says. “That scalability is built in. We also have lots of parameterization, which allows us to extract scalability in an easy and fast way.”

To reach zettascale supercomputers — the next order of magnitude in the world’s most powerful systems — “the scalability is going even further,” he explains. That means not only multiplying capabilities within each chip, but also assembling multiple chips together as systems-in-packages.

“A little bit of that already has happened on Ponte Vecchio,” Navale says, referring to the GPU that combines 47 different tiles into a single package. “But now that paradigm is growing and has more momentum and more traction as we move forward.”

Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.

Performance results are based on testing as of dates shown in configurations and may not reflect all publicly available updates. See backup for configuration details. No product or component can be absolutely secure.