A New Way to Talk about GPUs

By Jason Cross, July 29, 2005

I was perusing the forums over at Beyond 3D, marveling at the absolute guesswork and rumor-mongering going on about ATI's upcoming R520 graphics chip. If the forum's prognosticators are to be believed, it will run at around 700MHz, or nowhere close to that. It'll have 32 pipelines, or 24, or 20, or 16. It'll have some sort of unified shader architecture, though nothing like the GPU in the Xbox 360, or it will have a completely traditional architecture. It will enable antialiasing (AA) together with high dynamic range (HDR) rendering (a current sore spot for Nvidia's cards), or that's just totally unfeasible.

The random guesswork isn't really a surprise. ATI has been incredibly quiet and secretive about its next major GPU architecture: all we really know for sure is that it will be Shader Model 3.0–compliant, that several revisions have taped out by now, and that some version of it was running the impressive Alan Wake demo at E3 this past May. Some sites post new rumors every week, usually contradicting the rumors from the week before.

What really struck me is how hard and fast the fans of 3D graphics are sticking to a certain way of looking at GPUs. They discuss everything in terms of "pipelines," with some even going so far as to say that the GeForce 7800 GTX isn't a "true" 24-pipeline chip because it has only 16 raster operation units (ROPs), and can therefore draw at most 16 pixels per clock. I've spoken with both ATI and Nvidia on the subject, and both say 16 ROPs is plenty. The truth, they say, is that the more advanced 3D games are so limited by shader execution speed and texture fetching that the GPU draws nowhere near 16 pixels or samples per clock. One engineer told me that the performance benefit of moving from 16 to 24 ROPs would be less than 5 percent, and it would come at a considerable cost in transistor count.
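
To put the engineers' point in concrete terms, here's a back-of-the-envelope sketch in C. The 10-cycle average shader length is my own hypothetical figure, not one quoted by either vendor; the point is simply that once shaders take many cycles per pixel, the shader array caps pixel output long before the ROPs do.

[code]
/* Back-of-the-envelope: if shading a pixel takes many ALU cycles,
 * the shader array limits throughput long before 16 ROPs do.
 * The 10-cycle figure below is hypothetical, for illustration only. */
#include <stdio.h>

int main(void) {
    double shader_pipes     = 24.0; /* GeForce 7800 GTX pixel shader pipes */
    double cycles_per_pixel = 10.0; /* assumed average shader cost (hypothetical) */
    double rops             = 16.0; /* pixels the ROPs can retire per clock */

    double shaded_per_clock = shader_pipes / cycles_per_pixel;
    printf("Shaded pixels per clock: %.1f (vs. %.0f ROPs available)\n",
           shaded_per_clock, rops);
    return 0;
}
[/code]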

In the grand old days of just three or four years ago, even the most advanced 3D engines basically just layered a few textures on top of each other with simple blending modes. Every now and then a pixel shader would be used to make the water look like bumpy Mylar, but beyond that, shaders were mostly used to perform more of these texture blends at once. It was appropriate to talk about GPU performance by counting pipelines and how many pixels or samples could be drawn per second. You had your fill rate, your clock speed, your memory bandwidth, and that was enough.

The world is changing rapidly. Games that use DirectX 9-level shaders, whether Shader Model 2.0 or 3.0, are tricky. Some shaders use floating-point math, some integer math. The math required to draw a single pixel is increasing, and not just in spot areas like bumpy, shiny water but on virtually every pixel in the game. And it's not just a matter of blending together some textures, either. "Data textures" like normal maps and gloss maps feed comparatively complex calculations that determine the final color of a pixel. Compared with the number of pixels a GPU can effectively output per clock cycle, a whole lot of math is going on, and the number of texture fetches is rising, too.
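
To make the idea of "data textures" feeding math concrete, here is a minimal sketch in plain C standing in for a pixel shader. The Blinn-Phong-style lighting model, constants, and names are all illustrative assumptions, not the shader of any particular game:

[code]
/* A minimal sketch (plain C standing in for a pixel shader) of the
 * per-pixel math described above: a normal map and a gloss map are
 * "data textures" feeding lighting calculations, not colors to blend.
 * The lighting model and all names are illustrative only. */
#include <math.h>

typedef struct { float x, y, z; } Vec3;

static float dot3(Vec3 a, Vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

/* One pixel: a couple dozen multiplies and adds plus a powf(), on top
 * of three texture fetches (albedo, normal map, gloss map). */
Vec3 shade_pixel(Vec3 albedo, Vec3 normal, float gloss,
                 Vec3 light_dir, Vec3 half_vec, Vec3 light_color) {
    float n_dot_l = fmaxf(dot3(normal, light_dir), 0.0f); /* diffuse term */
    float spec    = powf(fmaxf(dot3(normal, half_vec), 0.0f),
                         gloss * 64.0f);                  /* specular term */
    Vec3 out = {
        albedo.x * light_color.x * n_dot_l + light_color.x * spec,
        albedo.y * light_color.y * n_dot_l + light_color.y * spec,
        albedo.z * light_color.z * n_dot_l + light_color.z * spec
    };
    return out;
}
[/code]

Even this simple routine burns a couple dozen ALU operations per pixel, and real game shaders can run much longer.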

We need a new way to talk about GPUs. Pipelines, clock rates, and fill rate were a useful shorthand a couple of years ago, but that's no longer the case. What do we do when the same shader units that perform pixel shading operations are used for vertex shading operations? What do we do when the arithmetic logic units (ALUs) aren't organized into neat little "pipelines" or even quads anymore? How do we account for the fact that not all ALUs are created equal, with some performing more operations per cycle than others and different GPUs' ALUs handling different types of operations? How do we account for the increasing value of on-chip caches?
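
One candidate for a simple, comparable number is peak programmable shader throughput: ALU count times operations per ALU per clock times clock speed. Here's a sketch with placeholder figures, not any real GPU's published specs:

[code]
/* One possible replacement metric: peak programmable shader throughput.
 * All figures below are illustrative placeholders, not published specs. */
#include <stdio.h>

int main(void) {
    int    alus          = 48;  /* hypothetical shader ALU count */
    int    ops_per_clock = 8;   /* e.g. two 4-wide MADs per ALU (assumed) */
    double clock_ghz     = 0.5; /* 500MHz core clock */

    /* ops/clock x 10^9 clocks/sec = billions of shader ops per second */
    double gflops = alus * ops_per_clock * clock_ghz;
    printf("Peak shader throughput: %.0f GFLOPS\n", gflops);
    return 0;
}
[/code]

Of course, a single peak number hides exactly the differences raised above: not all ALUs perform the same operations, and cache behavior can dominate real workloads.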

Before long, the performance of GPUs may hinge on some of the same features that make for a good desktop CPU: things like out-of-order instruction processing, translation lookaside buffers, and data prefetching logic.

What do you think the most important metric of next-generation GPUs will be? And what simple, understandable terms should we use to compare them?

Read the complete article @ [url=www.extremetech.com/article2/0,1697,1841940,00.asp]ExtremeTech.com[/url]
 
I agree with the writer completely. Having reviewed all that jazz for quite some time now, I'm sick of looking at hardware from a traditional viewpoint.
CPUs used to be judged entirely from a frequency perspective. Thankfully, AMD came to the rescue (with its PR rating) and proved that clock speed is not the true measure of a processor.
This had occurred to me too, a couple of months ago. Every place I visit, pipelines are all that matters when it comes to discussing GPUs. And not pipelines in general, but pixel pipelines specifically. "24 pixel pipelines is the way to go" is what I read every so often. I think the poor geometry processor, a.k.a. the vertex shader, had better find itself a good PR company if it intends to stay in the limelight much longer.
 