Calculating Raw Computing Power Of A GPU

kidrow

Guide
Hello all,
I'm curious about how the raw computing power of a GPU is calculated & how it can be used for comparison.

While looking for a suitable card for GPU rendering in 3D applications such as 3ds Max, Maya, etc., I came across opinions such as "ATI xxxx has better compute capabilities than Nvidia xxxx", which left me wondering how these conclusions are arrived at. Is there a simple formula, such as no. of cores x bandwidth x .....?

How are the different architectures compared? So, Nvidia vs ATI or even Fermi vs Kepler, for that matter.

Besides raw numbers, is there any practical significance in distinguishing between gaming performance & raw computing power? Reviews based on the former are plentiful, & if they are a good indication of the latter, it makes things a lot easier.

I'm guessing there is no straight answer to this, but I'd appreciate it if someone could point out the elements to be considered to make a proper guesstimate. I've come across terms such as ROP, texture fill rate, Gflop(?), & the like, & am not sure what they mean. So before embarking on some googling, I'd be glad if someone could at least point out the relevant terms I should be googling for.

Thanks for your time. Much appreciated.
 
Ok, so I googled that &, iinm, it's benchmarking software which calculates raw compute power. While such benchmarks are helpful, the trouble I've had is that there does not seem to be a comprehensive database of results covering a broad spectrum of GPUs. The benchmark results are either spread out across various forums or are done mostly with high-end cards. Since I don't have a large enough budget, those results are not that helpful. Hence my query.
Thanks for your reply though.
 
It's actually quite simple: the compute power of any processor, including GPUs, is measured in FLOPS. This is a published number for every CPU/GPU, available in technical documentation from manufacturer websites.
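For what it's worth, that published single-precision number is usually just arithmetic on the spec sheet. A rough sketch (the card here is made up, and I'm assuming the usual convention of counting a fused multiply-add as two operations per shader core per clock):

```python
# Ballpark theoretical peak single-precision GFLOPS from spec-sheet figures.
# Assumes the common convention that a fused multiply-add counts as 2 ops
# per shader core per clock; real-world throughput is always lower.

def peak_gflops(shader_cores, clock_ghz, ops_per_core_per_clock=2):
    return shader_cores * clock_ghz * ops_per_core_per_clock

# Hypothetical card: 1536 shader cores running at 1.0 GHz
print(peak_gflops(1536, 1.0))  # -> 3072.0 GFLOPS on paper
```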

Where it gets complicated is the program that allows the OS and hardware to talk to the processor: the driver, IOW. There is an undetermined performance loss in that piece of software, which makes direct comparisons difficult except between the same driver revisions.

In addition to this, a GPU may have specialised units dedicated to specific tasks, and this complicates things further.

Having said that, in each individual case, especially for a professional user, the best option is to pick from the offerings available and examine the performance for your specific needs.

A quick note: there is very little correlation between gaming performance (mostly DirectX) and professional apps (mostly OpenGL/CL or custom platforms). For example, the ATI FirePro based on the 7950 will be much poorer at games than its consumer counterpart. Earlier, manufacturers would simply enable or disable specific functionality to distinguish between consumer and pro parts; now they don't do that any more.
 
It's actually quite simple: the compute power of any processor, including GPUs, is measured in FLOPS. This is a published number for every CPU/GPU, available in technical documentation from manufacturer websites.

Thanks for that reply. It gave me a good starting point to begin googling & get some sort of framework built in my head.

So if I understand correctly, all things being equal, GFLOPS is an indication of how fast a GPU is. Once that is determined, I have to move on to specific software capabilities & limitations to narrow the pool further.

I got hung up on the fact that a 3D rendering engine such as Iray uses CUDA. AFAIK, that means it is completely different from OpenGL/CL. Now for Iray, generally speaking, the greater the no. of CUDA cores, the better. But this does not hold true when the architecture changes from Fermi to Kepler. So my googling took me into ROPs, texture fill rate, & GFLOPS territory.
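To make that concrete, this is the kind of back-of-the-envelope comparison my googling led me to (spec figures quoted from memory, so treat them as approximate): Fermi's shaders run on a separate hot clock at roughly twice the core clock, while Kepler's run at the core clock, so the raw core counts aren't directly comparable.

```python
# Rough comparison of a Fermi and a Kepler card using approximate spec figures.
# Fermi shaders run at a "hot clock" (~2x core clock); Kepler shaders run at
# the core clock, so counting CUDA cores alone is misleading.

def peak_gflops(cores, shader_clock_ghz):
    return cores * shader_clock_ghz * 2  # FMA counted as two ops

gtx_580_fermi = peak_gflops(512, 1.544)    # ~1581 GFLOPS
gtx_680_kepler = peak_gflops(1536, 1.006)  # ~3090 GFLOPS

# Three times the CUDA cores, but only about twice the paper GFLOPS,
# and real compute workloads narrow the gap further.
print(gtx_580_fermi, gtx_680_kepler)
```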

So for non-gaming purposes, one can generalize with GFLOPS & move on to software-specific support. All in all, it's a better idea to browse software-specific forums & look at software-specific benchmarks. Right?
 
So if I understand correctly, all things being equal, GFLOPS is an indication of how fast a GPU is. Once that is determined, I have to move on to specific software capabilities & limitations to narrow the pool further....

<snip>

So for non-gaming purposes, one can generalize with GFLOPS & move on to software-specific support. All in all, it's a better idea to browse software-specific forums & look at software-specific benchmarks. Right?

The answer is in your post :D

Don't underestimate the software side of things. All things being equal, vendor support and driver performance actually matter a lot. A driver crash or upgrade costs man-hours and profitability. Unlike gamers, you will (and should) be chasing reliability, uptime and consistency rather than raw performance once you have arrived at your basic choices.
 
If you must use a CUDA-only renderer, then basically you are stuck with Nvidia. Unfortunately, the way Nvidia has treated folks with the current Kepler generation is pretty sad. If you want compute power, you'd have to get the Titan, plain and simple. The GK104-based cards are pretty poor for compute. The sad truth is that even the Titan gets its rear handed to it in integer workloads by the 7970/7950 if the software has an OpenCL code path.

Plus, only the Titan and the super-pricey Tesla cards have 1/3-rate double precision enabled. Every other card, including the 780, which is based on the GK110 chip (the same as the Titan), has its double-precision rate locked at 1/24th of the single-precision rate. If your software doesn't need double precision, then you are safe. Otherwise you are stuck. In comparison, none of the AMD cards have any locks of that sort: every Tahiti (79xx) based card does 1/4-rate double precision.
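To put rough numbers on that (paper figures at base clocks, so ballpark only):

```python
# Ballpark double-precision throughput derived from approximate
# single-precision paper figures (base clocks, rounded).

cards = {
    # name: (approx. SP GFLOPS, DP rate as a fraction of SP)
    "GTX Titan": (4500, 1 / 3),   # GK110, DP unlocked
    "GTX 780":   (4000, 1 / 24),  # same GK110 chip, DP locked down
    "HD 7970":   (3800, 1 / 4),   # Tahiti, no artificial lock
}

for name, (sp_gflops, dp_rate) in cards.items():
    print(f"{name}: ~{sp_gflops * dp_rate:.0f} DP GFLOPS on paper")
# GTX Titan ~1500, GTX 780 ~167, HD 7970 ~950
```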

There's a difference between the theoretical max gigaflops and what is actually achieved in real-world apps. Unfortunately, Kepler is pretty poor in the latter because of some weird scheduling decisions at the hardware level.
 
The answer is in your post :D

Don't underestimate the software side of things. All things being equal, vendor support and driver performance actually matter a lot. A driver crash or upgrade costs man-hours and profitability. Unlike gamers, you will (and should) be chasing reliability, uptime and consistency rather than raw performance once you have arrived at your basic choices.

If you must use a CUDA-only renderer, ....
There's a difference between the theoretical max gigaflops and what is actually achieved in real-world apps. Unfortunately, Kepler is pretty poor in the latter because of some weird scheduling decisions at the hardware level.

Thank you both.

I should've probably mentioned that I'm a beginner & not a professional. So any GPU purchase I make is going to be just to test the waters, & pretty low-budget. Not disputing anything you said. Just saying. You are right on all counts, of course. :)

Most GPU renderers in the 3D CG realm are CUDA-only, & so yes, it's wise to invest in an Nvidia card even if one is unsure of the software, despite the fact that ATI cards have better raw OpenCL performance. (That's the gist of what I absorbed from your post, though you really lost me at integer workloads & double & single precision :) )

I've concluded that it's really best to approach this from the software side of things & work my way up. It's best to look at software-specific benchmarks & come to a decision.

Again, thanks for your replies. Much obliged.
 
If you must use Iray or mental ray (which are basically the same under the hood), you have no choice but to go Nvidia, as you can't expect an OpenCL renderer for either any time soon since Nvidia acquired Mental Images.

However, there are others (V-Ray, LuxRender and Indigo) which all support OpenCL. Which one you use is up to you.
 
If you must use Iray or mental ray (which are basically the same under the hood), you have no choice but to go Nvidia, as you can't expect an OpenCL renderer for either any time soon since Nvidia acquired Mental Images.

However, there are others (V-Ray, LuxRender and Indigo) which all support OpenCL. Which one you use is up to you.

What I meant was that it's safer to go with Nvidia atm, since renderers such as Iray, Octane and Redshift are CUDA-only. With V-Ray, Lux or Indigo, you have the choice of both, but I'm not sure of any renderer that is OpenCL-only. In that sense, Nvidia will give you greater flexibility.

Of course, this applies to someone who wants to test the waters. If someone working at a professional level has the pipeline nailed down, it would make more sense for him to employ the best resources possible. In which case, if OpenCL, then ATI, as you say.
 