Now, enter HyperTransport 3.0 and a future ATIMD GPU connected directly to it via the Torrenza platform. It can still have its pool of dedicated fast, wide memory for graphics use, but at the same time it can access the whole system memory at CPU-like speed via HT 3.0, at low latency and, if need be, with cache coherency. In this case, the GPU would become a true coprocessor, in nearly the same fashion as the 8087 was to the 8086 some 25 years ago.
Is it a coincidence, then, that ATI's "Stream Computing" initiative, which enables ATI GPUs to work in concert with CPUs to solve complex computational problems, comes exactly at the time of the company being acquired by AMD? After all, scientists, engineers and financial wizards can hardly resist the roughly 360 peak GFLOPs of an ATI X1950XTX, when a comparably priced Intel Core 2 Duo E6700 at 2.66GHz gives you 'only' 42 peak GFLOPs in single precision and half that in double precision - and the current top Athlon64 CPUs manage half of the Intel figure.
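For readers who want to check the sums, here is a rough sketch of where such peak numbers come from. The per-cycle figures are our assumption, not vendor data quoted in this piece: 8 single-precision and 4 double-precision flops per core per clock for the Core 2 (one 128-bit SSE add plus one 128-bit SSE multiply issued together). The 360 GFLOPs GPU figure is simply taken from the text above, and the function name `peak_gflops` is made up for illustration.

```python
def peak_gflops(cores, clock_ghz, flops_per_cycle):
    """Theoretical peak: cores x clock (GHz) x flops per core per cycle."""
    return cores * clock_ghz * flops_per_cycle

# Core 2 Duo E6700: 2 cores at 2.66 GHz.
# ASSUMPTION: 8 SP / 4 DP flops per core per cycle (128-bit SSE add + mul).
cpu_sp = peak_gflops(cores=2, clock_ghz=2.66, flops_per_cycle=8)
cpu_dp = peak_gflops(cores=2, clock_ghz=2.66, flops_per_cycle=4)

# GPU peak as quoted in the article (approximate, single precision).
gpu_sp = 360.0

# How many times the CPU's single-precision peak the GPU offers on paper.
gpu_to_cpu_ratio = gpu_sp / cpu_sp

print(f"CPU SP peak: {cpu_sp:.1f} GFLOPs")
print(f"CPU DP peak: {cpu_dp:.1f} GFLOPs")
print(f"GPU / CPU (SP): {gpu_to_cpu_ratio:.1f}x")
```

Under those assumptions the CPU lands at about 42 GFLOPs in single precision, which matches the figure quoted above, and the GPU comes out at somewhere over eight times that. These are paper peaks only; what a real application sustains is another matter.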
Now, both R600 and G80 are expected to break the half-teraflop mark in peak processing power per chip and, quite possibly, to widen the range of apps that can be accelerated on them too. At the same time, there are no major per-core FPU performance boosts planned beyond what today's Intel Core 2 and tomorrow's AMD K8L can do.
In summary, GPUs may be the key to bringing teraflop computing power to the desktop (SLI'd G80s or Xfired R600s will surely do the trick, if the app uses them), and petaflop power to mid-sized supercomputing clusters, and doing so affordably. With Torrenza, AMD holds the advantage right now in implementing these early, unless Intel decides to bring CSI to the X86 platform a bit earlier for tightly coupled co-processing - yet again, after some 20 years.