The recently announced Havok FX add-on to the Havok API will in fact run on ATI’s video cards in the same way that NVIDIA has publicly stated it will run on theirs. ATI claims its cards will run it better.
Overview :
Both ATI and NVIDIA are using the same method of computing physics on their GPUs -- load and process some physics calculations if there's just one GPU, or load and process all physics calculations onto one full GPU if there are two GPUs (CrossFire or SLI). In both cases the approach is monolithic, meaning that both ATI and NVIDIA prefer to load all things related to graphics onto the GPU.
ATI claims that its latest X1900 family has more than enough processing power left sitting idle most of the time to take care of physics and 3D rendering. This is a strong indication that the current state of 3D graphics is far too concerned with frame rate when it should be looking into how best to utilize the chips that ATI and NVIDIA produce.
ATI’s View on Gaming Physics :
This diagram shows the various shapes that physics objects can have in their simulated environment. The most basic method is the AABB (axis-aligned bounding box), a box drawn around the object whose faces stand in for the object's edges when interactions are tested. The sphere method allows a higher resolution of physics simulation at the expense of more processing power. Tetrahedrons are the third option, in which a more complex polyhedral volume surrounds the rigid body and serves as the physical edges of the object. Finally, a mesh is the best option for physical simulation (in particular, a single pixel mesh) but it is very compute-bound.
When objects interact with each other in a game, the items change their physics representations as interactions occur. In the above example, the two cubes are moving towards each other, and once their AABB cube models breach each other, the physics simulation knows it needs to move to a more detailed simulation of the objects in order to present their interactions more accurately. Once the objects have interacted and moved away from each other, they can revert to the coarser physics models, thus saving processing power.
According to ATI, physics calculations are very heavy in floating point arithmetic and in conditional branches that test for item interactions.
As is painfully obvious in all games out today, physics simulations on PCs are all hacked and modified versions of real “text book physics”. Programmers are forced to cut their simulations to lower resolutions in order to get the games to run at all, hence the need for AABB cubes, spheres and other physics shapes. The ideal situation would be to have enough compute power to simulate physics perfectly, but we aren’t anywhere near that yet.
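The coarse-to-fine switch described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Havok FX or ATI code: the `Body` class, the `aabb_overlap`, `spheres_overlap` and `collide` helpers, and the choice of a sphere as the finer model are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Body:
    # Hypothetical rigid body: a centre point plus a radius for the finer
    # sphere model. The coarse AABB is derived from the same data.
    x: float
    y: float
    z: float
    radius: float

    def aabb(self):
        """Return (min_corner, max_corner) of the axis-aligned bounding box."""
        r = self.radius
        return ((self.x - r, self.y - r, self.z - r),
                (self.x + r, self.y + r, self.z + r))


def aabb_overlap(a: Body, b: Body) -> bool:
    """Cheap coarse test: boxes overlap only if they overlap on every axis."""
    (amin, amax), (bmin, bmax) = a.aabb(), b.aabb()
    return all(amin[i] <= bmax[i] and bmin[i] <= amax[i] for i in range(3))


def spheres_overlap(a: Body, b: Body) -> bool:
    """Finer test, run only after the coarse test reports a possible hit."""
    dist_sq = (a.x - b.x) ** 2 + (a.y - b.y) ** 2 + (a.z - b.z) ** 2
    return dist_sq <= (a.radius + b.radius) ** 2


def collide(a: Body, b: Body) -> str:
    if not aabb_overlap(a, b):
        return "apart"        # boxes do not touch: stay on the coarse model
    if spheres_overlap(a, b):
        return "contact"      # finer model confirms the hit
    return "near miss"        # boxes touch, finer shapes do not
```

Most pairs of objects in a scene fall into the "apart" case, so the more expensive test runs only for the few pairs whose boxes actually breach each other.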
ATI's Architectural Advantages :
ATI was eager to specify how their GPU architecture is well built for physics simulations. First, the R580 has a tremendous amount of floating point capability with its 48 pixel shaders. ATI estimates that 375 GFlops on a single card and 750 GFlops on a CrossFire system are available for different processing models.
Compared to a blazing fast modern CPU that has 10 GFlops of total floating point calculation capability, the GPU has tremendous opportunity. Interestingly, though no one outside AGEIA knows for sure how much power the PhysX chip actually has, ATI feels that even if the PhysX chip has 25 GFlops of performance running at 100% efficiency, ATI's GPUs can more than make up for their slightly lower efficiency with higher GFlops to spend.
Another architecture feature that is very important to physics processing on ATI’s GPUs is the inclusion of their dedicated branching logic. When the R520 and R580 were launched, this feature was central to ATI’s argument that graphics shaders were moving to a highly branched coding style. These same branching units can be utilized when doing physics calculations as well.
Finally, the highly threaded cores that the R520 architecture was designed with, in conjunction with the branching logic, allow the ATI GPUs to break up physics calculations into smaller, easier to process sections. Again, just as this feature helped the R520 and R580 in pixel shading performance, physics processing can utilize it as well.
As you can see in the slide above, a finer, more granular threading system allows ATI’s GPU to process an amount of pixels much closer to the number that actually need processing. The more black area there is in each example, the more data is NOT processed and thus the more efficient the approach. As the thread size increases to even 16x16 (256 pixels), the number of physics “pixels” or events that must pass through the system increases dramatically.
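To put ATI's argument in numbers, the short Python sketch below works out how efficiently a GPU would have to run to match a chip sustaining 25 GFlops. The 375, 750 and 25 GFlops figures are the ones quoted above (the 25 GFlops figure is ATI's hypothetical, not a measured PhysX number), and the `break_even_efficiency` helper is a name made up for this illustration.

```python
def break_even_efficiency(gpu_peak_gflops: float, rival_gflops: float) -> float:
    """Fraction of the GPU's peak rate needed to match a rival's sustained rate."""
    return rival_gflops / gpu_peak_gflops


single = break_even_efficiency(375.0, 25.0)  # one X1900-class card
dual = break_even_efficiency(750.0, 25.0)    # CrossFire pair
print(f"single card: {single:.1%}, CrossFire: {dual:.1%}")
# prints "single card: 6.7%, CrossFire: 3.3%"
```

In other words, on these figures a single card would only need to sustain roughly one fifteenth of its peak rate on physics code to draw level.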
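The effect of thread size can be illustrated with a toy model in Python. This is a sketch under assumptions, not a description of how the R520 or R580 schedules work: it assumes that any tile containing at least one pixel that needs work must be processed in full, and the `pixels_processed` helper and the diagonal test pattern are invented for the example.

```python
def pixels_processed(mask, tile):
    """Count pixels in every tile x tile block holding at least one active pixel."""
    height, width = len(mask), len(mask[0])
    total = 0
    for ty in range(0, height, tile):
        for tx in range(0, width, tile):
            block = [mask[y][x]
                     for y in range(ty, min(ty + tile, height))
                     for x in range(tx, min(tx + tile, width))]
            if any(block):
                total += len(block)  # the whole block is processed
    return total


# Hypothetical 16x16 frame in which only the 16 diagonal pixels need work.
size = 16
mask = [[x == y for x in range(size)] for y in range(size)]
for tile in (1, 4, 16):
    print(tile, pixels_processed(mask, tile))
# prints:
# 1 16
# 4 64
# 16 256
```

With 16 pixels of real work, the 4x4 tiles process four times as much data as needed and the single 16x16 tile processes sixteen times as much.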
ATI’s Software Approach :
ATI is working on a Data Parallel Processing Architecture Abstraction layer that will allow developers to utilize the hardware without having to go through the Direct3D or OpenGL APIs. This software is being given away for free to developers and API developers to use as they see fit. And while physics is the focus for now, this architecture abstraction will be open to all other types of GPGPU work too.
Without having to go through Direct3D, ATI’s hardware should be able to pull more performance out of their architecture than if they left the calculations to D3D. However, the abstraction layer that ATI is writing CAN go through D3D and OpenGL if the developer would like it to.
ATI did admit that a common API for physics coding would simplify the industry and allow competition between NVIDIA, ATI and even AGEIA to exist at a level where we could actually tell you which one is better with some kind of certainty.
Summary - GPU Configurations :
In a current generation system, shown above, the CPU is doing the physics work and the GPU is doing the graphical work.
ATI is telling us that their cards will also support rendering and physics acceleration in a single GPU configuration as shown here. If the game being played, at the resolution the user selects, is able to render more frames per second than required for adequate visual quality, the extra GPU cycles can be utilized in physics calculations.
An even better solution, for the user and for ATI, is to have two ATI GPUs that split the rendering and physics calculations between the two different cards. What is most impressive to me is that ATI has assured me that these two cards do not have to run in CrossFire mode, and thus they do not have to be the same GPU. If you have an X1900 XTX now, and in about eight months you buy a new ATI 2800 XTX, you can save your X1900 XTX for physics calculations. As of now, NVIDIA has said they do not support this feature but see the value in doing so. Hopefully that means the feature will be coming soon, as the upgrade opportunities this offers readers are fantastic.
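A rough way to picture the single GPU case is as a frame-time budget. The Python sketch below is a simplification under stated assumptions: it assumes render time per frame is constant and that switching between rendering and physics costs nothing, neither of which ATI has claimed. The `spare_gpu_fraction` helper and the frame rates used are hypothetical.

```python
def spare_gpu_fraction(achievable_fps: float, target_fps: float) -> float:
    """Fraction of GPU time left for physics if rendering is capped at target_fps."""
    if achievable_fps <= 0 or target_fps <= 0:
        raise ValueError("frame rates must be positive")
    if achievable_fps <= target_fps:
        return 0.0  # the GPU is already fully busy rendering
    return 1.0 - target_fps / achievable_fps


# A game that could run at 120 fps but only needs 60 fps to look good.
print(spare_gpu_fraction(120.0, 60.0))  # prints 0.5
```

A game that cannot reach its target frame rate leaves nothing over, which is the case where a second card becomes attractive.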
Rivals :
While the solutions from all three of the current physics news-makers differ quite a bit, AGEIA, NVIDIA and ATI are at least talking about physics, which is good for our industry and gaming in general. As gaming physics becomes more prominent, users will see the advantages of physics acceleration and force game developers and hardware vendors to innovate for the benefit of everyone.
ATI feels that their solution is more than capable of competing with AGEIA, should the physics coding and API work be done to the necessary level. They are going to start providing the tools necessary to do it, though they do have a lot of work to do if they want to catch AGEIA in that department.
Final Take :
ATI is saying that its method for processing physics on the GPU is superior to both AGEIA's and NVIDIA's. According to the company, those who have already purchased any one of the X1800 or X1900 series can rest assured that their investment will last.
Using its proprietary API, ATI is able to offload physics processing to any GPU in a dual-GPU setup, regardless of whether the cards are in CrossFire mode or even from the same family. This way, those who upgrade later can use their existing X1800 or X1900 cards for discrete physics processing while using the newer card for 3D acceleration duties.
As of right now, ATI's method appears to offer the best combination: the benefits of AGEIA's discrete processing as well as the ability to switch between CrossFire and CrossFire + Physics.