News AMD’s Radeon RX 7900 XTX Beats NVIDIA’s GeForce RTX 4090 In DeepSeek’s AI Inference Benchmarks; Here’s How You Can Run R1 On Your Local AMD Machines

bssunilreddy

Keymaster


AMD's Radeon RX 7900 XTX runs the DeepSeek R1 AI model with exceptional performance, beating NVIDIA's GeForce RTX 4090 in inference benchmarks.
Well, DeepSeek's newest AI model has taken the industry by storm, and while many of us are wondering about the computing resources used to train it, it seems an average consumer can get adequate performance for running the model from AMD's "RDNA 3" Radeon RX 7900 XTX GPU. Team Red has shared DeepSeek R1 inference benchmarks comparing its flagship RX 7000 series GPU with NVIDIA's counterpart, showing superior performance across multiple model sizes.

Consumer GPUs have worked out well for many people running AI workloads, mainly because they offer decent performance per dollar compared to mainstream AI accelerators. And by running models locally, you keep your data private, which has been a big concern around DeepSeek's AI models. Fortunately, AMD has published an extensive guide on running DeepSeek R1 distillations on Team Red's GPUs; here are the instructions (a scripted example follows the steps):

Step 1: Make sure you are on the 25.1.1 Optional or higher Adrenalin driver.

Step 2: Download LM Studio 0.3.8 or above from lmstudio.ai/ryzenai

Step 3: Install LM Studio and skip the onboarding screen.

Step 4: Click on the discover tab.

Step 5: Choose your DeepSeek R1 Distill. Smaller distills like the Qwen 1.5B offer blazing fast performance (and are the recommended starting point) while bigger distills will offer superior reasoning capability. All of them are extremely capable.

Step 6: On the right-hand side, make sure the "Q4_K_M" quantization is selected and click "Download".

Step 7: Once downloaded, head back to the chat tab, select the DeepSeek R1 distill from the drop-down menu, and make sure "manually select parameters" is checked.

Step 8: Under "GPU offload layers", move the slider all the way to the max.

Step 9: Click model load.

Step 10: Interact with a reasoning model running completely on your local AMD hardware!
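
If you'd rather talk to the model from a script than through the chat UI, LM Studio can also serve the loaded model over an OpenAI-compatible local REST API. Below is a minimal sketch, assuming the local server is running on LM Studio's default port 1234 and that the model identifier matches what LM Studio lists for your downloaded distill (the id below is illustrative, not guaranteed):

```python
# Minimal sketch: query a locally loaded DeepSeek R1 distill through
# LM Studio's OpenAI-compatible local server (default port 1234).
import requests

resp = requests.post(
    "http://localhost:1234/v1/chat/completions",
    json={
        # Illustrative id; replace with the model id LM Studio shows you.
        "model": "deepseek-r1-distill-qwen-1.5b",
        "messages": [
            {"role": "user", "content": "In two sentences, why run an LLM locally?"}
        ],
        "temperature": 0.6,
    },
    timeout=300,
)
resp.raise_for_status()

# R1 distills emit their reasoning inside <think>...</think> tags before
# the final answer, so printing the raw content shows both.
print(resp.json()["choices"][0]["message"]["content"])
```

The same endpoint works with any OpenAI-compatible client pointed at http://localhost:1234/v1, so nothing in your tooling needs to know the model is running on an AMD GPU.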

Well, if the above instructions don't work out for you, AMD has also published a YouTube tutorial that walks through the individual steps. Make sure to check it out and run DeepSeek's LLMs on your local AMD machine so that your data stays under your control. With the upcoming GPUs from NVIDIA and AMD, we expect inferencing performance to increase massively, given the dedicated AI engines onboard to accelerate such workloads.

Source: https://wccftech.com/amd-radeon-rx-...rtx-4090-in-deepseeks-ai-inference-benchmark/

Btw, this is why NVIDIA's stock fell so much immediately after DeepSeek AI was announced. I was wondering why; now I know.
 
Well, I did hear R1 used different technology and not the CUDA route. Previously, in the self-hosted LLM space, people generally didn't recommend AMD GPUs because all models were built around CUDA. If other models start following the R1 route, that would be great, since it would nearly end NVIDIA's monopoly on GPU hosting in the LLM space, where previously only NVIDIA was considered :)
 
NVIDIA still has a monopoly; their stock price is still around 7-8x higher than in 2023. If you do anything other than gaming, NVIDIA somehow has the edge. They have poured millions into open-source projects like Blender to make NVIDIA GPUs a no-brainer choice.

Still, it's great to see something that rejected their CUDA dominance; soon we might have a good alternative, such as oneAPI, getting more adoption.
 

This is good news! Now hoping they get better at image generation too. There is speculation that the 9070 might come really close to NVIDIA's ray tracing.
 
 
They essentially used something called Parallel Thread Execution (PTX), which is NVIDIA's low-level instruction set. It's like coding in assembly language instead of Python, if that makes sense.
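
To make the analogy concrete (purely illustrative, nothing to do with DeepSeek's actual kernels): Python ships a dis module that shows the low-level bytecode a high-level function compiles down to, which is roughly the relationship PTX has to CUDA C++:

```python
# Purely illustrative analogy: the dis module reveals the low-level
# bytecode behind high-level Python source, much as PTX is the
# low-level form of compiled CUDA C++ code.
import dis

def add(a, b):
    return a + b

# Prints instructions like LOAD_FAST / BINARY_OP / RETURN_VALUE.
dis.dis(add)
```

Dropping down a level like that gives finer control over the hardware at the cost of portability, which is reportedly how DeepSeek squeezed more out of their GPUs.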
 