News NVIDIA’s Blackwell AI Servers Faced With Overheating & Glitching Issues; Major Customers, Including Microsoft & Google, Start Cutting Down Orders

bssunilreddy

Keymaster


NVIDIA Has Started To Delay Blackwell AI Server Orders, Leading Customers To Prefer The Older "Hopper" Generation


NVIDIA's Blackwell AI servers are reportedly facing a supply chain bottleneck as Team Green fails to resolve the overheating and architectural flaws.

Well, this certainly isn't the start NVIDIA would've hoped for with its Blackwell AI lineup, as it looks like Team Green is facing a massive barrier. For those unaware, NVIDIA's Blackwell AI servers were initially expected to enter volume production in Q4 2024, but it was reported at the time that the company's new AI architecture had a design flaw that resulted in higher thermals. Although Team Green claimed to have sorted the issue out, a new report by The Information contradicts this, saying that the Blackwell AI servers are "glitching" out.

According to the report, the first significant shipment of NVIDIA's GB200 AI servers suffered from overheating and glitching issues, with the key problem lying in the "way chips connect." This has troubled major customers like Microsoft, Amazon, Google and Meta, which is why the report claims that these companies have cut down their Blackwell orders. The firms in question are said to have placed orders worth over $10 billion.

The situation is indeed alarming for NVIDIA and its AI business, given that supply chain issues with such products can be devastating for the firm's finances. While the exact problem is still unknown, it was claimed previously that the issue lies with TSMC's advanced packaging technology, CoWoS, which would match the "way chips connect" problem mentioned above. NVIDIA did say previously that it had changed the Blackwell GPU mask made at TSMC, but this apparently hasn't resolved the issue.

Companies are now switching to NVIDIA's well-established alternatives, such as those from the Hopper generation, until Team Green sorts out Blackwell's problems. For now, it is unclear how large an impact the Blackwell design flaw will have on NVIDIA's revenue figures, but given that the company has so far been unable to sort the issues out, Blackwell's success may be at stake here, which would prove troublesome for NVIDIA.

Source: https://wccftech.com/nvidia-blackwell-ai-servers-faced-with-overheating-glitching-issues/