Ever wonder how smart assistants like Siri or Alexa can understand what you’re saying and come up with an answer quickly? These smart assistants, like self-driving cars, rely on powerful machine learning systems and natural language processing. Engineers train these systems to accomplish specific tasks by processing large amounts of data and recognizing patterns in it. As consumer adoption of these technologies continues, demand for high-level artificial intelligence (AI) will also grow.
High-level artificial intelligence requires powerful computing. So, how do you evaluate options for removing 17.6 kW of heat from a chip more than 20 cm on a side that contains 84 integrated circuit (IC) sites? In the webinar Thermal management for AI hardware and electronics cooling for a deep learning machine, Guy Wagner from Electronics Cooling Solutions discusses how Simcenter Flotherm XT computational fluid dynamics (CFD) simulations helped the team evaluate cooling options and determine the best solution.
Evaluating traditional options for air cooling a large, hot chip
First, the team developed quick simulations to assess whether an air-cooled solution could stay within the model constraints. Assuming a reasonable air temperature rise, the team simulated a heat sink with 2,000 CFM (cubic feet per minute) of airflow.
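To get a feel for the numbers, here is a minimal back-of-the-envelope sketch of the bulk air temperature rise when 17.6 kW is carried away by 2,000 CFM of air, using the energy balance Q = ṁ · cp · ΔT. The air density and specific heat are assumed room-temperature values, and the function name is hypothetical; none of this comes from the webinar, and it is no substitute for the CFD model.

```python
# Rough energy balance for the air-cooled case: Q = m_dot * cp * dT.
# Air properties are assumed values near room temperature, not from the webinar.

CFM_TO_M3_PER_S = 0.3048 ** 3 / 60.0   # 1 ft^3/min expressed in m^3/s
AIR_DENSITY = 1.2                      # kg/m^3, assumed (about 20 °C at sea level)
AIR_CP = 1005.0                        # J/(kg*K), assumed


def air_temperature_rise(heat_w: float, airflow_cfm: float) -> float:
    """Bulk air temperature rise (K) if all the heat goes into the airstream."""
    if airflow_cfm <= 0:
        raise ValueError("airflow must be positive")
    mass_flow = airflow_cfm * CFM_TO_M3_PER_S * AIR_DENSITY  # kg/s
    return heat_w / (mass_flow * AIR_CP)


if __name__ == "__main__":
    rise = air_temperature_rise(17_600.0, 2_000.0)
    print(f"Bulk air temperature rise: {rise:.1f} K")
```

Under these assumptions the bulk rise comes out at roughly 15 K. That figure covers only the airstream itself. It says nothing about heat sink performance, noise, fan packaging, or the temperature spread between IC sites.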
Why air cooling wouldn’t work:
- Deafening noise levels
- Unable to fit the fan configuration in a standard 19” rack
- Unable to keep the IC sites within a narrow temperature range (not even close)
So, the team at Electronics Cooling Solutions looked at the next viable option – liquid cooling.
Using CFD simulations to review options for liquid cooling a large, hot chip
The team quickly ruled out air cooling and started building liquid-cooled simulations. In this example, a coolant picks up heat at the cold plate and flows to the heat exchanger (HEX), where the heat is removed. The coolant then flows through the pump and back to the cold plate at the heat source.
- Cold plate: removes the heat load from the electronics at the required flow rate
- Pump: provides the proper pressure and flow rate for the system
- Air-to-liquid HEX: removes heat from the coolant at the available airflow and liquid coolant flow rate
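The link between heat load and the "required flow rate" in the list above can be sketched with the same steady-state energy balance, Q = ṁ · cp · ΔT. The example below is a minimal sketch that assumes plain water as the coolant and uses hypothetical allowed temperature rises across the cold plate. The source does not state the actual coolant, flow rate, or temperature budget, and the function name is illustrative.

```python
# Steady-state energy balance for a single-phase liquid loop: Q = m_dot * cp * dT.
# Coolant properties are assumed (plain water near 25 °C); the source does not
# name the coolant.

WATER_DENSITY = 997.0   # kg/m^3, assumed
WATER_CP = 4180.0       # J/(kg*K), assumed


def coolant_flow_lpm(heat_w: float, allowed_rise_k: float) -> float:
    """Volumetric flow (L/min) needed to carry heat_w at the given coolant rise."""
    if allowed_rise_k <= 0:
        raise ValueError("allowed temperature rise must be positive")
    mass_flow = heat_w / (WATER_CP * allowed_rise_k)    # kg/s
    return mass_flow / WATER_DENSITY * 1000.0 * 60.0    # m^3/s -> L/min


if __name__ == "__main__":
    for rise_k in (2.0, 5.0, 10.0):   # hypothetical allowed rises
        flow = coolant_flow_lpm(17_600.0, rise_k)
        print(f"{rise_k:4.1f} K rise -> {flow:6.1f} L/min")
```

With these assumed properties, a 5 K coolant rise corresponds to about 51 L/min, and halving the allowed rise doubles the required flow. At steady state the HEX must reject the same 17.6 kW to the air that the cold plate picks up from the chip.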
Once the team set up the potential solution, they ran simulations to test the required constraints. Simcenter Flotherm XT modeled the fluid flow through the entire module (including all the micro-channels).
Using CFD simulations to cool the largest chip ever built
What was the team trying to cool? Here are a few stats about Cerebras’ Wafer Scale Engine:
- 46,225 mm² silicon
- 1.2 trillion transistors
- 400,000 AI-optimized cores
- 18 Gigabytes of on-chip memory
- 9 Petabytes/s of memory bandwidth
- 100 Petabytes/s of fabric bandwidth
- TSMC 16 nm process
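Combining the silicon area above with the 17.6 kW heat load and the 84 IC sites mentioned earlier gives a sense of the thermal density involved. The sketch below assumes a square die and a perfectly uniform heat distribution, which is a simplification for illustration only; the function name is hypothetical.

```python
import math

# Figures taken from the article.
DIE_AREA_MM2 = 46_225.0
HEAT_LOAD_W = 17_600.0
IC_SITES = 84


def thermal_density(area_mm2: float, heat_w: float, sites: int):
    """Return (side length in mm, average flux in W/cm^2, average power per site in W).

    Assumes a square die and uniform heating (both simplifications).
    """
    side_mm = math.sqrt(area_mm2)
    flux_w_per_cm2 = heat_w / (area_mm2 / 100.0)   # 100 mm^2 per cm^2
    per_site_w = heat_w / sites
    return side_mm, flux_w_per_cm2, per_site_w


if __name__ == "__main__":
    side, flux, per_site = thermal_density(DIE_AREA_MM2, HEAT_LOAD_W, IC_SITES)
    print(f"Side length:    {side:.0f} mm")
    print(f"Average flux:   {flux:.1f} W/cm^2")
    print(f"Power per site: {per_site:.0f} W")
```

Under these assumptions the die is 215 mm on a side, consistent with "over 20 cm", with an average flux of about 38 W/cm² and roughly 210 W per IC site.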
Are you interested in the details of the simulations? Check out the on-demand webinar: Thermal management for AI hardware and electronics cooling for a deep learning machine.