What are the top challenges of high-performance computing/AI semiconductor package design?

By Keith Felton

In a high-performance processor package, it’s common for the design to contain multiple logic chips that together function as the processor: a modular architecture based on a disaggregated SoC, flanked by high bandwidth memory (HBM) stacks. Such architectures are becoming popular in high-performance computing (HPC) and artificial intelligence (AI) systems. A good example is Intel’s Ponte Vecchio GPU. Typically, these designs route the device’s high-speed, high-bandwidth signals on high-performance interposers, either full-area interposers or bridge interposers such as Intel’s EMIB in the case of Ponte Vecchio, or AMD’s elevated bridge. Full-area interposers are typically silicon and expensive, but lower-cost glass options are starting to emerge and will likely see broader use.

Two of the top challenges: HBM routing and a tool’s ability to deliver the interactive performance needed

These designs are typically large, with very large connectivity structures. After fanout/breakout from the bump pads of the HBM and logic devices, each HBM signal needs to be routed on a single layer, i.e., with no further layer transitions. Because there are multiple channels and typically multiple HBM stacks, it’s useful if your design tool has intelligent channel reuse, where a master channel (think of it as reusable design IP) can first be characterized and verified, then replicated for the other channels. The replicated channels retain their reference to the master channel, so any subsequent modification of the master can be propagated automatically if needed.
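To make the channel reuse idea concrete, here is a minimal sketch of a master channel and instances that keep a reference to it. It is not the data model of any particular design tool: the class names, the mirror option, the coordinates and the 55-unit instance pitch are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]
Trace = List[Point]


@dataclass
class MasterChannel:
    """Hypothetical reusable channel IP, routed once in local coordinates."""
    name: str
    traces: List[Trace]
    revision: int = 0

    def update_traces(self, traces: List[Trace]) -> None:
        """Modify the master; every instance sees the change on its next read."""
        self.traces = traces
        self.revision += 1


@dataclass
class ChannelInstance:
    """A placed copy that references the master instead of duplicating it."""
    master: MasterChannel
    offset: Point
    mirror_x: bool = False

    def placed_traces(self) -> List[Trace]:
        """Return the master's traces transformed to this instance's location."""
        dx, dy = self.offset
        sign = -1.0 if self.mirror_x else 1.0
        return [[(sign * x + dx, y + dy) for x, y in trace]
                for trace in self.master.traces]


# Illustrative values only: one routed trace, replicated for four channels.
master = MasterChannel("hbm_channel_0", [[(0.0, 0.0), (40.0, 0.0)]])
instances = [ChannelInstance(master, (0.0, 55.0 * i)) for i in range(4)]

# A late edit to the master is visible through every instance.
master.update_traces([[(0.0, 0.0), (20.0, 5.0), (40.0, 5.0)]])
for inst in instances:
    print(inst.offset, inst.placed_traces())
```

Because each instance stores only a transform plus a reference, there is a single routed copy to verify, and the edit to the master does not have to be repeated per channel.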

Demanding rules: power, ground structures and filled metal areas

All advanced substrates have demanding rules regarding power and ground structures and filled metal areas. These can include plane hatching with offsets, plane striping, outgassing, metal balancing and rules for thermal tie legs on power and ground pins. Creating these structures can easily take up 50% of the overall design cycle time, so it’s key that their creation be dynamic and produce tapeout-quality results throughout the design process rather than through lengthy post-processing. Otherwise, getting planes and metal-filled areas finalized and tapeout clean will add considerable time to a project. And should a late-stage edit be required, which inevitably happens, you may be forced to regenerate and requalify the planes and metal-filled areas all over again, resulting in more schedule slip.
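As a rough illustration of why hatching and metal balancing interact, the sketch below estimates the metal density of a hatched plane and flags layers outside a density window. It assumes a simple orthogonal cross-hatch with line width w and pitch p, for which the metal fraction is 1 - (1 - w/p)^2. The function names, layer names and the 40% to 80% window are hypothetical; real limits come from your substrate fabricator's rules.

```python
from typing import Dict


def hatch_metal_density(line_width_um: float, pitch_um: float) -> float:
    """Metal fraction of an orthogonal cross-hatch (assumed pattern).

    Each pitch-by-pitch cell has one square opening of side (pitch - width).
    """
    if not 0 < line_width_um <= pitch_um:
        raise ValueError("line width must be positive and no larger than the pitch")
    open_fraction = ((pitch_um - line_width_um) / pitch_um) ** 2
    return 1.0 - open_fraction


def density_violations(layer_density: Dict[str, float],
                       min_density: float,
                       max_density: float) -> Dict[str, float]:
    """Return the layers whose density falls outside [min_density, max_density]."""
    return {layer: d for layer, d in layer_density.items()
            if not min_density <= d <= max_density}


# Illustrative values only; the density window is a made-up example.
layers = {
    "plane_a": hatch_metal_density(line_width_um=50.0, pitch_um=100.0),
    "plane_b": hatch_metal_density(line_width_um=20.0, pitch_um=100.0),
}
for layer, density in density_violations(layers, 0.40, 0.80).items():
    print(f"{layer}: density {density:.2f} is outside the allowed window")
```

A check like this is cheap enough to rerun after every edit, which is the point of generating planes dynamically instead of in a post-processing step.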

Leveraging concurrent design

As I mentioned earlier, these designs are typically large, dense and complex, and challenging for any one designer to undertake. Having multiple designers, perhaps with different areas of expertise, work concurrently can significantly reduce your overall design cycle and help ensure you hit your tapeout date. Another proven way of shrinking design cycles for these types of designs is to automate and customize your design flow and design tools. This helps designers focus on completing their tasks in the most productive manner and can prevent mistakes and the resulting ECOs.

A good example here is the ability to create custom design rule checks (DRCs). All design tools come with a set of standard DRCs, such as metal width and spacing, but often your substrate fabricator, OSAT or foundry will have process-specific rules that require geometry-based checks. These rules are commonly checked during tapeout mask signoff, but any errors detected at that stage will most likely cause a design spin followed by another tapeout and signoff cycle, which can take days or even weeks. Having the ability to create geometry-based DRCs within the design tool itself lets designers detect and address issues during design rather than after tapeout, avoiding a lengthy signoff process and possible manufacturing delays.
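A geometry-based custom check can be quite small. The sketch below is one hypothetical example, not a rule from any specific fabricator or the API of any design tool: it verifies that every via is enclosed by a pad with at least a minimum margin. It assumes axis-aligned rectangles, and the rule name, shape names and the 3-unit margin are invented for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle; a simplifying assumption for this sketch."""
    name: str
    x0: float
    y0: float
    x1: float
    y1: float


def enclosure_margin(inner: Rect, outer: Rect) -> float:
    """Smallest edge-to-edge margin; negative if inner extends past outer."""
    return min(inner.x0 - outer.x0, inner.y0 - outer.y0,
               outer.x1 - inner.x1, outer.y1 - inner.y1)


def check_min_enclosure(vias: List[Rect], pads: List[Rect],
                        min_enclosure: float) -> List[str]:
    """Report each via that no pad encloses by at least min_enclosure."""
    violations = []
    for via in vias:
        best = max((enclosure_margin(via, pad) for pad in pads),
                   default=float("-inf"))
        if best < min_enclosure:
            violations.append(
                f"{via.name}: enclosure {best:.2f} < required {min_enclosure:.2f}")
    return violations


# Illustrative shapes only.
pads = [Rect("P1", 0.0, 0.0, 10.0, 10.0)]
vias = [Rect("V1", 4.0, 4.0, 6.0, 6.0),    # centered in the pad
        Rect("V2", 8.0, 4.0, 10.5, 6.0)]   # pokes out past the pad edge
for message in check_min_enclosure(vias, pads, min_enclosure=3.0):
    print(message)
```

Running a check of this kind inside the design environment surfaces the offending via while it is still a quick edit, rather than at mask signoff.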

In summary, designing packages for the latest HPC/AI devices brings many challenges that may be new to traditional package design teams and may require the adoption of new design tools and workflows.

To learn more you can download: Heterogeneous chiplet design and integration: bringing a new twist to SiP design.

This article first appeared on the Siemens Digital Industries Software blog at