How to Avoid Metastability on Reset Signal Networks, a/k/a Reset Check is the New CDC

It’s axiomatic that digital circuitry must initialize properly before it’s used. Once upon a time, verifying a design’s reset signaling was a pretty straightforward process: simply confirm the continuity of the reset signal from the pad ring to all the IPs and instances inside the DUT. Fast forward to the present, and previously unheard-of bugs on reset signal networks are being created by:

* SoCs composed of many IPs, where third-party IPs can handle clocking and reset differently

* Multiple power shutoff domains, where each domain has its own reset signaling

* Aggressive optimization of reset signaling networks to reduce power and area overhead.

To illustrate, consider the following cases where the reset and clock signals are not always in the same domains, potentially leading to big trouble for the second flip-flop (dff2):

Figure: Questa Reset Check two-flip-flop (2-DFF) example

Case 1: rst1 & rst2 are different asynchronous reset signals, even though both flip-flops are in the same clock domain (i.e., clk1 = clk2)

Case 2: rst1 & rst2 are different asynchronous reset signals and clk1 and clk2 are in different clock domains
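As a rough illustration of the two cases above, here is a minimal Python sketch that labels a path between two flip-flops by comparing their clock and reset labels. The `Flop` record and `classify_crossing` function are hypothetical names invented for this example. They are not part of any tool, and a real analysis works on the elaborated RTL rather than on hand-written labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flop:
    """Hypothetical record describing one flip-flop (illustration only)."""
    name: str
    clk: str                 # clock domain label
    rst: str                 # reset signal name
    rst_async: bool = True   # True if the reset is asynchronous


def classify_crossing(src: Flop, dst: Flop) -> str:
    """Label the src -> dst path using the two cases described in the text."""
    different_async_resets = (
        src.rst != dst.rst and src.rst_async and dst.rst_async
    )
    if not different_async_resets:
        return "no reset domain crossing under this simplified model"
    if src.clk == dst.clk:
        return "case 1: different async resets, same clock domain"
    return "case 2: different async resets, different clock domains"


dff1 = Flop("dff1", clk="clk1", rst="rst1")
dff2_same_clk = Flop("dff2", clk="clk1", rst="rst2")
dff2_other_clk = Flop("dff2", clk="clk2", rst="rst2")

print(classify_crossing(dff1, dff2_same_clk))
print(classify_crossing(dff1, dff2_other_clk))
```

The sketch only compares labels. It says nothing about whether a given crossing is actually protected by a synchronizer or by reset sequencing.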

Alternatively, to borrow from the DVCon USA 2016 paper on this subject, “Reset and Initialization, the Good, the Bad and the Ugly” by Ping Yeung of Mentor Graphics and Kaowen Liu of MediaTek, design registers can be sorted into three types: GOOD registers (those that are initialized properly), BAD registers (those that are not initialized) and UGLY registers (those that are initialized, but are subsequently corrupted).
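To make the three categories concrete, here is a toy Python sketch that sorts a register into GOOD, BAD or UGLY from a per-cycle value trace in which `"x"` stands for an unknown value. The trace format, the `reset_release` index and the function name are assumptions made for this example, not definitions from the paper, and an X-based model is only a crude stand-in for real corruption.

```python
def classify_register(trace, reset_release):
    """Classify one register from a list of per-cycle values.

    trace          -- list of "0", "1" or "x" (unknown), one entry per cycle
    reset_release  -- index of the first cycle after reset is deasserted
    """
    if not 1 <= reset_release <= len(trace):
        raise ValueError("reset_release must point inside the trace")

    # Initialized means the register holds a known value in the last reset cycle.
    initialized = trace[reset_release - 1] != "x"
    if not initialized:
        return "BAD"
    # Initialized, but an unknown value shows up after reset is released.
    if "x" in trace[reset_release:]:
        return "UGLY"
    return "GOOD"


print(classify_register(["x", "0", "0", "1", "1"], reset_release=2))  # GOOD
print(classify_register(["x", "x", "x", "1", "0"], reset_release=2))  # BAD
print(classify_register(["x", "0", "0", "x", "x"], reset_release=2))  # UGLY
```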

The bottom line is that independent “reset domains” can give rise to metastability and signal reconvergence issues similar to clock domain crossing (CDC) bugs. The proliferation of reset domains, and the increasing complexity of reset signaling topologies, means that manual inspection and verification are unsafe, creating considerable risk of unpredictable chip behavior when samples come back. Even worse, as with CDC, the metastability induced by mixing independent reset domains cannot be modeled accurately in simulation.

So what can be done?

“Simple”: leverage time-tested structural and formal CDC analysis techniques!

First, reset signaling networks look a lot like clock trees, which gives rise to the concept of reset domains, analogous to the clock domains used in a CDC analysis. Plus, as with CDC verification, automated, exhaustive formal analysis can be applied. Building on over a decade of experience with formal-based CDC verification, Mentor has developed a fully automated formal solution that exhaustively verifies your reset signaling network.

Questa Reset Check app block diagram

Specifically, taking RTL as input, the Questa Reset Check app automatically performs an exhaustive, bottom-up reset tree analysis – inferring all the reset structures of a design, including gating and control logic. The app then automatically generates and proves assertions that cover numerous reset-specific structural checks. No knowledge of formal or property specification languages is required.
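To give a feel for what tracing a reset tree back through gating logic involves, here is a minimal Python sketch that walks from a register’s reset net back to its root sources. The `drivers` map, the net names and the `reset_sources` function are all hypothetical and hand-written for this example. This is not how the Questa Reset Check app represents a design, and the sketch assumes the reset logic is acyclic.

```python
# Hypothetical driver map: net name -> (gate type, list of input nets).
# Nets that do not appear as keys are treated as root reset sources.
drivers = {
    "rst_core_n": ("and", ["por_n", "sw_rst_n"]),
    "rst_periph_n": ("buf", ["rst_core_n"]),
}


def reset_sources(net, drivers):
    """Return the set of root nets that can influence the given reset net."""
    if net not in drivers:
        return {net}  # no driver listed, so this net is a root source
    _gate, inputs = drivers[net]
    sources = set()
    for input_net in inputs:
        sources |= reset_sources(input_net, drivers)
    return sources


print(sorted(reset_sources("rst_periph_n", drivers)))
```

Registers whose reset pins trace back to the same set of root sources are natural candidates for sharing a reset domain. A real tool also has to account for polarity, synchronizers and control logic, which this sketch ignores.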

The results include reports on the synchronicity, polarity and set/reset functionality of all reset tree register nodes; any reset domain crossing (RDC) signals of adjacent registers within the same clock domain; and the complete matrix of clock and reset signals’ relationships.
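As a simplified picture of a clock/reset relationship matrix, the Python sketch below groups a hand-written register list by (clock, reset) pair and then lists which resets appear in each clock domain. The register tuples and function names are assumptions for illustration only. The app’s actual report format is not reproduced here.

```python
from collections import defaultdict

# Hypothetical register list: (register name, clock domain, reset signal).
registers = [
    ("dff1", "clk1", "rst1"),
    ("dff2", "clk1", "rst2"),
    ("dff3", "clk2", "rst2"),
]


def clock_reset_matrix(regs):
    """Map each (clock, reset) pair to the registers that use it."""
    matrix = defaultdict(list)
    for name, clk, rst in regs:
        matrix[(clk, rst)].append(name)
    return dict(matrix)


def resets_per_clock(regs):
    """Map each clock domain to the sorted list of resets seen in it."""
    seen = defaultdict(set)
    for _name, clk, rst in regs:
        seen[clk].add(rst)
    return {clk: sorted(rsts) for clk, rsts in seen.items()}


matrix = clock_reset_matrix(registers)
by_clock = resets_per_clock(registers)

for clk, rsts in by_clock.items():
    note = "  <- more than one reset in this clock domain" if len(rsts) > 1 else ""
    print(f"{clk}: {', '.join(rsts)}{note}")
```

A clock domain that lists more than one reset is where crossings like Case 1 can occur, so it is a reasonable place to start reviewing.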

Back to our original example above, consider the following results where the detected violations are tabulated by the app:

Questa Reset Check results example

The first class of error shown is the case where the IPs are in the same clock domain but are connected to different asynchronous reset signals. This is “BAD” and will lead to chip-killing metastability on the reset signals.

Another class of violation, reported in the lower part of the image, covers the cases where asynchronous and synchronous resets are wired up in the same clock domain. Depending on the design configuration, this might not be a bad thing; if it is OK, the tool supports the creation of a “waiver” so that the connection is no longer marked as an error.
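Conceptually, a waiver is just a reviewed exception that is filtered out of the open violation list. The Python sketch below shows that idea with made-up violation records and a made-up waiver key. It is not the tool’s waiver syntax or report format, which are not described in this post.

```python
# Hypothetical violation records (illustration only).
violations = [
    {"id": "V1", "kind": "different_async_resets_same_clock",
     "src": "dff1", "dst": "dff2"},
    {"id": "V2", "kind": "mixed_sync_async_reset_same_clock",
     "src": "dff3", "dst": "dff4"},
]

# Hypothetical waivers, keyed by (violation kind, source, destination).
waivers = {("mixed_sync_async_reset_same_clock", "dff3", "dff4")}


def apply_waivers(violations, waivers):
    """Split violations into those still open and those covered by a waiver."""
    still_open, waived = [], []
    for violation in violations:
        key = (violation["kind"], violation["src"], violation["dst"])
        if key in waivers:
            waived.append(violation)
        else:
            still_open.append(violation)
    return still_open, waived


still_open, waived = apply_waivers(violations, waivers)
print("open:", [v["id"] for v in still_open])
print("waived:", [v["id"] for v in waived])
```

Keeping waived items in a separate list, rather than deleting them, preserves a record of what was reviewed and accepted.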

As with CDC verification’s many layers of complexity, these two cases are only the starting point of a proper reset signaling analysis; there are many other second-order effects that Questa Reset Check detects and reports. Only an exhaustive formal analysis can verify all of this with mathematical certainty, and thus the Questa Reset Check app was created to help customers address these challenges.

Are you seeing “reset domain crossing” (RDC) verification issues appearing in your SoC projects? Please share your thoughts in the comments below, or contact me offline.

Until next time, may your coverage be high and your power consumption be low,

Joe Hupcey III

References:

DVCon USA 2016: “Reset and Initialization, the Good, the Bad and the Ugly,” Ping Yeung (Design & Verification Technology, Mentor Graphics, Fremont, U.S.A.) and Kaowen Liu (Design Technology Division, MediaTek Inc., San Jose, U.S.A.)

DVCon USA 2015: “Addressing the Challenges of Reset Verification in SoC Designs,” Chris Kwok, Priya Viswanathan, and Ping Yeung (Design Verification and Technology, Mentor Graphics Corporation)

Video from Verification Academy at DAC 2016: “Orange is the New Black, Reset Verification is the New CDC”

Comments

2 thoughts on “How to Avoid Metastability on Reset Signal Networks, a/k/a Reset Check is the New CDC”
  • Kapil Nagdive

    I am getting the following error while trying to compile library files. The error is encountered while using the “netlist load lib” command.

    # Command: netlist load lib
    # Command arguments:
    # LICENSE: ERROR: Transaction request failed
    # // License request for zncompccl feature failed
    # FLEX ERROR: -15, feature: zncompccl, checkout none, server: 27000@10.100.2.12
    # , reason: Cannot connect to license server system.
    # FLEX ERROR: -15, feature: zncompccl, checkout none, server: 27000@10.100.2.14
    # , reason: Cannot connect to license server system.
    # FLEX ERROR: -15, feature: zncompccl, checkout none, server: 29000@10.100.2.20
    # , reason: Cannot connect to license server system.
    # Summary: 0 Fatals, 0 Errors, 0 Warnings in netlist load lib
    # Fatal : No fatal messages have been queued but command returned failure. [command-32]
    # Final Process Statistics: Max memory 281MB, CPU time 0s, Total time 120s
    # End of log Sun Aug 9 07:37:51 2020

  • Joseph V Hupcey III

    Hi Kapil,

    Please file a support request at http://supportnet.mentor.com/ so the Technical Support team can follow up with you on this issue.

    Joe
