Is Engineering Best Judgement Really Best? (Part 2)

By Consulting Services

by Chuck Battikha, Solutions Architect for Verification & Functional Safety, Siemens EDA Services

Building on what we discussed in last week’s entry, Is Engineering Best Judgement Really Best? (Part 1), let’s look at a real example: a register file (RF). For this example, the registers are protected by parity, a logical choice given the large number of flip-flops and relatively continuous write/read operations.
The following is a block diagram.

The design appears simple enough and is probably one line item in an FMEDA.

Now, one engineer might consider this RF, dominated as it is by registers, to be memory and thus use Part 11-5.1.13.6 of the standard as justification to allocate a LOW DC (60%) to the block. Meanwhile, another engineer may create their own rationale and argue for a MEDIUM DC (90%); after all, for a single-bit fault (whether transient or stuck-at), parity is pretty good. While it’s weak against multiple-bit faults, remember we’re only checking for single faults. Also, because the RF is read regularly, the time to detect a fault will be short.
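To make the second engineer's argument concrete, here is a minimal Python sketch of even parity over one stored word. The 8-bit word and the helper names are hypothetical and not taken from the design; the sketch only illustrates that a parity check flags every single-bit flip in the word but no double-bit flip.

```python
from itertools import combinations


def even_parity(bits):
    """Return the parity bit that makes the total number of 1s even."""
    return sum(bits) % 2


def parity_alarm(bits, stored_parity):
    """True when the recomputed parity disagrees with the stored parity bit."""
    return even_parity(bits) != stored_parity


def flip(bits, positions):
    """Return a copy of bits with the given bit positions inverted."""
    return [b ^ 1 if i in positions else b for i, b in enumerate(bits)]


# Hypothetical 8-bit register value and the parity bit written alongside it.
word = [1, 0, 1, 1, 0, 0, 1, 0]
stored = even_parity(word)

# Every single-bit flip changes the parity, so the alarm always fires.
single_detected = all(
    parity_alarm(flip(word, {i}), stored) for i in range(len(word))
)

# A double-bit flip leaves the parity unchanged, so the alarm never fires.
double_detected = any(
    parity_alarm(flip(word, set(pair)), stored)
    for pair in combinations(range(len(word)), 2)
)

print(single_detected, double_detected)  # True False
```

Note that this models only faults in the stored data bits themselves, which is exactly the assumption the MEDIUM estimate rests on.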

So, which of these is correct?

Getting to the answer…

Let’s find out. We can easily set up our Kaleidoscope fault simulator tool to run a fault injection campaign on the design. Our test case for the block simulates a read/write sequence to the register file module, which is then captured in VCD (Value Change Dump) format and used in the Kaleidoscope tool. A list of faults is then generated (using our SafetyScope tool) and provided to the fault simulator, with the parity error being the alarm. The simulator injects each fault and determines whether it propagates to the output of the module and whether a parity alarm is generated as a result. When a fault propagates, it is assumed to be unsafe and to impact a safety goal of the design. If the fault does not propagate, it is classified as “safe”.
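The classification described above can be sketched in a few lines of Python. This is not the Kaleidoscope or SafetyScope interface; the `FaultResult` record, the fault names, and the outcomes are made up for illustration. The sketch also assumes the common convention that diagnostic coverage is the share of propagating (unsafe) faults that raise the alarm.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class FaultResult:
    """Hypothetical per-fault outcome of one injection run."""
    name: str
    propagated: bool  # did the fault reach the module output?
    alarm: bool       # did the parity alarm fire?


def classify(result):
    """Safe if the fault never propagates; otherwise detected or undetected."""
    if not result.propagated:
        return "safe"
    return "detected" if result.alarm else "undetected"


def diagnostic_coverage(results):
    """Detected unsafe faults divided by all unsafe faults (assumed definition)."""
    counts = Counter(classify(r) for r in results)
    unsafe = counts["detected"] + counts["undetected"]
    return counts["detected"] / unsafe if unsafe else None


# Made-up campaign results, for illustration only.
campaign = [
    FaultResult("rf.reg[3].q[7] stuck-at-1", propagated=True, alarm=True),
    FaultResult("rf.rd_mux.sel[0] stuck-at-0", propagated=True, alarm=False),
    FaultResult("rf.wr_addr_stage[2] stuck-at-1", propagated=True, alarm=False),
    FaultResult("rf.reg[9].q[0] stuck-at-0", propagated=False, alarm=False),
]

dc = diagnostic_coverage(campaign)
print(f"DC = {dc:.1%}")
```

Safe faults are left out of the denominator here, so a weak test case that propagates few faults can distort the result in either direction.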

Running through this scenario, the initial results were poor. This happened for a couple of reasons:

● The test case was not nearly as robust as it should have been. The quality of a fault campaign is only as good as the stimulus being supplied.

● The design included debug logic that should not be considered in the fault campaign, as it has no impact on the design’s overall safety goals.

Armed with this knowledge, we then give it another spin!

And the results are still poor.

As it turns out, the coverage for permanent faults is certainly well below the MEDIUM (90%) that one engineer may have estimated and doesn’t even reach LOW (60%); it is between 30% and 50% for this module. On the other hand, coverage for transients would be closer to 95%.
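As a quick check of where these numbers land, the following Python sketch maps a measured coverage onto the DC levels. The LOW (60%) and MEDIUM (90%) thresholds are the ones quoted in this post; the 99% threshold for HIGH is the value commonly used with the standard and is an assumption here, as is the function name.

```python
# Thresholds as fractions, highest first. HIGH = 99% is assumed, not quoted above.
DC_LEVELS = [("HIGH", 0.99), ("MEDIUM", 0.90), ("LOW", 0.60)]


def dc_level(coverage):
    """Return the highest DC level whose threshold the coverage meets."""
    for name, threshold in DC_LEVELS:
        if coverage >= threshold:
            return name
    return "below LOW"


measured = [
    ("permanent faults, low end", 0.30),
    ("permanent faults, high end", 0.50),
    ("transient faults", 0.95),
]

for label, coverage in measured:
    print(f"{label}: {coverage:.0%} -> {dc_level(coverage)}")
```

On this mapping, the permanent fault coverage supports neither engineer's estimate, while the transient coverage is consistent with MEDIUM.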

Why did THAT happen?

The reasons for these outcomes become clearer when you dive into the details of the design. While it is true that the flip-flops protected with parity make up 95% of the storage elements in the design (1024 of 1076 flip-flops), there are a significant number of additional gates/transistors in the design that pull down the permanent fault coverage dramatically. The additional (unprotected) logic on each register includes:

● Address staging

● Data staging

● Muxing and selection logic (for directing writes and reads to/from the proper register)

● Pipelining control

● Parity Generation (Safety Mechanism)

● Parity Checking (Safety Mechanism)

The following table shows a typical breakdown of protected and unprotected logic. Synthesis for timing, area, power, etc. will clearly change actuals and the final transistor count (TC) will be based on the specific technology used. However, the table illustrates the key point: that Engineering Best Judgement can easily miss details.

Gate	Instances	TC per Instance	Total TC
FF (unprotected)	52	34	1,768
FF (parity-protected)	1024	34	34,816
2:1 MUX	2244	12	26,928
NOT	8	2	16
AND	15	6	90
OR	96	4	384
XOR	1057	12	12,684
*Total*	4,502		76,722
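A short Python sketch shows why the table matters. It compares the protected share counted by flip-flops with the protected share counted by transistors. Two things here are assumptions: the labeling of the two FF rows as unprotected (52) and parity-protected (1024), which is inferred from the 1024-of-1076 figure above, and the simplification that permanent faults are spread in proportion to transistor count. Also note that the per-row products sum to 76,686, slightly below the total listed in the table; the resulting share is about 45% either way.

```python
# (gate, instances, transistors per instance), copied from the table above.
rows = [
    ("FF (unprotected)", 52, 34),
    ("FF (parity-protected)", 1024, 34),
    ("2:1 MUX", 2244, 12),
    ("NOT", 8, 2),
    ("AND", 15, 6),
    ("OR", 96, 4),
    ("XOR", 1057, 12),
]

# Total transistor count recomputed from the rows.
total_tc = sum(count * tc for _, count, tc in rows)

# Transistors that sit inside parity-protected flip-flops.
protected_tc = sum(
    count * tc for gate, count, tc in rows if gate == "FF (parity-protected)"
)

share_by_ff = 1024 / (1024 + 52)       # what a flip-flop count suggests
share_by_tc = protected_tc / total_tc  # what a transistor count suggests

print(f"Protected share of flip-flops:  {share_by_ff:.1%}")
print(f"Protected share of transistors: {share_by_tc:.1%}")
```

Counted by transistors, fewer than half of the potential permanent fault sites are covered by parity, which lines up with the 30% to 50% measured in the fault campaign.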

This is but one (intentionally simplistic) example that both captures key points and illustrates the challenges of Engineering Best Judgement, most notably that it:

● Is highly subjective

● Is non-deterministic

● Requires deep analysis to capture all details and ensure accuracy

● Is effort intensive

The conclusion is that fault injection campaigns should be used routinely, both to provide unbiased answers that accurately reflect the design details and to make the process of safety analysis repeatable and deterministic.

If you need assistance in building and executing a fault campaign or optimizing your functional safety methodology, Siemens EDA Consulting and Learning services are here to help. With years of Functional Safety expertise, coupled with extensive verification methodology and tool knowledge, we are positioned to help you meet your business and technical targets.

We invite you to contact us at: SiemensEDAServices.sisw@siemens.com.

In the meantime, you can learn more about easing the FMEDA process in this free whitepaper: PUSH-BUTTON FMEDAS FOR AUTOMOTIVE SAFETY AUTOMATING A TEDIOUS TASK, written by our Siemens Consulting and Learning Services experts.
