RISC-V – It’s not just about the core, it’s also about the system

By Huw Geddes

As more companies design new many-core architectures to gain an advantage over competitors, these devices bring more complexity to both hardware and software, all of which needs to be verified, debugged and optimized.

With more shared resources and complex systems including 2.5D and 3D integrations, unexpected behavior such as deadlocks and performance problems may occur because of memory-locking interactions or processors that spend millions of cycles waiting for data. Precise optimization of both hardware and software is required to integrate custom compute and additional software libraries.

Classic CPU run control is not sufficient when debugging and optimizing code running on these many-core systems. In a single-core environment, the speed at which a debugger can issue a breakpoint command, receive a response and then inform the host application may be fast enough for engineers to root-cause problems. But in a many-core environment, the breakpoint may need to be distributed over many cores running in parallel, and the command has to perform the same round trip for every core in the distribution, so the total time to stop the system grows with the number of cores. Multiple levels of logic hierarchy increase the response time further, and interdependent processes add unpredictability to the mix.
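To make the scale of the problem concrete, here is a minimal back-of-the-envelope sketch. It assumes the simplest possible model, in which the debugger halts cores strictly one after another and each halt costs one full round trip. The function name and the figures (8 cores, a 100 microsecond round trip, a 1 GHz clock) are illustrative assumptions, not measurements from any real debug probe.

```python
def halt_skew_cycles(n_cores: int, round_trip_s: float, clock_hz: float) -> list[int]:
    """Cycles each core keeps running after core 0 has halted.

    Assumes a purely serial model: core i is halted i round trips
    after core 0. Real debug hardware may behave differently.
    """
    return [round(i * round_trip_s * clock_hz) for i in range(n_cores)]


# Illustrative numbers only (assumptions, not measured values).
skew = halt_skew_cycles(n_cores=8, round_trip_s=100e-6, clock_hz=1e9)
for core, cycles in enumerate(skew):
    print(f"core {core}: ran {cycles} extra cycles before halting")
```

Under these assumed numbers, the last core executes hundreds of thousands of cycles after the first one has stopped, so the state the engineer inspects is no longer a consistent snapshot of the system.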

Taken together, these factors mean there is a very low probability of determining the root cause of many software issues using classical CPU-based run control debug techniques.

Functional analysis

Critically, the hardware-software optimization of a many-core device requires far more functional analysis because of the complexities and dependencies created when combining processors, hardware blocks, memory, peripherals and software into a system. New and existing toolchains must be optimized so that third-party libraries and embedded application code can take maximum advantage of the target architecture.

Delivering on the promised compute performance requires iterations of hardware and software optimization and testing that start in SoC emulation and continue through the test chip and into the field. System architects must be able to run functional analysis of the chip to answer questions such as: how effectively is data shared across the main interconnect, is the NoC balanced, how efficient is the branch predictor, is the code partitioning optimal, and are there any potential SRAM bottlenecks? As well as debugging their code, software engineers need to understand functional behavior, such as how events across all SoC components correlate when an application crashes, whether any threads overrun a timing window, and which block of memory to use for data.

Debugging CPUs is a well-defined activity, but it gives only an indication of a complex many-core device’s potential performance. All the other components in the system must work together to take advantage of the extra processing performance delivered by the latest CPUs. Monitoring the system interconnects and NoCs while the device runs under real-life stimulus and workloads, and then analyzing the captured data, can provide deep insight into the actual device behavior and the likely causes of unexpected behavior.

Engineers can use the data to identify many hard-to-track bugs, long-tail issues that occur intermittently, and performance optimizations that would not be visible using traditional run control debug techniques.

The Tessent Embedded Analytics solution

The Tessent Embedded Analytics Bus Monitor can be used to capture the data required to analyze many-core systems. It provides complete, transaction-level visibility of SoC activity across AXI and ACE buses, even where multiple outstanding transactions are in flight simultaneously and may complete out of order. The NoC Monitor provides transaction-level visibility of the SoC bus activity for devices using the Arm® AMBA® 5 Coherent Hub Interface (CHI). The Status Monitor, an on-device logic analyzer, provides visibility and monitoring of any circuitry within an SoC.
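As an illustration of why out-of-order completion complicates analysis, the following minimal sketch pairs requests with responses in a captured trace and computes per-transaction latency. The trace format (tuples of cycle, event kind and transaction ID) and the function name are hypothetical and do not represent the Bus Monitor's actual output. The sketch also assumes AXI-style ordering, where responses that share an ID return in the order their requests were issued, while different IDs may complete in any order.

```python
from collections import defaultdict, deque


def transaction_latencies(events):
    """Pair each response with the oldest outstanding request of the same ID.

    `events` is an iterable of (cycle, kind, txn_id) tuples, where kind is
    "req" or "resp". This format is hypothetical, chosen for illustration.
    Returns a list of (txn_id, start_cycle, latency_cycles) in completion order.
    """
    outstanding = defaultdict(deque)  # txn_id -> FIFO of request cycles
    latencies = []
    for cycle, kind, txn_id in events:
        if kind == "req":
            outstanding[txn_id].append(cycle)
        elif kind == "resp":
            if not outstanding[txn_id]:
                raise ValueError(f"response for ID {txn_id} has no matching request")
            start = outstanding[txn_id].popleft()
            latencies.append((txn_id, start, cycle - start))
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
    return latencies


# Three requests in flight at once; ID 2 completes before either ID 1 request.
trace = [
    (0, "req", 1),
    (2, "req", 2),
    (3, "req", 1),
    (5, "resp", 2),
    (9, "resp", 1),
    (12, "resp", 1),
]
for txn_id, start, latency in transaction_latencies(trace):
    print(f"ID {txn_id} issued at cycle {start}: {latency} cycles")
```

Keeping a separate queue per ID is the design choice that matters here: matching responses to requests by arrival order alone would attribute the wrong latency to any transaction that was overtaken.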

Combined with smart software, the Embedded Analytics platform of hardware IP provides a functional analysis solution that goes far beyond monitoring on-chip process parameters. It provides full system-level visibility, which enables optimization throughout the lifecycle of the device.


Tessent Embedded Analytics from Siemens EDA offers an integrated range of hardware and software tools that accelerate debug of RISC-V based SoCs.

This article first appeared on the Siemens Digital Industries Software blog at