Running CoreMark on NaxRiscv Simulator

We built a Verilator simulator for NaxRiscv, an out-of-order (OoO) superscalar RISC-V CPU, ran CoreMark on it, and confirmed that it reproduces NaxRiscv's nominal score of 4.70 CoreMark/MHz.

The NaxRiscv repository is still a work in progress (WIP), but you can already run Linux on the simulator. NaxRiscv is also integrated into LiteX, so you can create gateware for FPGA boards.

The featured image shows the simulator's log output visualized with Konata, an instruction pipeline visualizer.

NaxRiscv

NaxRiscv is a RISC-V CPU being developed by Charles Papon, the developer of VexRiscv, a 32-bit RISC-V CPU. Like VexRiscv, it is written in SpinalHDL, a hardware description language.

We had assumed that VexRiscv and NaxRiscv differ in the same way as Rocket and BOOM (Berkeley Out-of-Order Machine) from the University of California, Berkeley (UCB): the former is an in-order scalar core and the latter an out-of-order superscalar core. However, NaxRiscv also takes on new work that VexRiscv did not cover, such as 64-bit (RV64) support.

In terms of performance, the nominal values for 32-bit NaxRiscv are listed below. For comparison, the nominal value for BOOMv3 (SonicBOOM) is 6.2 CoreMark/MHz.

  • 2 execution units: 4.22 CoreMark/MHz
  • 3 execution units: 4.70 CoreMark/MHz

CoreMark on Verilator Simulator

The NaxRiscv repository on GitHub includes instructions for building the simulator, and a pre-built CoreMark binary is also available, so we tried to reproduce the nominal score.

$ ./obj_dir/VNaxRiscv --load-elf ../../../../ext/NaxSoftware/baremetal/coremark/build/rv32im/coremark.elf --pass-symbol=pass
2K performance run parameters for coremark.
CoreMark Size    : 666
Total ticks      : 2129108
Total time (secs): 2129108.000000
Iterations/Sec   : 0.000005
Iterations       : 10
Compiler version : GCC11.1.0
Compiler flags   : -DPERFORMANCE_RUN=1  -march=rv32im -mabi=ilp32 -Wno-pointer-to-int-cast -Wno-int-to-pointer-cast -I../driver -O3 -fno-common -funroll-loops -finline-functions -falign-functions=16 -falign-jumps=4 -falign-loops=4 -finline-limit=1000 -fno-if-conversion2 -fselective-scheduling -fno-reg-struct-return -fno-rename-registers --param case-values-threshold=8 -fno-crossjumping -freorder-blocks-and-partition -fno-tree-loop-if-convert -fno-tree-sink -fgcse-sm -fno-strict-overflow -DCORE_DEBUG=0  -lgcc -lc -nostartfiles -ffreestanding -Wl,-Bstatic,-T,../common/app.ld,-Map,coremark.map,--print-memory-usage
Memory location  : STACK
seedcrc          : 0xe9f5
[0]crclist       : 0xe714
[0]crcmatrix     : 0x1fd7
[0]crcstate      : 0x8e3a
[0]crcfinal      : 0xfcaf
Correct operation validated. See README.md for run and reporting rules.
CoreMark 1.0 : 0.000005 / GCC11.1.0 -DPERFORMANCE_RUN=1  -march=rv32im -mabi=ilp32 -Wno-pointer-to-int-cast -Wno-int-to-pointer-cast -I../driver -O3 -fno-common -funroll-loops -finline-functions -falign-functions=16 -falign-jumps=4 -falign-loops=4 -finline-limit=1000 -fno-if-conversion2 -fselective-scheduling -fno-reg-struct-return -fno-rename-registers --param case-values-threshold=8 -fno-crossjumping -freorder-blocks-and-partition -fno-tree-loop-if-convert -fno-tree-sink -fgcse-sm -fno-strict-overflow -DCORE_DEBUG=0  -lgcc -lc -nostartfiles -ffreestanding -Wl,-Bstatic,-T,../common/app.ld,-Map,coremark.map,--print-memory-usage / STACK
4.70 Coremark/MHz
SUCCESS ???

The pre-built CoreMark, compiled with GCC 11.1.0, also scored 4.70 CoreMark/MHz, matching the nominal value.
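
As a sanity check, the score can be recomputed from the numbers in the log above. The sketch below assumes that one tick corresponds to one CPU clock cycle (which would explain why the log reports the same value for total ticks and total time) and that CoreMark/MHz is the number of iterations per million cycles. The function name is our own and is not part of CoreMark or NaxRiscv.

```python
def coremark_per_mhz(iterations: int, total_ticks: int) -> float:
    """Return iterations completed per one million clock cycles.

    Assumption: total_ticks counts CPU clock cycles.
    """
    return iterations / (total_ticks / 1_000_000)


# Values taken from the simulator log: "Iterations" and "Total ticks".
iterations = 10
total_ticks = 2_129_108

score = coremark_per_mhz(iterations, total_ticks)

# With one tick treated as one second, this is the "Iterations/Sec" line.
iterations_per_tick = iterations / total_ticks

print(f"{score:.2f} CoreMark/MHz")
print(f"Iterations/Sec: {iterations_per_tick:.6f}")
```

Under these assumptions the recomputed value rounds to the 4.70 CoreMark/MHz printed by the simulator, and the tiny Iterations/Sec figure in the log is simply the same ratio without the scaling to one million cycles.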

Summary

We built a Verilator simulator for NaxRiscv, an OoO superscalar RISC-V CPU, ran CoreMark on it, and confirmed that it reproduces NaxRiscv's nominal score of 4.70 CoreMark/MHz.