Running Debian on FPGA with RISC-V OoO CPU


We have successfully built the gateware for RV64GC NaxRiscv, a RISC-V OoO CPU, for a Digilent FPGA board and run Debian on it.



NaxRiscv is an out-of-order (OoO) superscalar RISC-V CPU that is integrated into LiteX, an SoC builder.

As introduced in the article Benchmarks on RISC-V OoO Simulator, NaxRiscv supports 32-bit and 64-bit RISC-V.

This time, we built the SoC gateware with RV64GC NaxRiscv for Digilent’s Nexys Video FPGA board.

Debian on FPGA with LiteX-NaxRiscv

Debian runs from a microSD card. We prepared the card with the kernel, rootfs, and other files downloaded from the link on the Hardware page of the NaxRiscv documentation.

If you change the SoC configuration or use an FPGA board other than Nexys Video, you need to adjust the device tree (dts/dtb) accordingly.

Benchmarks on Debian

We loaded the gateware on the FPGA board and ran the benchmarks CoreMark and Whetstone on Debian.


The following shows the console output when running CoreMark built with almost the same options as NaxSoftware.

root@sid-rv64:~# ./coremark_nax64gc
2K performance run parameters for coremark.
CoreMark Size    : 666
Total ticks      : 13289
Total time (secs): 13.289000
Iterations/Sec   : 451.501242
Iterations       : 6000
Compiler version : GCC11.1.0
Compiler flags   : -DPERFORMANCE_RUN=1 -mcmodel=medany -Wno-pointer-to-int-cast -Wno-int-to-pointer-cast -O3 -fno-common -funroll-loops -finline-functions -falign-functions=16 -falign-jumps=4 -falign-loops=4 -finline-limit=1000 -fno-if-conversion2 -fselective-scheduling -fno-crossjumping -freorder-blocks-and-partition   -lrt
Memory location  : Please put data memory location here
			(e.g. code in flash, data on heap etc)
seedcrc          : 0xe9f5
[0]crclist       : 0xe714
[0]crcmatrix     : 0x1fd7
[0]crcstate      : 0x8e3a
[0]crcfinal      : 0xa14c
Correct operation validated. See readme.txt for run and reporting rules.
CoreMark 1.0 : 451.501242 / GCC11.1.0 -DPERFORMANCE_RUN=1 -mcmodel=medany -Wno-pointer-to-int-cast -Wno-int-to-pointer-cast -O3 -fno-common -funroll-loops -finline-functions -falign-functions=16 -falign-jumps=4 -falign-loops=4 -finline-limit=1000 -fno-if-conversion2 -fselective-scheduling -fno-crossjumping -freorder-blocks-and-partition   -lrt / Heap

With nearly the same options as NaxSoftware, CoreMark scores 450-452. Since the operating frequency is 100 MHz, this corresponds to 4.50-4.52 CoreMark/MHz, close to the 4.59 CoreMark/MHz obtained on NaxRiscv’s RV64GC simulator.
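The CoreMark/MHz figure can be checked by hand from the console output. The following is a minimal sketch, assuming only the iteration count and total time printed in the run above and the 100 MHz operating frequency; the variable names are illustrative.

```python
# Values copied from the CoreMark console output of the run shown above.
iterations = 6000
total_time_s = 13.289

# Operating frequency of the SoC, as stated in the text.
clock_mhz = 100.0

# CoreMark reports iterations per second as its score.
iterations_per_sec = iterations / total_time_s

# Normalizing by the clock gives the frequency-independent figure.
coremark_per_mhz = iterations_per_sec / clock_mhz

print(f"Iterations/Sec: {iterations_per_sec:.6f}")
print(f"CoreMark/MHz  : {coremark_per_mhz:.3f}")
```

The same normalization applies to any other run: divide the reported Iterations/Sec by the clock frequency in MHz.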

When built with only -O3 -funroll-loops, CoreMark scores 360-370, about 20% lower, so the compiler options seem to have a large impact.


The following shows the console output when running Whetstone built with the -O2 option.

root@sid-rv64:~# ./whetstone 100000

Loops: 100000, Iterations: 1, Duration: 59 sec.
C Converted Double Precision Whetstones: 169.5 MIPS

Whetstone scores 169.5-172.4 WMIPS, which is 1.69-1.72 WMIPS/MHz at 100 MHz. NaxRiscv’s RV64GC simulator scored 0.976 WMIPS/MHz, so there is a big difference.
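The reported Whetstone range can be reproduced from the loop count and the duration. The sketch below assumes the formula used by the commonly distributed C version of Whetstone (KIPS = 100 × loops × iterations / duration in seconds), which is consistent with the output above but is not confirmed by the article. The 58-second run is an inference from the upper end of the reported range, not a value shown in the console output.

```python
def whetstone_mips(loops: int, iterations: int, duration_s: int) -> float:
    """Whetstone MIPS, assuming the formula of the common C Whetstone source."""
    kips = 100.0 * loops * iterations / duration_s
    return kips / 1000.0


# Operating frequency of the SoC, as stated in the text.
clock_mhz = 100.0

# 59 s is taken from the console output; 58 s is an assumed second run.
for duration_s in (59, 58):
    mips = whetstone_mips(loops=100000, iterations=1, duration_s=duration_s)
    print(f"{duration_s} s: {mips:.1f} MIPS, {mips / clock_mhz:.3f} WMIPS/MHz")
```

Because the duration is reported in whole seconds, a one-second difference between runs is enough to explain the spread between 169.5 and 172.4 WMIPS under this assumption.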


We have successfully built the gateware for RV64GC NaxRiscv, a RISC-V OoO CPU, for the Nexys Video FPGA board and run Debian on it.