Vortex: OpenCL Compatible RISC-V Based GPGPU (Part 1)
This article gives an overview of Vortex, an open-source RISC-V-based GPGPU, and shows how to run an OpenCL program on the Vortex simulator.
Vortex is a GPGPU processor with a single instruction, multiple threads (SIMT) execution model that extends the RISC-V ISA with custom GPGPU instructions. The README.md of the Vortex repository lists the following specifications.
- Support RISC-V RV32IMF ISA
- 1024 total threads running at 250 MHz
- 128 Gflops of compute bandwidth
- 16 GB/s of memory bandwidth
- Scalability: up to 64 cores with optional L2 and L3 caches
- Software: OpenCL 1.2 Support
- Supported FPGAs:
  - Intel Arria 10
  - Intel Stratix 10
docs/microarchitecture.md in the Vortex repository presents the following microarchitecture diagram. The upper part of the figure below shows the Vortex core, in which each pipeline stage is extended with features for threads and warps. The GPGPU unit in the Execute stage handles the GPGPU instructions.
A group of Vortex cores is a Vortex cluster, and a group of Vortex clusters is a Vortex processor. Vortex cores and Vortex clusters can share L2 and L3 caches, respectively.
Vortex Simulation Methods
Vortex simulation methods include
- rtlsim for RTL simulation using Verilator,
- simx for cycle-approximate simulation, and
- fpga for FPGA simulation using an FPGA board.
Running a Vortex simulation is wrapped in the shell script blackbox.sh in the ci directory. The simulation methods listed above can be switched using command-line arguments.
Similarly, the command-line arguments of blackbox.sh change the configuration of the Vortex processor: the number of clusters, cores, warps, and threads, and whether the L2 and L3 caches are enabled. The default configuration is 1 cluster, 4 cores, 4 warps, and 4 threads, with the L2 and L3 caches disabled.
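To make the configuration parameters concrete, here is a minimal Python sketch that models them and counts hardware threads. The class name `VortexConfig` and its field names are hypothetical and are not part of the Vortex repository. The sketch also assumes the counts nest multiplicatively (cores per cluster, warps per core, threads per warp), which the text above does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VortexConfig:
    """Hypothetical model of a Vortex configuration (not a Vortex API).

    Defaults mirror the default configuration described in the article.
    """
    clusters: int = 1
    cores: int = 4      # assumed to mean cores per cluster
    warps: int = 4      # assumed to mean warps per core
    threads: int = 4    # assumed to mean threads per warp
    l2cache: bool = False
    l3cache: bool = False

    def total_threads(self) -> int:
        # Assumption: the hierarchy levels multiply.
        return self.clusters * self.cores * self.warps * self.threads


default = VortexConfig()
print(default.total_threads())
```

Under that assumption, the default configuration corresponds to 1 x 4 x 4 x 4 = 64 hardware threads.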
sgemm on Vortex RTL Simulator
We ran the OpenCL program sgemm in the tests/opencl directory with different configurations of the Vortex processor. sgemm is a simplified version of single-precision GEMM (GEneral Matrix-to-matrix Multiply).
$ cd $VORTEX
$ ./ci/blackbox.sh --driver=rtlsim --cores=[1|2|4|8] [--l2cache] \
    --app=sgemm --args="-n[4|8|16|32|64|128]"
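The bracket notation in the command above stands for a sweep over several core counts, the optional L2 cache, and several matrix sizes. As an illustration, the following Python sketch expands that sweep into concrete argument lists. It uses only the flags that appear in the command; the helper name `blackbox_argv` is hypothetical, and the sketch prints the commands instead of running them.

```python
from itertools import product

CORES = [1, 2, 4, 8]
L2CACHE = [False, True]
SIZES = [4, 8, 16, 32, 64, 128]


def blackbox_argv(cores, l2cache, n):
    """Build one blackbox.sh invocation as an argument list (hypothetical helper)."""
    argv = ["./ci/blackbox.sh", "--driver=rtlsim", f"--cores={cores}"]
    if l2cache:
        argv.append("--l2cache")
    argv += ["--app=sgemm", f"--args=-n{n}"]
    return argv


# One entry per combination of core count, L2 setting, and size.
sweep = [blackbox_argv(c, l2, n) for c, l2, n in product(CORES, L2CACHE, SIZES)]

for argv in sweep:
    print(" ".join(argv))
```

Each list could then be passed to `subprocess.run(argv, cwd=...)` from the Vortex checkout; that step is left out here because it needs a built Vortex tree.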
The featured image shows performance (FLOP/cycle) calculated from the simulation results.
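The article does not spell out how FLOP/cycle was derived, so the following is only a sketch of one plausible calculation. It assumes that `-n` sets the dimension of square n x n matrices, that the kernel performs n^3 multiplications and n^3 additions (2n^3 FLOP in total), and that the cycle count is read from the simulator output. The function names are hypothetical, and the cycle value in the usage line is a placeholder, not a measured result.

```python
def sgemm_flops(n: int) -> int:
    """FLOP count for an n x n by n x n multiply: n^3 multiplies plus n^3 adds (assumed)."""
    return 2 * n ** 3


def flop_per_cycle(n: int, cycles: int) -> float:
    """Divide the assumed FLOP count by the cycle count reported by the simulator."""
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    return sgemm_flops(n) / cycles


# Placeholder cycle count chosen only to check the arithmetic; not a simulation result.
placeholder_cycles = 8192
print(flop_per_cycle(16, placeholder_cycles))
```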
This article gave an overview of Vortex, an open-source RISC-V-based GPGPU, and showed how to run the OpenCL program sgemm in the tests/opencl directory using the Vortex simulator.