This work-in-progress document describes the Hwacha decoupled vector-fetch microarchitecture in detail. We first discuss how we modified the open-source Rocket Chip SoC generator to provide a system framework comparable to commercially available data-parallel accelerators. We exploit the generator’s RTL libraries, including an in-order core implementing the RISC-V instruction set, multiple levels of coherent caches, and a standardized accelerator interface, which we use to attach the Hwacha vector accelerator. The vector accelerator executes the Hwacha instruction set architecture described in the Hwacha vector-fetch architecture manual. We present the overall machine organization, then describe the vector frontend, the scalar unit, the vector execution unit (VXU), the vector memory unit (VMU), and the vector runahead unit (VRU).