Application | Pulp |
Technology | 12 nm |
Manufacturer | GF |
Type | Research |
Package | QFN56 |
Dimensions | 2500μm x 2000μm |
Gates | 1 |
Voltage | 0.8 V |
Power | 10 pW @GHz mW |
Clock | 720 MHz |
Heartstream is a 64-core RISC-V-based building block for our MemPool- and TeraPool-based shared-memory architectures, which scale efficiently up to 256 and 1024 cores, respectively.
The MemPool architecture is a flexible, parametric manycore architecture that scales to hundreds of individually programmable cores. Its 256 independent cores share a distributed L1 scratchpad memory (SPM) that every core can access with low latency. This lets MemPool scale while efficiently implementing a wide range of applications.
Heartstream incorporates two levels of the MemPool hierarchy. The first level is the tile, which combines cores, instruction cache levels, and a subset of the shared L1 memory banks, all connected by a fully connected crossbar. Each tile has remote ports for requesting data from other tiles' SPM and incoming request ports to serve memory requests from remote tiles. The second level is the group: multiple tiles form a group, and four groups, linked through their remote interconnects, make up the complete Heartstream cluster. The design includes additional pipeline stages to minimize latency.
The Snitch cores, used as Heartstream's processing elements (PEs), are small 32-bit cores that support the RISC-V RV32IMAFXpulpimg instruction set architecture (ISA). Each core has a pipelined DSP unit that implements the Xpulpimg and floating-point extensions. Snitch can keep multiple instructions outstanding, which helps hide latencies and maintain high throughput.
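To make the tile and group hierarchy more concrete, the following Python sketch maps a core ID and a byte address onto a (group, tile, ...) position and checks whether an access stays inside the requesting core's tile. Only the four groups, the 64 cores in total, and the 32-bit word size come from the description above. The number of tiles per group, cores per tile, and banks per tile are assumed for illustration, as is the plain word-interleaved address mapping; the names `ClusterConfig`, `locate_core`, `locate_word`, and `is_local_access` are hypothetical and do not describe the actual hardware address map.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterConfig:
    num_groups: int = 4        # from the text: four groups form the cluster
    tiles_per_group: int = 4   # ASSUMPTION: not stated in the text
    cores_per_tile: int = 4    # ASSUMPTION: chosen so the total is 64 cores
    banks_per_tile: int = 16   # ASSUMPTION: not stated in the text
    word_bytes: int = 4        # 32-bit cores, so 4-byte words

    @property
    def num_tiles(self) -> int:
        return self.num_groups * self.tiles_per_group

    @property
    def num_cores(self) -> int:
        return self.num_tiles * self.cores_per_tile

    @property
    def num_banks(self) -> int:
        return self.num_tiles * self.banks_per_tile


def locate_core(cfg: ClusterConfig, core_id: int) -> tuple[int, int, int]:
    """Return (group, tile within group, core within tile) for a core ID."""
    if not 0 <= core_id < cfg.num_cores:
        raise ValueError(f"core_id {core_id} out of range")
    tile_global, local_core = divmod(core_id, cfg.cores_per_tile)
    group, tile = divmod(tile_global, cfg.tiles_per_group)
    return group, tile, local_core


def locate_word(cfg: ClusterConfig, byte_addr: int) -> tuple[int, int, int, int]:
    """Return (group, tile, bank within tile, row within bank) for an address.

    ASSUMPTION: consecutive words are interleaved across all banks of the
    cluster. The real design may use a different address mapping.
    """
    if byte_addr < 0:
        raise ValueError("byte_addr must be non-negative")
    word = byte_addr // cfg.word_bytes
    row, bank_global = divmod(word, cfg.num_banks)
    tile_global, bank = divmod(bank_global, cfg.banks_per_tile)
    group, tile = divmod(tile_global, cfg.tiles_per_group)
    return group, tile, bank, row


def is_local_access(cfg: ClusterConfig, core_id: int, byte_addr: int) -> bool:
    """True if the address lives in the requesting core's own tile."""
    return locate_core(cfg, core_id)[:2] == locate_word(cfg, byte_addr)[:2]
```

Under these assumed parameters, a request for which `is_local_access` returns `True` would be served by the tile's local crossbar, while any other request would leave through one of the tile's remote ports and arrive at an incoming request port of the target tile.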
We have previously published the MemPool architecture:
Heartstream improves on this design in two important aspects
The name of the chip is a bit of a mystery and does not necessarily follow one of the established naming traditions.
This design has been generously supported by the GlobalFoundries University Partnership Program.