The IIS Chip Gallery

Fulmine (2015)


Francesco Conti, Robert Schilling, Davide Schiavone, Antonio Pullini, Davide Rossi, Andreas Traber, Igor Loi, Michael Gautschi, Michael Muehlberghuber, David Bellasi

Main Details

Application	PULP
Technology	65 nm
Manufacturer	UMC
Type	Research Project
Package	QFN64
Dimensions	2626μm x 2626μm
Gates	2500 kGE
Voltage	1.2 V
Power	13 mW @ 0.8 V, 104 MHz
Clock	400 MHz
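
As a rough reading of the power figure in the table, the average energy per clock cycle at the low-voltage operating point can be estimated as power divided by frequency. The short sketch below only restates the two numbers from the table; the function name is illustrative, and the result is an average that says nothing about how the energy is split between cores, memories, and accelerators.

```python
# Figures taken from the "Power" row of the table: 13 mW @ 0.8 V, 104 MHz.
POWER_W = 13e-3
FREQ_HZ = 104e6


def energy_per_cycle_pj(power_w: float, freq_hz: float) -> float:
    """Average energy per clock cycle in picojoules (E = P / f)."""
    return power_w / freq_hz * 1e12


print(f"{energy_per_cycle_pj(POWER_W, FREQ_HZ):.0f} pJ/cycle")  # prints "125 pJ/cycle"
```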

Description

A four-core PULP implementation using third-generation or10n cores. This chip has many improvements over Mia Wallace, which was manufactured in the same technology and has a similar size. It has 192 Kbytes of L2 memory and 64 Kbytes of TCDM.

New instructions for vector processing and fixed-point arithmetic using the Q15 and Q31 formats
- Dot product and accumulate between two vectors
- Multiply, accumulate and shift
- Multiply and subtract
- Clip, used for saturation
- Add/subtract with normalization
- Bit set, clear, extract
- Shuffle
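
To illustrate what Q15 arithmetic with a dot product, shift, and clip looks like in general, here is a minimal software model. It is a sketch of the number format only, not of the or10n instruction semantics, which this page does not specify; the helper names `clip`, `to_q15`, and `dot_q15` are hypothetical.

```python
Q15_MIN, Q15_MAX = -(1 << 15), (1 << 15) - 1


def clip(value: int, lo: int, hi: int) -> int:
    """Saturate value to the closed range [lo, hi]."""
    return max(lo, min(hi, value))


def to_q15(x: float) -> int:
    """Convert a real value to a Q15 integer, saturating outside [-1, 1)."""
    return clip(int(round(x * (1 << 15))), Q15_MIN, Q15_MAX)


def dot_q15(xs, ys) -> int:
    """Dot product of two Q15 vectors, returned as a saturated Q15 value."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y  # each Q15 x Q15 product is in Q30; accumulate at full width
    # Shift back from Q30 to Q15, then clip so overflow saturates instead of wrapping.
    return clip(acc >> 15, Q15_MIN, Q15_MAX)


a = [to_q15(0.5), to_q15(0.25)]
b = [to_q15(0.5), to_q15(-0.5)]
print(dot_q15(a, b) / (1 << 15))  # 0.5*0.5 + 0.25*(-0.5) = 0.125
```

Accumulating at full width and clipping only once at the end is one common choice; hardware that clips after every step would give different results on overflow.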
Improvements to the DMA
- Added support for multiple transfer IDs on the same private queue.
- Added separate queues for linear and 2D transfers to optimize the area of the command queue.
- Added support for non-incrementing bursts.
- Added support for 2D transfers.
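
As a generic illustration of what a 2D transfer does, the sketch below gathers a rectangular tile from a row-major buffer: it copies `count` lines of `length` bytes each, with consecutive lines `stride` bytes apart in the source. The function and parameter names are hypothetical and do not describe the actual programming interface of this DMA.

```python
def dma_2d_copy(src: bytes, src_offset: int, length: int, stride: int, count: int) -> bytes:
    """Software model of a 2D transfer: gather `count` lines of `length` bytes."""
    out = bytearray()
    for line in range(count):
        start = src_offset + line * stride  # start address of this line in the source
        out += src[start:start + length]    # each line is itself a linear transfer
    return bytes(out)


# A 4x4 image stored row by row; extract the 2x2 tile whose top-left pixel is (row 1, col 1).
image = bytes(range(16))
tile = dma_2d_copy(image, src_offset=1 * 4 + 1, length=2, stride=4, count=2)
print(list(tile))  # [5, 6, 9, 10]
```

A linear transfer is the special case `count=1`, which is one way to see why the two kinds of command need different amounts of queue storage.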
Improvements to the HW Convolution Engine
- Support for multiple input/output features
- Vectorized convolutions with reduced-precision weights. For example, the HWCE can compute 1 pixel/cycle for four features with 4-bit weights, two features with 8-bit weights, or a single feature with 16-bit weights.
- Optimized bandwidth utilization
- Optimized power consumption by fine-grained architectural clock gating
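
The three throughput points quoted for the HWCE (four, two, and one feature at 4-, 8-, and 16-bit weights) are consistent with a fixed 16-bit weight budget shared among features. The sketch below encodes that reading and derives an idealized cycle count from it. The 16-bit budget model and both function names are assumptions made for illustration; the estimate ignores memory stalls, accumulation over input features, and any setup overhead.

```python
import math


def features_per_cycle(weight_bits: int, budget_bits: int = 16) -> int:
    """Features computed in parallel per pixel, assuming a shared 16-bit weight budget."""
    if weight_bits not in (4, 8, 16):
        raise ValueError("only the 4-, 8-, and 16-bit weight modes are described")
    return budget_bits // weight_bits


def ideal_conv_cycles(out_pixels: int, out_features: int, weight_bits: int) -> int:
    """Idealized cycles at 1 pixel/cycle per group of parallel features."""
    groups = math.ceil(out_features / features_per_cycle(weight_bits))
    return out_pixels * groups


for bits in (4, 8, 16):
    print(bits, features_per_cycle(bits), ideal_conv_cycles(32 * 32, 4, bits))
```

Under this model, computing four 32x32 output features takes 1024 cycles with 4-bit weights, 2048 with 8-bit weights, and 4096 with 16-bit weights.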
Added a cryptographic accelerator
Developed an I/O DMA to enable direct memory transfers from peripherals to L2 memory when the cluster is idle. On the peripheral side, the I/O DMA connects to the I2S (master and slave), I2C, SPI (master and slave), and UART interfaces. The I/O DMA is connected to the L2 memory through a high-priority port, avoiding the need for large internal FIFOs.
Improved power management architecture. Several power modes have been implemented for cluster and SoC.

Related Publication

Francesco Conti, Robert Schilling, Davide Schiavone, Antonio Pullini, Davide Rossi, Frank K. Gurkaynak, Michael Muehlberghuber, Michael Gautschi, Igor Loi, Germain Haugou, Stefan Mangard, Luca Benini, "An IoT Endpoint System-on-Chip for Secure and Energy-Efficient Near-Sensor Analytics", IEEE Transactions on Circuits and Systems I: Regular Papers, Vol. 64, Issue 9, Sept. 2017, pp. 2481-2494, DOI: 10.1109/TCSI.2017.2698019

Created by make_cg.pl on Mon Feb 12 10:46:46 2018