|Dimensions||2500μm x 2500μm|
Architectures designed for high throughput are generally considered to be power-hungry. A successful design strikes the best possible compromise between power consumption, silicon area and operating speed.
In this project, a recently designed high-performance complex-number ALU has been redesigned and implemented. The initial ALU was not designed with power consumption in mind and therefore served as a good baseline for comparison throughout the design process.
The ALU is designed to be used in a DSP capable of performing high-speed FFT operations. An architecture using four parallel execution units was considered the best fit for an ALU tailored to FFT operations. Each execution unit houses a 16-bit complex multiplier, a configurable 32-bit ALU and a 32-bit adder.
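To make the role of these three blocks concrete, the following is a minimal sketch of a radix-2 FFT butterfly in fixed-point arithmetic. It shows one plausible way a complex multiplier, an adder and a second add/subtract unit could cooperate. The Q15 number format, the 1/2 output scaling and all function names are assumptions made for illustration and are not taken from the design itself.

```python
# Illustrative fixed-point radix-2 butterfly.
# Assumptions (not from the source): Q15 operands, Q30 products,
# outputs scaled by 1/2 to guard against overflow.

def to_q15(x: float) -> int:
    """Quantize a float in [-1, 1) to a signed 16-bit Q15 integer."""
    return max(-32768, min(32767, int(round(x * 32768))))

def complex_mul_q15(ar: int, ai: int, br: int, bi: int):
    """Multiply two Q15 complex numbers; the result stays in Q30."""
    re = ar * br - ai * bi
    im = ar * bi + ai * br
    return re, im

def butterfly(xr, xi, yr, yi, wr, wi):
    """Return ((x + w*y)/2, (x - w*y)/2) as Q15 (re, im) pairs."""
    pr, pi = complex_mul_q15(yr, yi, wr, wi)   # twiddle product, Q30
    xr30, xi30 = xr << 15, xi << 15            # align x to Q30
    # Shifting by 16 converts Q30 back to Q15 and halves the result.
    upper = ((xr30 + pr) >> 16, (xi30 + pi) >> 16)
    lower = ((xr30 - pr) >> 16, (xi30 - pi) >> 16)
    return upper, lower

# x = 0.5, y = 0.25, w = -1: expect (0.5 - 0.25)/2 and (0.5 + 0.25)/2
result = butterfly(to_q15(0.5), 0, to_q15(0.25), 0, to_q15(-1.0), 0)
```

Python integers are unbounded, so this sketch does not model overflow or saturation of the 32-bit intermediate values; a hardware implementation would have to handle both.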
Optimizations in the datapath structure have shortened the critical path by nearly 50%. Significant power savings were obtained through extensive clock gating and by holding the inputs of inactive modules at constant values.
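The effect of holding inactive inputs constant can be illustrated with a toy switching-activity model. The sketch below counts bit flips at a module's input as a rough proxy for dynamic power. The function names, the 16-bit bus pattern and the choice of zero as the constant value are hypothetical; this is not the methodology used in the project.

```python
# Toy model: bit toggles at a module input, with and without input isolation.
# Assumption (illustrative only): inactive inputs are forced to the constant 0.

def hamming(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")

def input_toggles(samples, enables, isolate: bool) -> int:
    """Count bit flips seen at the module input over all cycles.

    samples: bus value presented in each cycle
    enables: True in cycles where the module is actually in use
    isolate: if True, the input is forced to 0 whenever the module is idle
    """
    prev = 0
    toggles = 0
    for value, enabled in zip(samples, enables):
        seen = value if (enabled or not isolate) else 0
        toggles += hamming(prev, seen)
        prev = seen
    return toggles

# A busy shared bus, but the module is only used in the first cycle.
bus = [0xFFFF, 0x0000, 0xFFFF, 0x0000, 0xFFFF, 0x0000]
active = [True, False, False, False, False, False]

without_isolation = input_toggles(bus, active, isolate=False)  # 96 flips
with_isolation = input_toggles(bus, active, isolate=True)      # 32 flips
```

In this contrived pattern, isolation removes two thirds of the input transitions. The saving depends entirely on how long a module stays idle: if it alternates between active and idle every cycle, forcing a constant can even add transitions.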
The implemented design has a complexity of 120 kGate equivalents and runs at a clock frequency of 125 MHz. With these parameters, the new design is nearly twice as fast as the initial design and consumes only 28% of its energy when computing a 1024-point complex FFT.
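As a back-of-the-envelope check on these figures, the sketch below works out the butterfly count of a 1024-point FFT and the average-power ratio implied by the energy and speed numbers. Three things here are assumptions and not statements from the project: the radix-2 algorithm, a throughput of one butterfly per execution unit per clock cycle, and "nearly twice as fast" read as exactly half the runtime. The resulting cycle count is therefore only an idealized lower bound.

```python
import math

# Figures quoted in the text
N_POINTS = 1024
EXECUTION_UNITS = 4
F_CLK_HZ = 125e6
ENERGY_RATIO = 0.28      # new energy / old energy for one FFT

# Assumptions for illustration only
TIME_RATIO = 0.5         # "nearly twice as fast" taken as exactly 2x

def radix2_butterflies(n: int) -> int:
    """Butterflies in a radix-2 FFT of size n (n a power of two)."""
    stages = int(math.log2(n))
    return (n // 2) * stages

butterflies = radix2_butterflies(N_POINTS)        # 512 * 10 = 5120
ideal_cycles = butterflies // EXECUTION_UNITS     # 1280, no overhead modeled
ideal_time_us = ideal_cycles / F_CLK_HZ * 1e6     # about 10.24 microseconds

# Energy = average power * time, so the power ratio follows directly.
power_ratio = ENERGY_RATIO / TIME_RATIO           # about 0.56
```

Under these assumptions, the energy reduction comes from both a shorter runtime and a lower average power draw, the latter being roughly 56% of the initial design's.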