QIMING 910 AI Chip (MUSEv1 Architecture)

QIMING-910 was my first solo tape-out project completed at Archip Lab during my senior year. I was in charge of all the development, from architecture design to the final RTL netlist, which was then shifted to UMC for the backend design and fabrication in the summer of 2019. Finally, with the help of some engineers in our team, we successfully verified this chip in December 2019! This experiment gave me insight into IC development, and part of this work was presented in DAC 2020.
The name of the architecture, MUSE, means MUlti-grained Sparsity Engine. The architecture of the chip leverages ternary quantization (-1, 1, 0) for weights, with which we can simply implement multiplication with MUX-based ALUs instead of the complicated logic. Moreover, using the ternary quantization significantly saves memory overhead by 16
Fabricated in the UMC 55nm SP CMOS process, QIMING-910 achieves peak performance and power efficiency of 204.8 GOPS and 3.16 TOPS/W respectively. With the help of ternary-quantization, multiplier-free ALU design, and zero-aware processing, the power efficiency is 2