Abstract
The performance of modern digital signal processing (DSP) systems is inherently affected by the variability tolerance of their main arithmetic units. As CMOS technology approaches nanometer scales, numerous threats to the reliability of DSP designs emerge. Many of these phenomena are related to threshold voltage \(V_{th}\) variations, which increase the overall delay of the arithmetic unit and thereby cause timing failures. In this work, we employ linear regression techniques to accelerate transistor variability estimation using static timing analysis (STA) tools. By identifying the variation-critical part of an arithmetic circuit, we reduce the transistor inventory that needs to be tracked by the STA solver. We substantiate the efficiency of the proposed framework for realistic designs. For the main logic blocks of the modulo \(2^n-1\) add–multiply (AM) operation, we capture the variability-induced degradation of the Modified Booth (MB) encoding stage with negligible accuracy losses. We also use our methodology as an aid for variability-aware design techniques, and we investigate the variability tolerance of novel MB recoding schemes against conventional designs.
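The idea of using regression to find the variation-critical transistors can be illustrated with a minimal sketch. It is not the authors' framework: the STA tool is replaced by a hypothetical toy function (`toy_timing_run`) that returns a delay linear in per-transistor \(V_{th}\) shifts plus noise, and the sensitivities, nominal delay, sample count and the choice of keeping the top three transistors are all assumed values chosen for illustration. An ordinary least-squares fit recovers the per-transistor delay sensitivities, and the transistors with the largest fitted coefficients are kept as the reduced inventory.

```python
import random

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_delay_model(shifts, delays):
    """Least-squares fit of delay ~= b0 + sum_i b_i * dVth_i (normal equations)."""
    rows = [[1.0] + list(s) for s in shifts]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * d for r, d in zip(rows, delays)) for i in range(p)]
    return solve_linear_system(xtx, xty)

# Hypothetical ground truth: delay sensitivity of each transistor (ps per mV).
TRUE_SENS = [0.02, 1.5, 0.01, 0.9, 0.03, 0.0, 1.2, 0.02]
NOMINAL_DELAY = 100.0  # ps, assumed

def toy_timing_run(shift):
    """Stand-in for one timing-analysis run; NOT a real STA tool call."""
    noise = random.gauss(0.0, 0.5)
    return NOMINAL_DELAY + sum(s * v for s, v in zip(TRUE_SENS, shift)) + noise

random.seed(0)
# Random Vth shifts (mV) per transistor for each sampled circuit instance.
samples = [[random.gauss(0.0, 30.0) for _ in TRUE_SENS] for _ in range(200)]
delays = [toy_timing_run(s) for s in samples]

coeffs = fit_delay_model(samples, delays)
intercept, sens = coeffs[0], coeffs[1:]

# Keep the three transistors with the largest fitted sensitivity.
ranked = sorted(range(len(sens)), key=lambda i: abs(sens[i]), reverse=True)
critical = sorted(ranked[:3])
print("variation-critical transistor indices:", critical)
```

In this toy setting only the selected transistors would need to be tracked in later timing runs. How well such a reduction preserves accuracy on a real netlist depends on how close the true delay response is to linear, which the sketch simply assumes.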
Acknowledgments
The work presented in this paper is partially supported by the Greek State Scholarship Foundation (IKY), the European Social Fund and the FP7-612069-HARPA EU Project. The authors would also like to acknowledge CMC Microsystems for the provision of products and services that facilitated this research.
Cite this article
Stamoulis, D., Tsoumanis, K., Rodopoulos, D. et al. Efficient variability analysis of arithmetic units using linear regression techniques. Analog Integr Circ Sig Process 87, 249–261 (2016). https://doi.org/10.1007/s10470-016-0712-6