CMOL/CMOS Implementations of Bayesian Inference Engine: Digital and Mixed-Signal Architectures and Performance/Price – A Hardware Design Space Exploration

  • Dan Hammerstrom
  • Mazad S. Zaveri
Part of the Analog Circuits and Signal Processing book series (ACSP)


In this chapter, we focus on aspects of the hardware implementation of the Bayesian inference framework within George and Hawkins’ computational model of the visual cortex. This framework is based on Judea Pearl’s Belief Propagation. We then present a “hardware design space exploration” methodology for implementing and analyzing the (digital and mixed-signal) hardware for the Bayesian (polytree) inference framework. This methodology involves: analyzing the computational/operational cost and the related micro-architecture, exploring candidate hardware components, proposing various custom architectures using both traditional CMOS and hybrid nanotechnology CMOL, and investigating the baseline performance/price of these hardware architectures. The results suggest that hybrid nanotechnology is a promising candidate for implementing Bayesian inference. Such implementations utilize the very high density storage/computation benefits of these new nano-scale technologies much more efficiently; for example, the throughput per 858 mm² (TPM) obtained for CMOL-based architectures is 32–40 times better than the TPM for a CMOS-based multiprocessor/multi-FPGA system, and almost 2000 times better than the TPM for a single-PC implementation. The assessment of such hypothetical hardware architectures provides a baseline for large-scale implementations of Bayesian inference and, more generally, will help guide research trends in intelligent computing (including neuro/cognitive Bayesian systems) and the use of radical new device and circuit technology in these systems.
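To make the computational cost mentioned above more concrete, the following is a minimal sketch of one node update in Pearl's belief propagation. It is not the chapter's hardware architecture: it assumes a node X with a single parent U (a general polytree node may have several parents), and the names `node_update`, `cpt`, `pi_from_parent`, and `lambdas_from_children` are hypothetical. The sketch illustrates that the work per node is dominated by matrix-vector products and element-wise multiplications.

```python
from math import prod


def normalize(values):
    """Scale a non-negative vector so that it sums to 1."""
    total = sum(values)
    return [v / total for v in values]


def node_update(cpt, pi_from_parent, lambdas_from_children):
    """One belief-propagation update for node X with a single parent U.

    cpt[u][x]               -- conditional probability P(x | u)
    pi_from_parent[u]       -- top-down message from the parent
    lambdas_from_children   -- list of bottom-up messages, each indexed by x
    (All names are illustrative assumptions, not taken from the chapter.)
    """
    n_u, n_x = len(cpt), len(cpt[0])

    # Top-down support: pi(x) = sum_u P(x | u) * pi_X(u)
    pi_x = [sum(cpt[u][x] * pi_from_parent[u] for u in range(n_u))
            for x in range(n_x)]

    # Bottom-up support: lambda(x) = product over children of lambda_c(x)
    # (an empty product is 1, which is the correct value for a leaf
    # with no evidence)
    lam_x = [prod(lam[x] for lam in lambdas_from_children)
             for x in range(n_x)]

    # Belief: BEL(x) = alpha * lambda(x) * pi(x)
    belief = normalize([l * p for l, p in zip(lam_x, pi_x)])

    # Message sent up to the parent: lambda_X(u) = sum_x lambda(x) * P(x | u)
    lam_to_parent = [sum(lam_x[x] * cpt[u][x] for x in range(n_x))
                     for u in range(n_u)]
    return belief, lam_to_parent


if __name__ == "__main__":
    cpt = [[0.9, 0.1],   # P(x | u = 0)
           [0.2, 0.8]]   # P(x | u = 1)
    belief, lam_up = node_update(cpt, [0.5, 0.5], [[1.0, 0.0]])
    print(belief, lam_up)
```

For a node with n_u parent states and n_x own states, this update costs on the order of n_u × n_x multiply-accumulate operations, which is the kind of operation count a design space exploration would map onto candidate hardware components.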


Keywords: Bayesian Inference; Pearl - belief propagation; Cortex; CMOS; CMOL; Nanotechnology; Nanogrid; Digital; Mixed-signal; Hardware; Nanoarchitectures; Methodology; Performance; Price



Useful discussions with many colleagues, including Prof. K.K. Likharev, Dr. Changjian Gao, and Prof. G.G. Lendaris are gratefully acknowledged.



Copyright information

© Springer Science+Business Media B.V. 2010

Authors and Affiliations

  1. Department of Electrical and Computer Engineering, Associate Dean – Maseeh College of Engineering and Computer Science, Portland State University, Portland, USA
  2. Dhirubhai Ambani Institute of Information and Communication Technology, Gandhinagar, India
