mMPU—A Real Processing-in-Memory Architecture to Combat the von Neumann Bottleneck

  • Chapter
Applications of Emerging Memory Technology

Part of the book series: Springer Series in Advanced Microelectronics (MICROELECTR., volume 63)

Abstract

Data transfer between processing and memory units is the main performance and energy-efficiency bottleneck in modern computing systems, commonly known as the von Neumann bottleneck. Prior research has attempted to alleviate the problem by moving the computing units closer to the memory, but this approach has had limited success because data transfer is still required. In this chapter, we present the memristive memory processing unit (mMPU), which relies on a memristive memory to perform computation using the memory cells themselves, and therefore directly tackles the von Neumann bottleneck. In the mMPU, operation is controlled by a modified controller and peripheral circuit, without changing the structure of the memory cells and arrays. As the basic logic element, we present Memristor-Aided loGIC (MAGIC), a technique to compute logical functions using memristors within the memory array. We further show how to extend basic MAGIC primitives to execute any arbitrary Boolean function and present the microarchitecture of the memory, which is required to enable data computing using MAGIC. Finally, we show how to build a computing system using the mMPU, which performs computation using MAGIC to enable a real processing-in-memory machine.
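The idea of extending a basic in-memory primitive to arbitrary Boolean functions can be illustrated with a purely functional sketch. The abstract does not name the primitive, so the sketch assumes a NOR gate, as described in the MAGIC literature, whose output cell is initialised to logic 1 and switched to logic 0 when any input is 1. All function names (`magic_nor`, `row_parallel_nor`, and so on) are hypothetical, and the model ignores voltages, resistances, and timing entirely.

```python
from typing import List


def magic_nor(*inputs: int) -> int:
    """Functional model of an assumed MAGIC NOR primitive."""
    out = 1              # output cell is initialised to logic 1 (assumption)
    if any(inputs):      # any input at logic 1 switches the output cell to 0
        out = 0
    return out


# Other Boolean functions composed only from the NOR primitive.
def magic_not(a: int) -> int:
    return magic_nor(a)


def magic_or(a: int, b: int) -> int:
    return magic_not(magic_nor(a, b))


def magic_and(a: int, b: int) -> int:
    return magic_nor(magic_not(a), magic_not(b))


def magic_xor(a: int, b: int) -> int:
    # XOR(a, b) = NOR(NOR(a, b), AND(a, b))
    return magic_nor(magic_nor(a, b), magic_and(a, b))


def row_parallel_nor(array: List[List[int]], in_cols: List[int], out_col: int) -> None:
    """Apply the same NOR to every row of a toy memory array, writing the
    result into a cell of that same row (the data never leaves the array)."""
    for row in array:
        row[out_col] = magic_nor(*(row[c] for c in in_cols))


if __name__ == "__main__":
    memory = [[0, 0, 0], [0, 1, 0], [1, 1, 0]]
    row_parallel_nor(memory, in_cols=[0, 1], out_col=2)
    print([row[2] for row in memory])
    print([magic_xor(a, b) for a in (0, 1) for b in (0, 1)])
```

Because NOR is functionally complete, any Boolean function can be expressed as a sequence of such operations. How that sequence is mapped onto real memory cells and scheduled by the controller is the subject of the chapter and is not modelled here.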

Author information

Corresponding author

Correspondence to Nishil Talati.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Talati, N., Ben-Hur, R., Wald, N., Haj-Ali, A., Reuben, J., Kvatinsky, S. (2020). mMPU—A Real Processing-in-Memory Architecture to Combat the von Neumann Bottleneck. In: Suri, M. (eds) Applications of Emerging Memory Technology. Springer Series in Advanced Microelectronics, vol 63. Springer, Singapore. https://doi.org/10.1007/978-981-13-8379-3_8

  • DOI: https://doi.org/10.1007/978-981-13-8379-3_8

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-8378-6

  • Online ISBN: 978-981-13-8379-3

  • eBook Packages: Engineering (R0)
