Abstract
Recent advances in beyond-von Neumann architectures have moved memristive devices to the forefront as a key enabler of memristive computing-in-memory (mCIM) structures, which show great promise for boosting the energy efficiency and performance of artificial intelligence (AI) chips. In this study, by considering the interactions between devices, circuits, and systems in mCIM design, we propose several cross-layer design techniques: (1) a BL-SL interactive forming protection (BSIFP) circuit that reduces the voltage drop on the selected transistor, suppresses the current overshoot by 65.96%, and improves the bit-cell density by more than 10.19%; (2) a clamping transistor trimming scheme (CTTS) that prevents multiply-and-accumulate (MAC) signal margin degradation caused by chip-to-chip resistance variations; and (3) a dynamic input-parallelism and output-precision (DIPOP) technique that reduces the energy cost by 22.92% in a typical inference task with negligible accuracy loss. The results demonstrate the significant role of the cross-layer-interactive approach and provide a preliminary guideline for highly efficient mCIM design.
Acknowledgements
This work was supported by the National Key R&D Program of China (Grant Nos. 2018YFA0701500, 2018YFB2202900), the National Natural Science Foundation of China (Grant Nos. 61904197, 61934005, 61825404, 61732020, 61821091, 61888102), the Strategic Priority Research Program of the Chinese Academy of Sciences (Grant No. XDB44000000), and the Project of MOE Innovation Platform.
Cite this article
An, J., Wang, L., Ye, W. et al. Design memristor-based computing-in-memory for AI accelerators considering the interplay between devices, circuits, and system. Sci. China Inf. Sci. 66, 182404 (2023). https://doi.org/10.1007/s11432-022-3627-8
DOI: https://doi.org/10.1007/s11432-022-3627-8