1 Introduction

Since their introduction, Flash memory technologies improved their performance mainly through an uninterrupted scaling process, which led to a NAND Flash cell feature size as small as 14 nm in 2015 [1]. However, as the size of the single memory cell was shrunk down to decananometer dimensions, fundamental issues emerged, related both to the increasingly complex fabrication techniques and to inherent physical limitations due to the discrete nature of charge and matter, undermining both the manufacturing and the proper operation of Flash memory arrays [2, 3]. For this reason, an alternative integration paradigm has been adopted to break the classical trade-off between single-cell area and array storage density, which consists of stacking many layers of memory cells along the direction orthogonal to the wafer plane. Figure 1a shows a schematic of a possible implementation of 3-D NAND Flash memory arrays, featuring vertical polycrystalline silicon (poly-Si) channels contacted at the bit-line (BL) and source-line (SL) sides by \(n^+\) regions. At the intersection between each poly-Si channel and a horizontal word-line (WL) plane, a gate-all-around (GAA) memory cell with macaroni structure is formed, as schematically depicted in Fig. 1b, with an oxide-nitride-oxide (ONO) stack whose middle layer is used to store charge. Because this architecture lacks any p-doped region through which the string channel could be accessed, the poly-Si channels cannot be contacted as in planar NAND technologies. While this feature does not affect the read and program operations, a novel voltage scheme is required to increase the channel potential and trigger the emission of electrons from the storage layer or the injection of holes into it during erase. To this purpose, the voltage scheme displayed in Fig. 1c is adopted.
A positive voltage ramp is applied to the BL and SL of the string while the WLs and the selector gates (SGs) are kept grounded; the electric fields at the inner edge of the SGs are strong enough to trigger the generation of electron/hole pairs by band-to-band tunneling (BTBT) [4, 5]. While electrons are swept towards the BL/SL contacts, giving rise to the so-called gate-induced drain leakage (GIDL) current, the BTBT-generated holes are directed towards the center of the string, where they accumulate and contribute to increasing the channel potential. In this framework, Sect. 2 is devoted first to the study of the GIDL-assisted erase in 3-D NAND Flash memory arrays by TCAD simulations (Sect. 2.1) and then to the development of a compact model able to predict the string behavior and the evolution of the threshold voltage \(V_T\) during erase (Sect. 2.2). All the presented results are from [6] and [7].

Fig. 1

a Schematic of a vertical-channel (VC) 3-D NAND Flash memory array and b of a GAA memory cell [1] (from [8] under CC BY 4.0 license). c Schematic of the string condition during a GIDL-assisted erase operation (a section of the string close to the BL is shown)

On the other hand, NOR Flash memory cells have never been scaled beyond the feature size of 40 nm, as research efforts for embedded applications have been focused on different technologies, such as phase-change memories [9, 10]. Despite this, in recent years NOR Flash memory arrays have attracted interest for the implementation of spiking neural networks (SNNs), which represent a promising solution to outclass conventional CMOS systems based on the von Neumann architecture in problems dealing with unstructured data, such as image recognition and classification [11]. A mandatory condition to be met by memory arrays employed in SNNs is the possibility to tune the threshold voltage (\(V_T\)) of each cell independently of the others in both directions. This means that single-cell selectivity is needed not only during program but also during erase, so that the block erase typical of Flash technologies clearly represents an obstacle for neuromorphic applications. To overcome this issue, some works have suggested design adjustments either at the cell or at the array level, with the drawback, however, of a larger array area occupancy and a more complex manufacturing process [12,13,14,15]. Taking inspiration from the GIDL-assisted erase employed in the 3-D VC NAND Flash memory arrays investigated in Sect. 2, Sect. 3 presents the idea of moving from the classical erase scheme based on Fowler-Nordheim (FN) tunneling [16] to a novel single-cell selective one that exploits BTBT-generated hot-hole injection (HHI) at the drain side. With such a scheme, the operation of an SNN based on the spike-timing-dependent plasticity (STDP) learning rule [17,18,19] is successfully demonstrated on a mainstream NOR Flash memory array with no modification either to the cell or to the array design. The results presented in Sect. 3 are from [20,21,22,23].

2 GIDL-Assisted Erase in 3-D NAND Memory Arrays

2.1 Overview on String Dynamics

In order to investigate the erase operation in 3-D NAND Flash memory arrays when GIDL is triggered at the SGs, TCAD simulations were performed using a commercial device simulator (see [6] for more information about the simulation environment). As a starting point, no charge exchange between the channel and the storage layer is included. Simulation results are displayed in Fig. 2a and Fig. 2b, which report the variations of the string potential \(\Delta V_B\) (the average value in the radial direction at the center of the string is considered) and the BL/SL currents \(I_{BL}=I_{SL}\) during erase for different values of the rise time \(t_r\) of the BL/SL ramps. Results reveal that three different phases of the transient can be identified: I) \(I_{BL} \approx \) constant and \(\Delta V_B \approx 0\); II) \(I_{BL}\) increases steeply and so does \(V_B\), at a rate larger than that of \(V_{BL}\); III) \(I_{BL}\) reaches a peak and then saturates to a constant value, while \(V_B\) continues to increase but at the same rate as \(V_{BL}\). Figure 3 shows how the net charge density in the NAND string evolves during the erase transient. By comparing this figure with Fig. 2a and Fig. 2b, it is easy to relate the charge distribution to the \(\Delta V_B\) and \(I_{BL}\) transients: during phase I, BTBT-generated holes are few and the string is approximately depleted of charge; during phase II, holes start to rule the string electrostatics, but they are confined to the central part of the string (that is, the SG regions are still depleted); during phase III, BTBT-generated holes also spread under the SGs, thus ruling their electrostatics.

Fig. 2

a Comparison between the string potential \(V_B\) during the erase transient resulting from the TCAD simulations and that computed with the developed compact model. b Same as a but \(I_{BL}\) is shown. \(\copyright \) 2018 IEEE [6]

Fig. 3

TCAD results displaying the net charge density (normalized to the elementary charge) at different times during the same transients of Fig. 2a and Fig. 2b. Below each snapshot the \(V_{BL}\) at which it is taken is reported. Note that the net charge density in the central region of the string is due to holes, while that at the bottom of the \(n^+\)-doped regions is due to ionized donors. \(\copyright \) 2018 IEEE [6]

Fig. 4

a Schematic of the capacitive couplings considered in the developed compact model (only the upper region of the string close to the BL is shown, \(\copyright \) 2019 Springer Nature [7]). b Schematic of the developed compact model. c Evolution of \(V_B\) and \(\Delta V_T\) (with respect to the neutral \(V_T\)) during erase when the charge exchange between the string channel and the nitride storage layer is also included in the compact model of Fig. 4b. (\(\copyright \) 2019 Springer Nature [7])

2.2 Compact Model

Figure 4a shows the capacitive couplings considered in the compact model developed to reproduce the results of the TCAD simulations. The hole distribution in the string is approximated as uniform (orange region) over an equipotential region that extends from the center of the string to a distance \(\Delta L\) within the channel of the SG, which varies during the transient. The electrostatics in the region of the SGs that is depleted of holes (\(L_x\)) is modeled through \(C_{f,in}\) and \(C_{NB}\); \(C_{ONO}\cdot \Delta L\) is the remaining capacitive component between the bulk orange region and the longitudinal face of the SG; \(C_{f,out}\) accounts for the fringing fields between the transverse face of the SG and the \(n^+\) region, while \(C_{f,B}\) accounts for those between the transverse face of the WLs and the central region of the string; \(C_{dep}\) simply models the variations of charge in the depleted portion of the \(n^+\) region. Finally, the series connection of \(C_{G1}\) and \(C_{G2}\) represents the capacitance of the ONO stack, with the former calculated from the silicon/oxide interface to the middle of the nitride layer, and the latter from the middle of the nitride layer to the WL. The resulting compact model is shown in Fig. 4b, with the addition of the current generator \(I_{GIDL}\) that reproduces the GIDL current (\(C_{B,SG}\) and \(C_{B,WL}\) are the overall capacitances between the orange region and the SGs and WLs, respectively). Please refer to [6] and [7] for the calculation of \(I_{GIDL}\) and of all the capacitive contributions mentioned so far. Figure 2a and Fig. 2b show the \(V_B\) and \(I_{BL}\) transients computed with the compact circuit of Fig. 4b (red line). Model results closely reproduce those from TCAD simulations, confirming the validity of the developed compact model; refer to [6] for a similar analysis on different string geometries and different electrical waveforms applied to the string contacts.
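As a rough consistency check on the three phases described in Sect. 2.1, the circuit of Fig. 4b can be reduced to a single-node charge balance. The relation below is a simplification made here for illustration and is not the model of [6, 7]: it keeps only the two capacitances \(C_{B,SG}\) and \(C_{B,WL}\) towards the grounded SGs and WLs, treats them as constant (whereas \(\Delta L\), and hence \(C_{B,SG}\), actually varies during the transient), and neglects any charge exchange with the storage layer.

\[ I_{GIDL} \approx \left( C_{B,SG} + C_{B,WL}\right) \frac{dV_B}{dt} \]

Under these assumptions, when \(V_B\) rises at the same rate as a linear \(V_{BL}\) ramp (phase III), \(dV_B/dt\) is constant and so is \(I_{GIDL}\), which is consistent with the plateau of \(I_{BL}\) in Fig. 2b. The same relation suggests that a shorter \(t_r\), that is, a steeper ramp, would require a proportionally larger plateau current, although the quantitative dependence should be taken from the full model.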

Finally, in [7] the developed compact model was improved to also account for the variation in the cell \(V_T\) due to the emission of electrons from the nitride layer or to the injection of holes into it; for compact modeling purposes, the net charge was assumed to be stored in the node between \(C_{G1}\) and \(C_{G2}\). Figure 4c shows the evolution of \(V_T\) during the GIDL-assisted erase operation and the impact of the charge exchange between the silicon channel and the nitride layer on \(V_B\).

3 NOR Flash-Based Spiking Neural Networks

Hardware neural networks (HNNs) are computing systems in which memory and computing units are not distinct entities exchanging data through a communication bus but are rather distributed in a way that resembles the organization of synapses and neurons in the human brain [24]. A convenient way to implement HNNs consists in exploiting non-volatile memory arrays as synaptic arrays connecting adjacent layers of artificial neurons: each memory cell acts like a biological synapse, that is, an electrical connection of variable strength [11]. For example, Fig. 5a schematically shows a two-layer NOR Flash-based HNN. The voltage signals coming from the presynaptic neurons (pre) are applied to the WLs of the memory arrays; then, as a result of the input signals and the state (\(V_T\)) of each memory cell, a current flows through each BL, corresponding to the output signals that are sent to the postsynaptic neurons (post). In particular, each memory cell is operated in the subthreshold regime [25, 26], in which the drain-to-source current \(I_{DS}\) can be expressed as a function of the WL voltage \(V_{WL}\) as \(I_{DS} = I_0 \cdot \exp \left[ \alpha _G\left( V_{WL}-V_T^{ref}\right) /(mkT)\right] \cdot \exp \left[ -\alpha _G\Delta V_T/(mkT)\right] \), where \(\Delta V_T\) is the cell \(V_T\) shift from a reference condition \(V_T^{ref}\) (see [23] for the remaining parameters). In the previous equation the scaling factor \(w=\exp \left[ -\alpha _G\Delta V_T/(mkT)\right] \), which is a function of \(V_T\) but not of \(V_{WL}\), plays the role of the synaptic weight, and the remaining factor corresponds to the input signal; the negative sign reflects the fact that a lower \(V_T\) yields a larger current, so that \(\Delta V_T<0\) corresponds to \(\Delta w>0\). An HNN specializes its behavior to perform a well-defined task after a learning phase, during which the weights of all the memory cells are tuned according to specific learning algorithms or learning rules.
Spiking Neural Networks (SNNs) are particular HNNs for which learning is carried out according to biologically inspired learning rules without external supervision, such as STDP; they take their name from the integrate-and-fire behavior of the artificial neurons, delivering asynchronous spikes during network operation [11, 27].
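The subthreshold expression above can be turned into a small numerical sketch that separates the synaptic weight from the input term and sums the cell currents on a BL. This is an illustration only: the function names, the default values of \(\alpha_G\), \(m\) and \(I_0\), and the use of the thermal voltage \(kT/q\) with voltages in volts are assumptions made here, not values from [23]. The weight uses a negative sign in front of \(\Delta V_T\), so that a \(V_T\) decrease raises the weight, matching the \(\Delta V_T<0\) (\(\Delta w>0\)) convention used for STDP in Sect. 3.1.

```python
import math

K_B_OVER_Q = 8.617333262e-5  # Boltzmann constant / elementary charge, in V/K


def synaptic_weight(delta_vt, alpha_g=0.6, m=1.5, temp_k=300.0):
    """Weight w = exp(-alpha_G * dVT / (m * kT/q)).

    alpha_g, m and temp_k are hypothetical illustration values.
    """
    v_thermal = K_B_OVER_Q * temp_k
    return math.exp(-alpha_g * delta_vt / (m * v_thermal))


def cell_current(v_wl, vt_ref, delta_vt, i0=1e-9, alpha_g=0.6, m=1.5,
                 temp_k=300.0):
    """Subthreshold I_DS: input term (depends on V_WL) times weight."""
    v_thermal = K_B_OVER_Q * temp_k
    input_term = i0 * math.exp(alpha_g * (v_wl - vt_ref) / (m * v_thermal))
    return input_term * synaptic_weight(delta_vt, alpha_g, m, temp_k)


def bitline_current(v_wl_list, delta_vt_list, vt_ref, **cell_params):
    """Total BL current: sum of the currents of the cells sharing the BL."""
    return sum(
        cell_current(v_wl, vt_ref, d_vt, **cell_params)
        for v_wl, d_vt in zip(v_wl_list, delta_vt_list)
    )
```

With these definitions, a cell at the reference condition (\(\Delta V_T = 0\)) has unit weight, and the BL current is a weighted sum of the input terms, which is the operation that the synaptic array is expected to perform.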

Fig. 5

a Schematic of a NOR Flash memory array used as a synaptic array connecting two layers of neurons. b Schematic of a NOR Flash cell subjected to the bias scheme used to trigger HHI and c the resulting experimental \(V_T\) erase transients (\(\copyright \) 2018 IEEE [23])

3.1 Implementing STDP and Unsupervised Learning

Figure 5b shows a schematic of the erase scheme devised to achieve single-cell selectivity during erase, which enables the adoption of mainstream NOR Flash memory arrays in SNNs. By simultaneously applying a positive \(V_{BL}\) and a negative \(V_{WL}\), holes generated by BTBT and accelerated by the horizontal electric field become energetic enough to overcome the \(\mathrm{Si}/\mathrm{SiO}_2\) energy barrier and to be injected into the cell floating gate (FG), leading to \(\Delta V_T<0\). Figure 5c displays the resulting \(V_T\) transients, measured for \(V_{BL}=4.5\) V and different values of \(V_{WL}\), confirming the feasibility of the suggested erase scheme.

Fig. 6

a Voltage scheme suggested to implement STDP in a NOR Flash cell exploiting CHEI and HHI (the \(\Delta t>0\) case is shown) and b the resulting experimental STDP waveform. c Evolution during the learning phase of the weights of the memory cells in the prototype SNN; the blue curve is the average of the weights belonging to the input pattern, while the red one is the average of the remaining ones. \(\copyright \) 2019 IEEE [21]

Once single-cell selectivity during erase is achieved, it is possible to implement STDP in the NOR Flash array, according to which the variation of \(w\) of each memory cell must depend only on the timing between the presynaptic spike and the postsynaptic one (\(\Delta t\)). To that purpose, the voltage scheme displayed in Fig. 6a was devised, exploiting HHI for erase and the classically adopted channel hot-electron injection (CHEI) [28] for program. A presynaptic spike triggers a double-triangular WL pulse of duration \(t_{WL}\); the postsynaptic spike, instead, results in the application to the BL of a rectangular pulse with amplitude equal to 4.5 V and duration much shorter than \(t_{WL}\), delayed by \(t_{WL}/2\) with respect to the fire event. According to such a scheme, if \(\Delta t>0\), the BL pulse is applied while \(V_{WL}\) is negative, thus triggering HHI that results in \(\Delta V_T<0\) (\(\Delta w>0\)). In the opposite case, CHEI results in \(\Delta V_T>0\) (\(\Delta w<0\)). The experimental STDP waveform resulting from the implementation of the scheme of Fig. 6a is reported in Fig. 6b, showing that the final weight \(w_f\) after a fire event depends on \(\Delta t\) similarly to what is observed in biological synapses [11, 17].
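The timing logic of this scheme can be sketched in a few lines of code. The sketch is illustrative and rests on assumptions that are not stated in the text: \(\Delta t\) is taken as the postsynaptic spike time minus the presynaptic one, the double-triangular WL pulse is modeled as a positive ramp over its first half followed by a negative ramp that returns to zero over its second half, amplitudes are normalized to 1, and the BL pulse is treated as instantaneous. The function names are hypothetical.

```python
def wl_voltage(t, t_wl):
    """Assumed double-triangular WL waveform, normalized amplitude.

    t is the time elapsed since the presynaptic spike.
    First half: rises linearly from 0 to +1.
    Second half: jumps to -1 and returns linearly to 0.
    """
    half = t_wl / 2.0
    if t < 0.0 or t > t_wl:
        return 0.0  # no WL pulse is present
    if t < half:
        return t / half
    return -(1.0 - (t - half) / half)


def stdp_mechanism(delta_t, t_wl):
    """Return the mechanism triggered by the BL pulse and the WL level.

    delta_t = t_post - t_pre (assumed sign convention). The BL pulse is
    delayed by t_wl / 2 with respect to the fire event.
    """
    t_in_pulse = delta_t + t_wl / 2.0
    v_wl = wl_voltage(t_in_pulse, t_wl)
    if v_wl < 0.0:
        return "HHI", v_wl   # erase: dVT < 0, dw > 0
    if v_wl > 0.0:
        return "CHEI", v_wl  # program: dVT > 0, dw < 0
    return "none", v_wl      # BL pulse falls outside the WL pulse
```

With this assumed waveform, the WL level seen by the BL pulse is largest in magnitude for small \(|\Delta t|\) and vanishes for \(|\Delta t| \ge t_{WL}/2\), which gives the qualitative shape expected of an STDP characteristic; the measured dependence is the one reported in Fig. 6b.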

Finally, starting from the STDP scheme of Fig. 6a, a prototype SNN with 8 input signals and 1 output was implemented and tested in a pattern learning problem. The input pattern was encoded in the activity of the input neurons, meaning that neurons that are part of the pattern continuously deliver input spikes, whereas the outputs of the other neurons are kept grounded. The input pattern is correctly learned by the SNN if the weights of the cells belonging to it are potentiated and the remaining ones are depressed, as demonstrated in Fig. 6c for the implemented SNN. Please refer to [21, 23] for a full discussion.

It is also worth mentioning that, when the employment of NOR Flash memory arrays in HNNs is considered, the impact of their non-idealities during \(V_T\)-tuning processes and their typical reliability issues must be carefully assessed. For example, in [22] the impact of program noise [29] and random telegraph noise [30, 31] on the performance of a neuromorphic digit classifier is investigated in detail. That analysis also provides some quantitative criteria to determine how far NOR Flash cells can be scaled when targeting neuromorphic applications.

4 Conclusions

In this chapter, the GIDL-assisted erase operation in 3-D NAND Flash memory arrays has been investigated by means of TCAD simulations, and a compact model to reproduce the evolution of \(I_{BL}\), \(V_B\) and the cell \(V_T\) has been presented. Thanks to its simplicity and accuracy, the model represents a valuable tool for the optimization of the array performance during GIDL-assisted erase. Then, a similar erase scheme has also been employed in NOR Flash memory arrays, exploiting BTBT-generated HHI to enable single-cell selectivity during erase and allowing the adoption of mainstream NOR Flash memory arrays in SNNs without any modification either to the cell or to the array design. The presented results pave the way towards the development of neuromorphic systems based on cost-effective and highly reliable memory arrays.