Implementation of non-uniform mesh in non-equilibrium Green’s function simulations of quantum cascade lasers

The non-equilibrium Green’s function formalism method is used to simulate electronic transport in a quantum cascade laser. Calculations are performed in a real-space basis defined by the grid points. Implementation of a non-uniform mesh greatly improves the effectiveness of the method, allowing for realistic mapping of the device structure while keeping the numerical load achievable for personal computers. The results of simulations obtained employing a non-uniform mesh are shown to fit the experimental data much better than those using constant grid sampling.


Introduction
The non-equilibrium Green's function (NEGF) formalism is a powerful semiconductor device simulation method, which allows for the simultaneous consideration of carrier scattering and quantum coherence. However, this method is highly demanding, both conceptually and computationally, so in order to model real devices, a few simplifications must be made. They must be introduced in order to keep the numerical load, in terms of both time and memory, at a size achievable by currently available computers. For example, existing approaches implementing NEGF to simulate quantum cascade lasers (QCLs) that involve the basis being cut to several quantum states per QCL period [1] do not fully resolve for in-plane momentum [1], and all limit the analysis to at most three device modules [2][3][4][5]. As QCLs are purely unipolar devices, they all use effective-mass Hamiltonians limited to at most two bands [6], so the effect of band mixing is also simplified. QCL NEGF simulators utilize either the eigenfunction-like basis [2] or the real-space basis [1]. The former uses several quantum states per QCL period and thus benefits from a relatively low numerical load, making this approach numerically efficient. Because of this, many important results, both theoretical and application-oriented, have been obtained with this approach [7][8][9][10][11]. The major simplification of the approach is that in reality, the basis is field-dependent, which is not taken into account when obtaining a self-consistent Schrödinger-Poisson solution. Using a real-space basis does not have this limitation. However, obtaining results with quantitative accuracy requires a very dense grid, making for a huge numerical load with this approach. This is a concern for both mid-infrared (mid-IR) devices, whose structures utilize fine layers of sub-nanometer width, and THz devices, which have a high well-tobarrier width ratio. In both cases, non-uniform sampling of device potential in the real space can significantly limit the size of the discretized Hamiltonian and make the method numerically efficient. Non-uniform sampling of real space within NEGF implementation has already been reported [12]. This study concerns ballistic, scattering-free transport in a high-electron-mobility transistor (HEMT). Our paper aims to implement a non-uniform mesh to study scattering transport in QCLs. We provide the detailed formulation for current density and scattering energies for the case of nonuniform mesh discretization (Sect. 3), which can also be used in other device-oriented NEGF solvers. In Sect. 4, the results obtained for uniformly and non-uniformly sampled THz QCL devices are compared in order to demonstrate the benefits one can gain when using smart non-uniform sampling of the device potential in real space.

Device and model
The THz QCL selected for the simulations was described in [13]. This laser emits radiation at 2.4 THz frequency and uses an indirect phonon-photon pump scheme to achieve the population inversion [14]. Each module of the cascade consists of four wells and four barriers, with the widths tuned to tenths-of-nanometer precision. The design utilizes a GaAs/ AlGaAs material system and contains layers with widths of 4.4/6.45/1.62/7.15/2.79 /10.455/0.6/4.965 nm, where the bold font indicates the barriers. The first well is delta-doped with Si to n s = 3.45 × 10 10 cm −2 near its center. The cascade in the THz QCL counts several hundred identical modules (260 for our device). Typically, to make the simulations feasible, 1-3 modules are considered, and the remainder are mimicked by appropriate boundary conditions. It is reasonable to assume that any position/energy-dependent quantity q remains unchanged after simultaneous space-energy shift, i.e., q(z + L, E, k) = q(z, E − eFL, k) , where L is the size of the single module and F is the average electric field due to the external bias. Such an assumption allows one to reduce the size of the structure, subjected to the calculations, to approximately one QCL module.
Precise mapping of the QCL layers in the growth direction requires the use of a grid with mesh size Δz of no more than the material's monolayer spacing. In the case of a uniform grid with Δz = 0.2 nm, one THz QCL module consists of N = 193 sites. The inclusion of several additional wells/barriers, which are necessary for the application of periodic boundary conditions, increases the number of sites to N = 256 , which makes it nearly impossible to obtain the solution in the self-consistent Born approximation and in acceptable time. At the same time, even for dense mapping, accurate thickness of the layers is not preserved. For instance, for Δz = 0.2 nm, the layer mapping error is approximately 1% . For Δz = 0.6 nm, the deviations from original layer widths can be as much as 10% . As discussed in the Introduction, a way to avoid these limitations is to keep the calculations in the real space and reduce the basis by non-uniform sampling. Our non-uniform sampling strategy is as follows: First, exact positions z int of the interfaces are found, and sites are located symmetrically to the left/right of z int in half-monolayer distance. The remaining space in each layer is then uniformly sampled, with spacing not exceeding some arbitrarily assumed maximum mesh size Δz max . This strategy allows one to save many sites per QCL module, keeping the widths of subsequent layers (and thus the size of the whole device) unchanged. In Table 1, the number of sites for different values of the parameters Δz max (non-uniform sampling) or Δz (uniform sampling) and the deviations of the QCL's intersubband transition energies from their reference values (calculated for a very dense uniform grid Δz = 0.005 nm) are gathered and compared. For the energy E 32 = E 3 − E 2 , which  Table 1 Energy level shifts from reference values (column 2) resulting from the change in discretization grid method for non-uniform and uniform grid cases (see Fig. 1  describes optical transition, the difference in the deviations for different grid size and sampling strategies can be huge: for example, for Δz max = 1 nm (non-uniform sampling), the change is almost 20 times smaller than for Δz = 0.6 nm (uniform sampling). If we further increase Δz max to 2 nm, the change is still four times smaller than for the Δz = 0.6 nm case. At the same time, the basis is almost twice as small as for uniform sampling. This improvement is achieved despite the fact that for the non-uniform finite difference scheme, the truncation error is only first order in Δz max ( O(Δz max ) ) as compared with second-order accuracy characteristic of a homogeneous grid. One must remember, however, that the increase in Δz max cannot go to arbitrary large values: It is well known that a onedimensional (1D) discrete lattice has a cosine dispersion a is the lattice constant, and m * is the effective mass. When a increases, the difference between the discretized (cosine) and continuous (parabolic) dispersions becomes large even for low energies. For this reason, a should be kept at a value that maintains the analysis close to the band edge. Numerically, it is equivalent to the condition E max << 4t , where E max is the maximum energy involved in the simulations. For example, for a GaAs conductor ( m * = 0.067m 0 ) and for a = 2 nm, the energy cosine bandwidth is 4t ≅ 0.57 eV. As can be seen in Fig. 2, a reasonable approximation can be obtained for energies lower than E max ≅ 0.2 eV ≈ 1.5t . This limitation has been retained for the maximum grid size, i.e., Δz max ≤ √ 0.75ℏ 2 ∕m * E max ≈ 0.2 min , where min is the particle minimum wavelength.

Theory
So far, all QCLs are purely unipolar n-type devices, so the model uses a single-band effective-mass Hamiltonian, which accounts for mixing with the valence bands through energy-dependent effective mass where E is the total energy, E c (z) and E g (z) are the conduction band edge and band gap profiles, and z is the growth direction. The in-plane dynamics are included by kinetic energy terms with the same energy-dependent effective mass. Such a choice, which is not ultimate, preserves the in-plane non-parabolicity, comparable to the results predicted by the 8-band k·p method [15]. The full non-interacting Hamiltonian is expressed as where the potential energy term V(z) = E c (z) + V sc (z) comprises the conduction band edge profile E c (z) and the electron mean field term V sc (z) , calculated self-consistently by the solution of the Poisson equation. As already mentioned, the calculations are made in real space, so the Hamiltonian of Eq. (1) is discretized and the grid points (non-uniformly spaced on the z-axis) define its base vectors; the discretized Hamiltonian can then be evaluated in the matrix form following [16] where discretization site i has coordinate z i , sites i, j are at the distance |z j − z i | ≡ z i,j , z i± = (z i±1 + z i )∕2 are halfway between z i and z i+1 or z i−1 and z i , and m(E, z i± ) ≡ 1 2 (m(E, z i±1 ) + m(E, z i )) . It stems from Eq. (2) that matrix is non-Hermitian. The Dyson and Keldysh equations take the forms [17] where is the unity matrix, and s and s are the selfenergy and Green's function matrices with the elements (i, j) linking the discretization sites at z i and z j . The matrix in Eq.  In our case, the Green's functions are per energy×length. The necessary integration over space must not be omitted, in order to preserve the consistency. For instance, for the quasi-elastic approximation of scattering with acoustic phonons, the formulation for the self-energy in [18] is expressed as where k B is the Boltzmann constant, v s , , D are the sound velocity, the density, and the deformation potential in the host material, and a is the grid size. In our formulation, the integration over the spatial coordinate cancels term a, so that the correct formula for this self-energy is where ,< are calculated from Eqs. (3) and (4). A similar, straightforward transformation can be applied to other selfenergies with diagonal-only entries, such as alloy disorder or interface roughness. For the interactions, which give nonlocal self-energies, i.e., ionized impurities and polar optical phonons, the transformations are less intuitive. Self-energies for polar optical phonons should be calculated as [18] and where n B is the Bose-Einstein factor for energy ℏ , = e 2 ℏ ( −1 ∞ − −1 0 )∕2 , and I lo is the integral calculated as ,< , For the impurity scattering, the self-energies are [4] where N D (z) is the ionized impurity concentration profile, is the angle between vectors k and q, q 0 is the inverse Debye screening length, and is the static dielectric constant. In the case of non-uniform sampling, the calculation of integrals I lo and I imp is associated with a huge increase in computational effort and use of computer memory. However, as these integrals do not change when equations are iterated, they can be executed only once, tabulated and stored in computer memory. This is one more advantage of the chosen basis.
With the NEGF method, boundary conditions are applied through the contact self-energies. In our approach, the contact self-energy matrix has only two nonzero elements, namely, for the device's left boundary where is the Green's function of the (uncoupled) lead [19], and LD , DL are coupling elements. The method for calculating the function , which imitates the boundary conditions appropriate for cascade structures, was described in [5,20]. With a non-uniform grid, the coupling elements are calculated according to Eq. (2), i.e., where a L ≡ |z 0 − z −1 | , a 0 ≡ |z 1 − z 0 | ( z 0 , z −1 are the coordinates of the first and second site in the lead adjacent to the device), and m L = m(z 0 ) , m 1 = m(z 1 ) . The right contact self-energy N,N can be similarly calculated. With the diagonal elements of the Green's functions matrices, the spatial momentum-resolved densities of states (DOS) and densities of electrons (DOE) can be calculated as [19] (9) k)).

3
The integration over gives z-resolved DOS and DOE. The formulation for current density requires reformulation with respect to [17]. The particle current flowing between internal site i and i + 1 (for some E, ) [9] is expressed as where A is the cross-sectional area. Optical gain can be calculated with the perturbation method of [7], adopted for the case of energy-dependent effective mass [21]. In that approach, the absorption coefficient is obtained from the real part of the complex conductivity, i.e., ≅ ℜ( ( ))∕c 0 √ r (the symbols have the usual meanings). The complex conductivity is obtained as where J is the current perturbation and F is the external radiation field. The current perturbation is calculated as where is the perturbing potential, which can be calculated in the Coulomb gauge as = −eF( ) ∕ℏ , ≡ −1 ( − ) , and is the diagonal matrix ( i,i = z i ). The perturbed lesser Green's function < in the first approximation is given by [7]

Results
In order to illustrate the advantages of the method that uses non-uniform mesh, calculations for the THz QCL were carried out using uniform and non-uniform mesh. In the case of uniform sampling, Δz = 0.6 nm was assumed, which gave 87 sites of the simulation model. For non-uniform sampling, several values of Δz max = 0.8, 1, 1.2 nm were tried. In Table 2, the current density and the maximum gain peak, calculated for the field F = 19.8 kV/cm, and different values of the parameters Δz max are gathered and compared. As can be seen, the result "saturates" for Δz max ≥ 1 nm, so the latter was used in the following simulations. For Δz max = 1 nm, the model counts 74 sites. For the discretization in E and k 2 spaces, the uniform grids were assumed with spacings of dE = dE k = ℏ 2 dk 2 ∕2m * = 1 meV. There were also included scattering self-energies for the interactions with (1) acoustic phonons in energy-averaged approximation [4], (2) screened, dispersionless optical phonons in the full non-local approximation (all off-diagonal elements of self-energy included) [18], (3) alloy disorder, (4) interface roughness with a Gaussian correlation function with correlation length Λ = 9 nm and asperity height Δ = 0.19 nm [4], and (5) ionized impurities [4]. Calculations were performed for the temperature T = 50 K. The NEGF-Poisson solver [5]  The results of the simulations are presented in Figs. 3 and 4. In Fig. 3, the current-voltage characteristics for uniform and non-uniform grid sampling are shown and compared with the experiment [13]. One can see that the characteristics for non-uniform grid sampling are much closer to the experimental data than those for a uniform grid.
A comparison of the simulated gain for non-uniform and uniform grid cases is shown in Fig. 4. In both cases, the gain was calculated for the value of the field in which the gain peak reached its maximum value, i.e., 19.8 and 19.2 kV/cm for non-uniform and uniform sampling, respectively. For the non-uniform grid case, the gain peak appears at 2.67 THz ( h = 11 meV) and reaches a value of ≈ 60 cm −1 , which slightly exceeds all optical losses [13]. This frequency is  [13] quite close to the experimental lasing frequency value of 2.4 THz. In the case of a uniform grid, a distinct peak of the gain cannot be observed. In this case, a broad gain peak ( ≈ 40 cm −1 ) appears at 1.93 THz ( h = 8 meV), which is significantly lower than the experimental laser frequency. Moreover, the gain does not reach losses, so lasing is not predicted with this model. The theoretical multi-valley I-V characteristic calculated with our implementation of the NEGF method is not observed in the experiment, which qualitatively reproduces the simulated behavior only for the lowest and the highest fields. For the medium-value fields, a plateau rather than oscillatory behavior is observed. This discrepancy is most probably caused by the evolving electric field domains [22]. In structures like the THz QCL, such domains are very likely to form due to the incomplete relaxation of carrier energy within one module [4]. Another difference between the simulated and experimental I-V curves is the shift of calculated resonance peaks to lower voltages. This shift can be attributed at least in part to the voltage drop observed at the Schottky-like junction between the top metal and top n + GaAs contact layer [22]. With respect to other simulation methods, it is worth mentioning that our I-V characteristic is in excellent agreement with the NEGF simulation performed with a Wannier state basis [13]. Better agreement between simulated and experimental I-V curves was achieved with the rate equation (RE) model combined with the density matrix (DM) method [13]. However, the gain peak achieved at 2.9 THz with the RE/DM method agrees substantially less with the experimental data than our value. To our knowledge, no other simulation methods used to model QCLs (see Refs. [1,23,24] for recent reviews) have been investigated for this particular design.

Conclusion
In this paper, the technical details for simulations of quantum cascade lasers in real space with a non-uniform mesh and non-equilibrium Green's function method are presented. With this approach, accurate mapping of the structure of the simulated device onto its numerical model is preserved. The results of the simulations thus fit the experimental data much better than those that use constant grid sampling. In addition, this method allows the size of the numerical task to be reduced, enabling the achievement of satisfactory results much more quickly.