Ab initio calculation of valley splitting in monolayer δ-doped phosphorus in silicon

The differences in energy between electronic bands due to valley splitting are of paramount importance in interpreting transport spectroscopy experiments on state-of-the-art quantum devices defined by scanning tunnelling microscope lithography. Using vasp, we develop a plane-wave density functional theory description of systems which is size limited due to computational tractability. Nonetheless, we provide valuable data for the benchmarking of empirical modelling techniques more capable of extending this discussion to confined disordered systems or actual devices. We then develop a less resource-intensive alternative via localised basis functions in siesta, retaining the physics of the plane-wave description, and extend this model beyond the capability of plane-wave methods to determine the ab initio valley splitting of well-isolated δ-layers. In obtaining an agreement between plane-wave and localised methods, we show that valley splitting has been overestimated in previous ab initio calculations by more than 50%.


I. INTRODUCTION
Quantum devices in silicon have been the subject of concentrated interest, both experimental and theoretical, in recent years.Efforts to make such devices have led to atomically precise fabrication methods which incorporate phosphorus atoms in a single monolayer of a silicon crystal [1][2][3][4] .These dopant atoms can be arranged into arrays 5 , or geometric patterns for wires 6 and associated tunnel junctions 7 , gates, and quantum dots 8,9 -all of which are necessary components of a functioning device 10 .The patterns themselves define atomically abrupt regions of doped and undoped silicon.While silicon, bulk-doped silicon, and the physics of the phosphorus incorporation 11 are well-understood, models of this quasi-two-dimensional phosphorus sheet are still in their initial stages.In particular, it is critical in many applications to understand the effect of this confinement on the conduction band valley degeneracy, inherent in the band structure of silicon.For example, the degeneracy of the valleys has the potential to cause decoherence in a spin-based quantum computer 12,13 , and the degree of valley degeneracy lifting (valley splitting) defines the conduction properties of highly confined planar quantum dots 10 .
The importance of understanding valley splitting in monolayer δ-doped Si:P structures has led to a number of theoretical works in recent years spanning several techniques, from pseudopotential theories via planar Wannier orbital (PWO) bases 14 , density functional theory (DFT) via linear combination of atomic orbital (LCAO) bases 15,16 , to tight-binding (TB) models [17][18][19][20][21] , and effective mass theories (EMT) [22][23][24] .We note that several of these papers are based upon the assumption that the effective masses of δ-doped P in Si remain unchanged from bulk-doped values; 22,23 an assumption which has been challenged. 14,17Others assume doping over a multiatomic plane band 17,22 which no longer represents the state of the art in fabrication.We also note that Ref. 15 represents the first attempt to model these devices by considering explicitly doped δ-layers with DFT, using a relatively small localized basis set with the assumption that a basis set sufficient to describe bulk silicon would also adequately describe P-doped Si.There is currently little agreement between the valley splitting values obtained using these methods, with predictions ranging between 5 to 270 meV, depending on the arrangement of dopant atoms within the δ-layer.Density functional theory has been shown to be a useful tool in predicting how quantum confinement and/or doping perturbs the bulk electronic structure in silicon-and diamond-like structures, [25][26][27][28][29] and it might be expected that the removal of the basis set assumption will lead to the best estimate of the valley splitting available.
In this paper we determine a consistent value of the valley splitting in explicitly δ-doped structures by obtaining convergence between distinct DFT approaches in terms of basis set and system sizes.We perform a comparison of DFT techniques, involving localized numerical atomic orbitals and delocalized plane-wave (PW) basis sets.Convergence of results with regard to the amount of Si "cladding" about the δ-doped plane is studied.This corresponds to the normal criterion of supercell size, where periodic boundary conditions may introduce artificial interactions between replicated dopants in neighboring cells.A benchmark is set via the delocalized basis for DFT models of δ-doped Si:P against which the localized basis techniques are assessed.Implications for the type of modeling being undertaken are discussed, and the models extended beyond those tractable with plane-wave techniques.Using these calculations, we obtain converged values for properties such as bandstructures, energy levels, valley splitting, electronic densities of state and charge densities near the δ-doped layer.
The paper is organized as follows: Sec.II outlines the parameters used in our particular calculations; we present the results of our calculations in Sec.III; and conclusions are drawn in Sec.IV.An elucidation of effects modifying the bulk bandstructure follows in App.A & B to provide a clear contrast to the properties deriving from the δ-doping of the silicon discussed in the paper.

II. METHODOLOGY
Density functional theory calculations have been carried out using both plane-wave and LCAO basis sets.For the plane-wave (PW) basis set, the Vienna ab initio simulation package (vasp) 30 software was used with projector augmented wave (PAW) 30,31 pseudopotentials for Si and P. Due to the nature of the plane-wave (PW) basis set, there exists a simple relationship between the cutoff energy and basis set completeness.For the structures considered in this work, the calculations were found to be converged for PW cutoffs of 450 eV.
Localized basis set calculations were performed using the Spanish Initiative for Electronic Simulations with Thousands of Atoms (siesta) 32 software.In this case, the P and Si ionic cores were represented by norm-conserving Troullier-Martins pseudopotentials. 33he Kohn-Sham orbitals were expanded in the default single-ζ polarized (SZP) or double-ζ polarized (DZP) basis sets, which consist of 9 and 13 basis functions per atom respectively.Both the SZP and DZP sets contain s-, p-, and d-type functions.These calculations were found to be converged for a mesh grid energy cutoff of 300 Ry.In all cases, the generalized gradient approximation (GGA) PBE 34 exchange-correlation functional was used.
The lattice parameter for bulk Si was calculated using an 8-atom cell, and found to be converged for all methods with a 12×12×12 Monkhorst-Pack (MP) k -point mesh 35 .The resulting values are presented in Table I, and were used in all subsequent calculations.In modeling δ-doped Si:P, as used in Ref. 10, we adopted a tetragonal supercell description of the system, akin to that of Refs.14 & 15.In accordance with experiment, we inserted the P layer in a monatomic (001) plane as one atom in four to achieve 25% doping.This will henceforth be referred to as 1/4 monolayer (ML) doping.In this case, the smallest repeating in-plane unit had four atoms per monolayer (to achieve 1 in 4 doping), and was a square with sides parallel to the [110] and [ 110] directions.The square had side length a √ 2 (see Fig. 1), where a is the simple cubic lattice constant of bulk silicon.The phosphorus layers had to be separated by a considerable amount of silicon due to the large Bohr radius of the hydrogen-like orbital introduced by P in Si (∼2.5 nm).Ref. 15 showed that this far exceeded the sub-nanometer cell side length.If desired, cells with a lower in-plane density of dopants may be constructed, by lengthening the cell in the x-and y-directions, such that more Si atoms occupy the doped monolayer in the cell.A collection of tetragonal cells comprised of 4, 8, 16, 32, 40, 60, 80, 120, 160 and 200 monolayers were constructed, having four atomic sites per monolayer and oriented with faces in the [110], [ 110] and [001] directions (see Fig. 2).Cells used in PW calculations began at 4 layers and ran to 80 layers; larger cells were not computationally tractable with this method.SZP and DZP models began at 40 layers to overlap with PW for the converging region, and were then extended to their tractable limit (200 and 160 layers, respectively) to study convergence past the capability of PW.For the tetragonal cells the k -point sampling was set as a 9 × 9 × N Γ-centred MP mesh, as we have found that failing to include Γ in the mesh can lead to anomalous placement of the Fermi level on bandstructure diagrams.N varied from 12 to 1 as the cells became more elongated (see Table IV

in App. A).
Although it has been previously found that relaxing the positions of the nuclei gave negligible differences (< 0.005 Å) to the geometry, 15 this was for a 12-layer cell and may not have included enough space between periodic repetitions of the doping plane for the full effect to be seen.We have performed a test relaxation on a 40layer cell using the PW basis (vasp).The maximum subsequent ionic displacement was 0.05 Å, with most being an order of magnitude smaller.The energy gained in relaxing the cell was less than 37 meV (or 230 µeV/atom).We therefore regarded any changes to the structure as negligibly small, and proceeded without ionic relaxation.
Single-point energy calculations were carried out with both software programs; for vasp the electronic energy convergence criterion was set to 10 −6 eV, and the tetrahedron method with Blöchl correction 36 was used.For siesta a two-stage process was carried out: Fermi-Dirac electronic smearing of 300 K was applied in order to converge the density matrix within a tolerance of 1 part in 10 −4 ; the calculation was then restarted with smearing of 0 K and a new electronic energy tolerance criterion of 10 −6 eV applied (except for the 120-and 160-layer DZP models for which this was intractable; a tolerance of 10 −4 eV was used in these cases).This two-stage process aided convergence as well as ensuring that the energy levels obtained were comparably accurate across methods.In addition, for each doped cell thus developed and studied, an undoped bulk Si cell of the same dimensions was constructed to aid in isolating those features primarily due to the doping.

A. Analysis of bandstructure
Once converged charge densities were obtained, bandstructures were calculated along the M -Γ-X highsymmetry pathway (as shown in Fig. 10 in Appendix A), using at least 20 k -points between high-symmetry points.For comparative purposes, the bandstructures have all been aligned at the valence band maximum (VBM).
Figure 3 contrasts bulk and doped bandstructures for the 40-layer PW calculation.DZP and SZP results are similar on this scale and are omitted in the interest of clarity in the diagram.As discussed in App.B, it is evident from the bulk values that the elongated cells have led to the folding of two CBM valleys towards the Γpoint.Also visible is the difference the doping potential makes to the system; what was the lowest unoccupied orbital in the bulk is now dragged down in energy by the extra ionic potential.It is of note that the region near Γ, corresponding to the k z valleys which can be modeled as having different effective masses to the k x,y valleys, 14 is brought lower than the region corresponding to the k x,y valleys and is non-degenerate.The second band behaves in a similar fashion.The third band appears to maintain a minimum away from the Γ-point in the Σ TET -direction (which is equivalent to the ∆ FCC -direction; see App.A), but in a less-parabolic fashion than the lower two; its minimum is similar to the value at Γ.This band is nondegenerate along this particular direction in k-space, but due to the supercell symmetry it is actually 4-fold degenerate, in contrast to the other bands.The Fermi level for the doped system is also shown, clearly being crossed by all three of these bands which are therefore able to act as open channels for conduction.As mentioned above, the bandstructures are similar across all methods, but upon detailed inspection important differences come to light.A closer look at the ∆ band shows a qualitative difference between the predic-tions using SZP (Fig. 4c) and the PW and DZP results (Figs. 4a & 4b): the models with a more complete basis predict the band minimum to occur in the Σ TET (∆ FCC ) direction, below the value at Γ, while the SZP bandstructure shows the reverse -the minimum at Γ, a similar amount below a secondary minimum in the Σ TET direction.The difference between the energies of the first two band minima (Γ 1 -Γ 2 , illustrated in Fig. 5), or the valley splitting, from the PW and DZP calculations agree with each other to within ∼6 meV.Significantly, the value obtained using the SZP basis set differs by 52 meV, some 55% larger than the value obtained using the PW basis set.The importance of this discrepancy cannot be overstated; this valley splitting is directly relatable to experimentally observable resonances in transport spectroscopy of devices made with this δ-doping technology (see Ref. 10).
In the smallest cells (< 16 layers), less than three bands are observed.This is likely due to the lack of cladding in the z-direction, leading to significant interaction between the dopant layers, raising the energy of each band.Whilst the absolute energy of each level still varies somewhat, even with over 100 layers incorporated, we find that the Γ 1 -Γ 2 values are well-converged with 80 layers of cladding for all methods (see Fig. 5).Indeed, they may be considered reasonably converged even at the 40-layer level (0.5 meV or less difference to the largest models considered).The differences between the energies of the second and third band minima (Γ 2 -∆ splittings) are also shown in Fig. 5, and show good convergence (within 1 meV) for cells of 80 layers or larger.
The Fermi level follows a similar pattern to the Γ-and ∆-levels.In particular, the gap between the Fermi level and Γ 1 level does not change by more than 1 meV from 60 to 160 layers.
Given that the properties of interest are the differences between the energy levels, rather than their absolute values (or position relative to the valence band), in the interest of computational efficiency we observe that using the DZP basis with 80 layers of cladding is sufficient to achieve consistent, converged results.

B. Valley splitting
Table II summarizes the valley splitting values of 1/4 ML P-doped silicon obtained using different techniques, showing a large variation in the actual values.In order to make sense of these results, it is important to note two major factors that affect valley splitting: the doping method and the arrangement of phosphorus atoms in the δ-layer.As the results from Ref. 16 show, the use of implicit doping causes the valley splitting value to be much smaller than in an explicit case (∼7 meV vs. 120 meV).It is also shown that the use of random P coverage on the δlayer reduces the valley splitting value by only 40-50 meV compared to the fully ordered placement, leaving a large discrepancy between the valley splitting results from implicit and explicit doping.This massive decrease in valley splitting due to implicit doping can be explained by the smearing of the doping layer in the direction normal to the δ-layer, thereby decreasing the quantum-confinement effect responsible for breaking the degeneracy in the system.Ref. 16 also shows that the arrangement of the phosphorus atoms in the δ-layer strongly influences the valley splitting value.In particular, they showed that there is a difference of up to 220 meV between P doping along the [110] direction and along the [100] direction.It must be noted, however, that most of their patterns are not yet physically realizable due to the P incorporation mechanism currently employed.Our results show that valley splitting is highly sensitive to the choice of basis set.Due to the nature of PW basis set, it is straightforward to improve its completeness by increasing the plane wave cutoff energy.In this way, we establish the most accurate valley splitting value within the context of density functional theory.Using this benchmark value, we can then establish the validity and accuracy of other basis sets, which can be used to extend the system sizes to that beyond what is practical using PW basis set.As seen in Table II, the valley splitting value converges to 93 meV using 80-layer cladding.The DZP localized basis set gives an excellent agreement at 99.5 meV using 80-layer cladding (representing a 7% difference).On the other hand, the SZP localized basis set (similar to what was used in Refs.15 and 16) gave a value of 145 meV using the same amount of cladding.This represents a significant difference of 55% over the value obtained using PW basis set, and demonstrates that the SZP basis set is unsuitable for accurate determination of valley splitting in these systems.

C. Density of states
The electronic density of states (eDOS) was calculated for each cell.Figure 6 compares the unscaled eDOS for bulk 80-layer cells to that of doped cells varying from 40 to 80 layers.The bulk bandgap is visible, with the conduction band rising sharply to the right of the figure.The doped eDOS exhibits density in the bulk bandgap, although the features of the spectra differ slightly according to the basis set used.The Fermi energy exhibits convergence with respect to the amount of cladding, as reported above.It is also notable that the eDOS within the bandgap are nearly identical regardless of the cell length (in z).This indicates that layer-layer interactions are negligibly affecting the occupied states, and therefore that the applied "cladding" is sufficient to insulate against these effects.

D. Electronic width of the plane
In order to quantify the extent of the donor-electron distribution, we have integrated the local density of states (LDOS) between the VBM and Fermi level and taken the planar average with respect to the z-position.Figure 7 shows the planar average of the donor electrons (a sum of both spin-up and spin-down channels) for the 80-layer cell calculated using the DZP basis set.After removing the small oscillations related to the crystal lattice to focus on the physics of the δ-layer, by Fourier transforming, a Lorentzian function was fitted to the distribution profile.(Initially, a three-parameter Gaussian fit similar to that used in Ref. 24 was tested, but the Lorentzian gave a better fit to the curve.)Table III summarizes the maximum donor-electron density and the full width at half maximum (FWHM) for the 1/4 ML doped cells, each calculated from the Lorentzian fit.Both of these properties are remarkably consistent with respect to the number of layers, indicating that they have converged sufficiently even at 40 layers.
Our results differ from a previous DFT calculation 16 which cited a FWHM of 5.6 Å for a 1/4 ML doped, 80-layer cell calculated using the SZP basis set (and 10×10×1 k-points).We note that those values were taken from the unfitted, untransformed donor-electron distribution, and represent a ∼15% underestimation of the DZP result.

IV. CONCLUSION
In this article, we have studied the valley splitting of monolayer δ-doped Si:P, using a density functional theory model with a plane-wave basis to establish firm grounds for comparison with less computationally-intensive localized basis ab initio methods.We found that the best current descriptions of these systems (by density functional theory, using SZP basis functions) overestimate the valley splitting by over 50%, due to an assumption made early in their methodology.We show that DZP basis sets are complete enough to deliver values within 10% of the plane-wave values, and due to their localized nature, are capable of calculating the properties of models twice as large as is tractable with plane-wave methods.These DZP models are converged with respect to size well before their tractable limit, which approaches that of SZP models.
Valley splittings are important in interpreting transport spectroscopy experiment data, where they relate to families of resonances, and in benchmarking other theoretical techniques more capable of actual device modeling.It is therefore pleasing to have an ab initio description of this effect which is fully-converged with respect to basis completeness, as well as the usual size effects and k-point mesh density.
We have also studied the bandstructures with all three methods, finding that the DZP correctly determines the ∆-band minima away from the Γ point, where the SZP method does not.We show that these minima occur in the Σ direction for the type of cell considered, not the ∆ direction as has been previously reported.Having established the DZP methodology as sufficient to describe the physics of these systems, we then calculated the electronic density of states, and the electronic width of the δlayer.We found that previous SZP descriptions of these layers underestimate the width of the layers by almost 15%.
We have shown that the properties of interest of δdoped Si:P are well-converged for 40-layer supercells using a DZP description of the electronic density.We recommend the use of this amount of surrounding silicon, and technique, in any future DFT studies of these and similar systems -especially if inter-layer interactions are to be minimized.
Appendix A: Subtleties of bandstructure Regardless of the type of calculation being undertaken, a bandstructure diagram is inherently linked to the type (shape and size) of cell being used to represent the system under consideration.For each of the 14 Bravais lattices available for three-dimensional supercells, a particular Brillouin zone (BZ) with its own set of high-symmetry points exists in reciprocal space 37 .Similarly, each BZ has its own set of high-symmetry directions.Some of these BZs share a few high-symmetry point labels (or directions), such as X or L (∆ or Σ), and they all contain Γ, but these points are not always located in the same place in reciprocal space.
A simple effect of this can be seen by increasing the size of a supercell.This has the result of shrinking the BZ, and the coordinates of high-symmetry points on its boundary, by a corresponding factor.Consider the conduction band minimum (CBM) found at the ∆ valley in the Si conduction band.This is commonly located at k 0 ∼ 0.85 2π a in the ∆ direction towards X.Should we increase the cell by a factor of 2, the BZ will shrink (BZ→BZ ′ ), placing the valley outside the new BZ boundary (past X ′ ); but a valid solution in any BZ must be a solution in all BZs.This results in the phenomenon of band folding, whereby a band continuing past a BZ boundary reenters the BZ on the opposite side.Since the X direction in a face-centred cubic (FCC) BZ is 6-fold symmetric, a solution near the opposite BZ boundary is also a solution near the one we are focussing on.This results in the appearance that the band continuing past the BZ boundary is "reflected", or folded, back on itself into the first BZ.Since the new BZ boundary in this direction is now at k ′ BZ = X ′ = 0.5 2π a , the location of the valley will be at a , as mentioned in Ref. 15.Each further increase in the size of the supercell will result in more folding (and a denser bandstructure).Care is therefore required to distinguish between a new band and one which has been folded due to this effect when interpreting bandstructure.
Continuing with our example of silicon, whilst the classic bandstructure 38 is derived from the bulk Si primitive FCC cell (containing two atoms), it is often more convenient to use a simple cubic (SC) supercell (8 atoms) aligned with the 100 crystallographic directions.In this case, we experience some of the common labelling; the ∆ direction is defined in the same manner for both BZs, although we see band folding (in a similar manner to that discussed above) due to the size difference of the reciprocal cells (see Fig. 8).We also see a difference in that although the Σ direction is consistent, the points at the BZ boundaries have different symmetries and therefore, labels (K FCC , M SC ).(The L FCC -point and Λ FCC -direction have no equivalent for tetragonal cells, and hence we do not consider bandstructure in that direction here) Consider now the δ-doping case discussed above (see Sec. II), where we wish to align our cell with the [110]   and [ 110] directions (by rotating the cell 45 • anticlockwise about z; this will also require a resizing of the cell in the plane to maintain periodicity -see Fig. 9), to allow us to include precisely four atoms per monolayer (as required for the minimal representation of 1/4 ML doping).We now have a situation where the X TET point in the new tetragonal BZ (see Fig. 10) is no longer in the direction of the X SC point in the simple cubic BZ, despite both X points being in the centre of a face of their BZ.Due to the rotation, what was the ∆ SC direction in the simple cubic BZ is now the Σ TET direction (pointing towards M , at the corner of the BZ in the k z = 0 plane) in the tetragonal BZ.The tetragonal CBM, while physically still the same as the CBM in the FCC or simple cubic BZ, is not represented in the same fashion (see Fig. 11).(BZ) shrinks along k z (see App. A).This has the effect of shifting the conduction band minima in the ±k z directions closer and closer to the Γ point (see Fig. 8(a)) and making the bandstructure extremely dense when plotting along k z .This results in the value of the lowest unoccupied eigenstate at Γ being lowered as what were originally other sections of the band are successively mapped onto Γ, and after a sufficient number of folds the value at Γ is indistinct from the original conduction band minimum (CBM) value.The effects of this can be seen in Table IV, which describes increasingly elongated tetragonal cells of bulk Si.When we then plot the bandstructure in a different direction, e.g.along k x , the translation of the minima from ±k z onto the Γ-point appear as a new band with two-fold degeneracy.The degeneracy of the original band drops from 6-to 4-fold, in line with the reduced symmetry (we only explicitly calculate one, and the other three occur due to symmetry considerations).This is the origin of the "Γ-bands" discussed in Refs.14 & 15.Once the k z valleys are sited at Γ, parabolic dispersion corresponding to the transverse kinetic energy terms is observed along k x and k y , at least close to the band minimum (see Fig. 11).
All methods considered in Table IV show the LUMO at Γ (folded in along ±k z ) approaching the CBM value as the amount of cladding increases; at 80 layers, the LUMO at Γ is within 1 meV of the CBM value.It is also of note that the PW indirect bandgap agrees well with the DZP value, and less so with the SZP model.This is an indication that, although the behaviour of the LUMO with respect to the cell shape is well-replicated, the SZP basis set is demonstrably incomplete.Conversely, pairwise comparisons between the PW and DZP results show agreement to within 5 meV.
It is important to distinguish effects indicating convergence with respect to cladding for doped cells (i.e.elimination of layer-layer interactions) from those mentioned above which derive from the shape and size of the supercell.Strictly, the convergence (with respect to the amount of encapsulating Si) of those results we wish to study in detail, such as the differences in energy between occupied levels in what was the bulk bandgap, provide the most appropriate measure of whether sufficient cladding has been applied.

FIG. 1 .
FIG. 1. (001) planar slice of the c (2 × 2) structure, at the 1/4 ML doped monolayer.One of the Si sites has been replaced by a P atom (shown in dark gray).The periodic boundaries are shown in black.

FIG. 6 .
FIG.6.Electronic densities of states for tetragonal systems with 0 and 1/4 ML doping, calculated using the DZP (siesta) basis set.The Fermi level is indicated by a solid vertical line with label.50 meV smearing was applied for visualization purposes.

FIG. 7 .
FIG. 7. Planar average of the donor-electron density as a function of z-position for the 1/4 ML doped, 80-layer cell calculated using the DZP basis set.The fitted Lorentzian function is also shown.

TABLE IV .
Energy levels of tetragonal bulk Si structures.(For details of calculation parameters, see Sec.II)