On the creation of near-surface nitrogen-vacancy centre ensembles by implantation of type Ib diamond

Dense, near-surface (within ∼10 nm) ensembles of nitrogen-vacancy (NV) centres in diamond are moving into prominence as the workhorse of many envisaged applications, from the imaging of fast-fluctuating magnetic signals to enacting nuclear hyperpolarisation. Unlike their bulk counterparts, near-surface ensembles suffer from charge stability issues and reduced formation efficiency due to proximity to the diamond surface. Here we examine the prospects for creating such ensembles by implanting nitrogen-rich type Ib diamond, aiming to exploit the high bulk nitrogen density to combat surface-induced band bending. This approach has previously been successful at creating deeper ensembles; however, we find that in the near-surface regime there are fewer benefits over nitrogen implantation into pure diamond substrates. Our results suggest that control over diamond surface termination during annealing is key to successfully creating high-yield near-surface NV ensembles generally, and implantation into type Ib diamond may be worth revisiting once that has been accomplished.


I. INTRODUCTION
Shallow nitrogen-vacancy (NV) centres in diamond have been shown to be useful as sensors of weak fluctuating magnetic signals [1-9] and as a potential vehicle for enabling hyperpolarisation of nuclear spins external to the diamond [10-13]. Much work to date has focussed on the use (and production) of near-surface single NVs, which can sometimes exist in the required negative charge state within a few nanometres of the surface in spite of the unfavourable local Fermi level position [14]. Increasingly, however, applications such as scaled-up hyperpolarisation [15,16] and imaging of AC magnetic fields [7,17-21] demand high-density ensembles of stable near-surface NVs. When sampling a large number of NVs, the impact of surface-induced band bending becomes clear, with the NV$^-$ depth distribution cutting off at 6-7 nm from the surface [15,22]. Combined with the expectation that vacancies produced by near-surface implants, which are required to form NV centres during a subsequent annealing process, will tend to out-diffuse to the surface [23,24], this means that NV$^-$ yields in such ensembles are much lower than in their bulk counterparts.
* alexander.healey@unimelb.edu.au
In the bulk-like regime (mean ensemble depth $d_{\mathrm{NV}}$ of order 100 nm or more), where the creation of high-density NV ensembles is comparatively well developed, it has been shown that starting with N-rich (type Ib) diamond grown via the high-pressure high-temperature (HPHT) method and implanting with arbitrary ions is successful in producing a well-localised sensing layer with quantum properties that compete well with other methods based on high-quality chemical vapour deposition (CVD) growth [25-28]. Extending these results to the near-surface regime is attractive due to the relative cost-efficiency and accessibility of this technique. Additionally, one may wonder whether the high nitrogen density of the bulk crystal is effective in combating the near-surface band bending. However, the issue of vacancy diffusion during annealing looms as an impediment to high-yield ensemble formation: at typical annealing temperatures (800-900 °C), various studies have shown that the vacancy diffusion length may be as large as 300 nm [24,29,30]. Using a random walk model, Räcke et al. [24] showed that the reduced near-surface NV yield typically observed is largely explained by vacancy out-diffusion, even without taking the surface to be a vacancy attractor. Diffusion into the diamond is also problematic, as the envisaged applications require that NVs be confined within ≈10 nm of the surface, although in this case we would hope that the actual diffusion length falls well short of the theoretical upper bound due to substitutional N acting as an efficient vacancy sink.
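The competition between out-diffusion to the surface and capture by substitutional nitrogen can be illustrated with a toy Monte Carlo walk. This is a minimal sketch and not the model of Räcke et al. [24]: it assumes a one-dimensional lattice, a surface that acts as a perfect sink, and a hypothetical per-step capture probability `p_trap` standing in for the native N density.

```python
import random

def trapped_fraction(start_depth, p_trap, n_walkers=2000, seed=0):
    """Fraction of vacancies captured by nitrogen before reaching the surface.

    Toy 1D lattice walk: depth is in lattice steps, depth 0 is the surface
    (assumed to be a perfect sink), and p_trap is the assumed per-step
    probability of capture by a substitutional N.
    """
    rng = random.Random(seed)
    trapped = 0
    for _ in range(n_walkers):
        z = start_depth
        while z > 0:
            if rng.random() < p_trap:  # captured: vacancy stays in the diamond
                trapped += 1
                break
            z += 1 if rng.random() < 0.5 else -1  # unbiased step
    return trapped / n_walkers

# Hypothetical starting depths (in lattice steps) and capture probability
shallow = trapped_fraction(start_depth=2, p_trap=0.01)
deep = trapped_fraction(start_depth=50, p_trap=0.01)
print(f"trapped fraction: shallow={shallow:.2f}, deep={deep:.2f}")
```

In this toy picture, vacancies created within a few steps of the surface are mostly lost, while those created deeper are almost all captured, which is the qualitative trend the near-surface yield reduction reflects.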
In this study, we examine the merits of creating near-surface NV ensembles through ion implantation of commercially sourced type Ib HPHT diamond, in view of the factors outlined above. By implanting diamonds containing distinct growth sectors (each with a characteristic native N density), we are able to control for the effect of the bulk N density and determine the role it plays in ensemble surface proximity and yield. We also implant at multiple depths (set by the implant energies) and with two levels of vacancy production (given by the implantation dose and atomic species) to assess the practical role of vacancy diffusion in the high-N regime. The quality of the ensembles produced is assessed by measuring the NV yield and quantum coherence. We conclude with a discussion of the limitations of the study and the prospects for future work.

A. Diamond preparation
A series of type Ib HPHT diamond substrates (purchased from Delaware Diamond Knives) containing sectors with varying levels of native nitrogen were subjected to ion implantation processes (InnovION) to form NV ensembles. To control for as many variables as possible and ensure comparisons between sectors are valid, only two different diamonds were used (initial size 4 × 4 × 0.1 mm). These diamonds were then laser cut into smaller pieces to undergo different preparations. The implant parameters were chosen to create vacancy profiles peaking at approximately 3, 4, and 5 nm from the diamond surface. The vacancy profiles were predicted using Stopping and Range of Ions in Matter (SRIM) simulations, shown in Fig. 1(a). Neglecting charge state and vacancy diffusion considerations, we expect to produce a uniform NV layer of width $w_{\mathrm{SRIM}}$, which extends from the surface to the depth where the vacancy production decreases below 50 ppm [dotted line in Fig. 1(a)], an approximate NV creation saturation threshold previously identified in the bulk regime [28]. SRIM simulations assume an amorphous substrate, so we implanted our samples with a sample tilt of 7° to minimise ion channelling. The first set of implants chosen to meet these criteria were $^{16}$O at a dose of $1\times10^{12}$ cm$^{-2}$ at energies of 2.5, 4, and 6 keV respectively. A second set of implants, designed to produce an order of magnitude more vacancies with similar depth profiles, were $^{31}$P implants at a dose of $5\times10^{12}$ cm$^{-2}$ with energies of 4, 7, and 11 keV respectively. In both cases the implant species were chosen to be electron donors to the diamond lattice in an attempt to further offset the band bending from the surface.
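The definition of the layer width can be sketched numerically. The example below is illustrative only: the vacancy profile is a hypothetical Gaussian-like curve rather than SRIM output, the function names are invented for this sketch, and the conversion uses the atomic density of diamond (about $1.76\times10^{23}$ cm$^{-3}$).

```python
import math

ATOMIC_DENSITY_CM3 = 1.76e23  # carbon atoms per cm^3 in diamond

def vacancy_ppm(vac_per_ion_per_nm, dose_cm2):
    """Convert a SRIM-style vacancy rate (vacancies/ion/nm) to ppm."""
    vac_per_cm3 = vac_per_ion_per_nm * 1e7 * dose_cm2  # 1 cm = 1e7 nm
    return vac_per_cm3 / ATOMIC_DENSITY_CM3 * 1e6

def layer_width_nm(depths_nm, vac_per_ion_per_nm, dose_cm2, threshold_ppm=50.0):
    """Deepest sampled depth at which the vacancy density is still >= threshold."""
    above = [d for d, v in zip(depths_nm, vac_per_ion_per_nm)
             if vacancy_ppm(v, dose_cm2) >= threshold_ppm]
    return max(above) if above else 0.0

# Hypothetical profile peaking at 4 nm (NOT a SRIM result)
depths = [0.5 * i for i in range(61)]  # 0 to 30 nm in 0.5 nm steps
profile = [3.0 * math.exp(-((d - 4.0) / 4.0) ** 2) for d in depths]
w_srim = layer_width_nm(depths, profile, dose_cm2=1e12)
print(f"w_SRIM ~ {w_srim:.1f} nm")
```

For a single-peaked profile that is above threshold at the surface, the deepest above-threshold depth coincides with the layer width as defined in the text.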
For the first set of implants we also controlled for the surface preparation. The as-purchased diamonds arrived with a polished surface finish (Ra < 5 nm), and an oxygen reactive ion etching (RIE) process can be used to remove polishing damage. For the O implants we performed RIE on only some of the substrates to see whether the process made a difference to NV yield or quantum properties. Following implantation, all samples were annealed in a vacuum furnace (pressure held below $10^{-5}$ hPa) using a ramp sequence that culminated with one hour at 800 °C (2 h ramp to 400 °C, 3 h at 400 °C, 3 h ramp to 800 °C, 1 h at 800 °C, 2 h ramp to room temperature). The one-hour plateau was chosen in an attempt to maximise NV yield while minimising vacancy diffusion into the diamond; in practice there is expected to be a trade-off between these two factors. The diamonds were then cleaned in a boiling mixture of sulphuric and nitric acid to achieve a standardised, oxygen-terminated surface.
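For reference, the anneal schedule can be written as a simple set-point table. This is a sketch under two assumptions not stated in the text: ramps are linear, and room temperature is taken as 20 °C.

```python
ROOM_TEMP_C = 20.0  # assumed; the text only says "room temperature"

# (duration in hours, target temperature in deg C) for each segment
SEGMENTS = [(2, 400.0), (3, 400.0), (3, 800.0), (1, 800.0), (2, ROOM_TEMP_C)]

def temperature_at(t_hours):
    """Set-point temperature at time t, interpolating linearly within ramps."""
    start_temp, elapsed = ROOM_TEMP_C, 0.0
    for duration, target in SEGMENTS:
        if t_hours <= elapsed + duration:
            frac = (t_hours - elapsed) / duration
            return start_temp + frac * (target - start_temp)
        start_temp, elapsed = target, elapsed + duration
    return ROOM_TEMP_C  # after the schedule has finished

total_hours = sum(duration for duration, _ in SEGMENTS)
print(total_hours, temperature_at(1.0), temperature_at(8.5))
```

The full sequence takes 11 hours, of which only one hour is spent at the maximum temperature.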

B. NV yield
To determine the NV yield in our samples, we used a confocal microscope to measure the photoluminescence (PL) count rate per unit area (filtering with a 660-735 nm band-pass filter) and translated this to an areal NV density $\sigma_{\mathrm{NV}}$ by dividing by the PL given by a single NV centre under the same excitation and collection conditions. We can then consider two yield metrics: the conversion of native nitrogen and of created vacancies to NV centres (dubbed N-to-NV and V-to-NV yields respectively). The N-to-NV yield is given by the ratio [NV]/[N], where [NV] = $\sigma_{\mathrm{NV}}/w_{\mathrm{SRIM}}$ and [N] is the native N density of a given growth sector.
[N] was deduced by measuring the Hahn echo $T_2$ and applying the relationship determined by Bauch et al. [31], where the $T_2$ was measured away from the influence of the surface where possible. We note that [N] could be overestimated if the nitrogen bath is not the dominant source of decoherence (most relevant for the less dense sectors) and that $\sigma_{\mathrm{NV}}$ could be overestimated due to background fluorescence or PL from the neutral NV charge state.
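The yield calculation amounts to a few unit conversions, sketched here. All input values are hypothetical and the function names are invented for illustration; the only physical constant used is the atomic density of diamond.

```python
ATOMIC_DENSITY_CM3 = 1.76e23  # carbon atoms per cm^3 in diamond

def areal_nv_density(pl_rate_per_um2, single_nv_rate):
    """NVs per um^2 from ensemble PL per unit area and single-NV PL (same units)."""
    return pl_rate_per_um2 / single_nv_rate

def n_to_nv_yield(sigma_nv_per_um2, w_srim_nm, n_ppm):
    """[NV]/[N], with [NV] = sigma_NV / w_SRIM converted to ppm."""
    sigma_per_cm2 = sigma_nv_per_um2 * 1e8            # 1 cm^2 = 1e8 um^2
    nv_per_cm3 = sigma_per_cm2 / (w_srim_nm * 1e-7)   # 1 nm = 1e-7 cm
    nv_ppm = nv_per_cm3 / ATOMIC_DENSITY_CM3 * 1e6
    return nv_ppm / n_ppm

# Hypothetical inputs: 2e7 counts/s/um^2 from the layer, 1e5 counts/s per NV
sigma = areal_nv_density(pl_rate_per_um2=2.0e7, single_nv_rate=1.0e5)
yield_n = n_to_nv_yield(sigma, w_srim_nm=10.0, n_ppm=10.0)
print(f"sigma_NV = {sigma:.0f} per um^2, N-to-NV yield = {yield_n:.2%}")
```

Because [NV] is obtained by dividing by $w_{\mathrm{SRIM}}$, any error in the assumed layer width propagates directly into the quoted yield.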
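As a rough illustration of this conversion, the sketch below assumes a purely nitrogen-limited coherence time of the form $T_2 \approx 160\ \mu\mathrm{s\,ppm}/[\mathrm{N}]$. The coefficient is quoted from memory of the result in Ref. [31] and should be checked against that work; the additional [N]-independent decoherence term is deliberately ignored, which is exactly the situation in which [N] is overestimated.

```python
# Assumed N-limited coefficient (us * ppm); verify against Ref. [31] before use.
N_LIMITED_T2_US_PPM = 160.0

def nitrogen_ppm_from_t2(t2_us):
    """Estimate [N] in ppm from a Hahn echo T2 in microseconds (N-limited only)."""
    return N_LIMITED_T2_US_PPM / t2_us

def n_limited_t2_us(n_ppm):
    """Inverse relation: N-limited T2 in microseconds for a given [N] in ppm."""
    return N_LIMITED_T2_US_PPM / n_ppm

for n_ppm in (8.0, 50.0):
    print(f"[N] = {n_ppm:.0f} ppm -> T2 ~ {n_limited_t2_us(n_ppm):.1f} us")
```

Under this assumption, sectors with 8 ppm and 50 ppm of nitrogen would have N-limited $T_2$ values of roughly 20 µs and 3.2 µs respectively.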
An example xz confocal scan is shown in Fig. 1(b). A well-defined NV layer is present at the diamond surface, although the resolution of the scan is not high enough to determine whether the layer's extent matches the vacancy distribution predicted by SRIM. In this image we can see two growth sectors containing different amounts of nitrogen: the right-hand sector (estimated nitrogen density [N] = 50 ppm, compared to 8 ppm for the left-hand sector) has significant background PL away from the surface, and the PL of the near-surface sensing layer also varies with the native nitrogen density. In both cases, however, the near-surface sensing layer PL greatly exceeds the background for a given sector, indicating locally increased NV conversion as expected.
Figure 1(c) shows the computed N-to-NV yields plotted against the inferred native N density of a given diamond sector, with the marker colouring indicating the implant energies. The highest yields are close to 2.5%; however, a majority of regions have yields of less than 1%, particularly for higher-N sectors and shallower implants. As we filter the PL for the negatively charged NV centre, these yields are not necessarily reflective of the total NV conversion but rather of conversion to the charge state useful for sensing and hyperpolarisation applications. The reduced yields compared to deeper implants [28] could therefore be band-bending-induced or due to reduced creation efficiency independent of charge state. The yields in these samples are comparable to typical N implants [15] but do not appear to offer an advantage in general.
Looking at the V-to-NV yield ("vacancy yield"), plotted in Fig. 1(d), may give a clue as to the origin of the poor conversion. Taking the vacancy creation predicted by SRIM for each implant, we find that around $10^{-3}$ NVs are created per vacancy produced in most cases, consistent with the modelling of Räcke et al. [24] for the case of the diamond surface acting as a vacancy sink. The spread in vacancy yield may be due to variable surface termination, motivating further study into maintaining high-quality surface termination during annealing so as to keep more vacancies within the diamond. No obvious trends were present within our data based on the two surface preparations carried out, however. The lacklustre vacancy conversion observed for the oxygen implants motivated additional implants to be carried out, using $5\times10^{13}$ cm$^{-2}$ $^{31}$P implants at energies designed to match the vacancy production profile of the oxygen implants. These phosphorus implants are expected to have produced an order of magnitude more vacancies; however, we find that the N-to-NV yield is not improved, meaning that the useful vacancy creation threshold of around 50 ppm identified in previous work [28] appears to be retained in this near-surface regime, despite the overall lower NV creation efficiency. The interpretation may be that in this high vacancy production regime the formation of multi-vacancy clusters is more prevalent; these either anneal out or add a source of spin noise [32], and therefore the number of vacancies available to form NVs is not much greater.
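The vacancy yield is simply the areal NV density divided by the areal vacancy production. A minimal sketch, with hypothetical numbers (the vacancies-per-ion figure would come from the SRIM output and is not given in the text):

```python
def nv_per_vacancy(sigma_nv_per_cm2, dose_cm2, vacancies_per_ion):
    """V-to-NV yield: NV centres formed per vacancy predicted by SRIM."""
    return sigma_nv_per_cm2 / (dose_cm2 * vacancies_per_ion)

# Hypothetical example: 2e10 NV/cm^2 from a 1e12 cm^-2 implant
# producing an assumed 20 vacancies per ion.
v_yield = nv_per_vacancy(sigma_nv_per_cm2=2e10, dose_cm2=1e12, vacancies_per_ion=20)
print(f"V-to-NV yield ~ {v_yield:.1e}")
```

With these illustrative inputs the yield is $10^{-3}$, the order of magnitude reported for most samples.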

C. NV depth
The mean depths of the ensembles created can be measured by taking NV nuclear magnetic resonance (NMR) measurements of a hydrogen target deposited on the diamond surface (in this case viscous immersion oil) [33]; see the example spectrum showing the appearance of the hydrogen ($^{1}$H) resonance in Fig. 2(a). For these measurements (and all to follow), we use a widefield microscope optimised for high-sensitivity NV ensemble measurements [12,32], except where background fluorescence was problematic (in which case the confocal system was used). A permanent magnet was used to set a magnetic field of 45 mT and was aligned with one set of NV axes.
All samples studied contained natural $^{13}$C abundance (1.1%), making accurate NV depth determination using XY8 sequences difficult due to the co-presence of a $^{13}$C harmonic with the fundamental $^{1}$H resonance [34].
Where possible, we use the XY16 sequence as it is less sensitive to the problematic fourth $^{13}$C harmonic [34]. Even XY16 retains some sensitivity to this harmonic, and so all depths quoted should be interpreted as lower bounds on the true mean depth of the ensembles. Correlation spectroscopy [35] verified that the $^{13}$C harmonic was a relatively minor component of the resonance fit for the shallowest implants (see the FFT in the Fig. 2(a) inset), but was more significant for some of the deeper implants.
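The near-degeneracy between the hydrogen resonance and the fourth $^{13}$C harmonic follows directly from the nuclear gyromagnetic ratios, as the short calculation below illustrates. The gyromagnetic ratios are standard values ($\gamma/2\pi \approx 42.577$ MHz/T for $^{1}$H and $\approx 10.708$ MHz/T for $^{13}$C); the relation $\tau = 1/(2f)$ for the resonant pulse spacing is the usual dynamical-decoupling condition and is an assumption about how the sequences were set up.

```python
GAMMA_1H_MHZ_PER_T = 42.577   # proton gyromagnetic ratio / 2*pi
GAMMA_13C_MHZ_PER_T = 10.708  # carbon-13 gyromagnetic ratio / 2*pi

def larmor_mhz(gamma_mhz_per_t, field_t):
    """Nuclear Larmor frequency in MHz at a given field in tesla."""
    return gamma_mhz_per_t * field_t

field_t = 0.045  # 45 mT, as used in the measurements
f_h = larmor_mhz(GAMMA_1H_MHZ_PER_T, field_t)
f_c4 = 4 * larmor_mhz(GAMMA_13C_MHZ_PER_T, field_t)
tau_ns = 1e3 / (2 * f_h)  # resonant pulse spacing, tau = 1/(2 f), in ns

print(f"1H Larmor frequency:   {f_h:.3f} MHz")
print(f"4th 13C harmonic:      {f_c4:.3f} MHz")
print(f"relative separation:   {abs(f_c4 / f_h - 1):.2%}")
print(f"resonant pulse spacing: {tau_ns:.0f} ns")
```

At 45 mT the two features sit within about 1% of each other near 1.92 MHz, which is why the $^{13}$C harmonic can contaminate the fit to the hydrogen resonance.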
In almost all cases, it was possible to detect a hydrogen signal from immersion oil placed onto the diamond surface using the created layers; however, for two of the 16 diamonds implanted no signal could be detected, indicating that a shallow layer had not been successfully created. This failure could be due to vacancy diffusion into the diamond, as increased PL was still observed. It is also possible that, in these diamonds, the yield enhancement from the implantation process was too poor for the hydrogen signal detected by the shallowest NVs to rise above the noise/background given by deeper NVs. Nevertheless, the fact that the majority of samples are able to detect a strong hydrogen signal indicates that the sensing layers are confined close to the surface. From this observation we infer that vacancy diffusion into the diamond under the chosen annealing conditions is not a major factor: substitutional nitrogen is an efficient enough vacancy attractor to dramatically reduce the vacancy diffusion length during annealing, which is important for the success of implantation into type Ib diamond as a method for creating shallow NV layers. Hydrogen signals were detected over the full range of nitrogen densities probed.
The results are summarised in Fig. 2(b), with mean ensemble depths ranging from 7 to 11 nm. The depths quoted are given by a 64-pulse sequence in each case, which we take to be a measure of the peak of the NV$^-$ depth distribution [15]. The error bars represent the larger of the uncertainty from the fit and the spread in depth given by measurements with different numbers of pulses (ranging from 48 to 128). Errors due to the co-presence of the $^{13}$C harmonic resonance (particularly for deeper implants) and contributions from bulk NV fluorescence (for high-N sectors) are not accounted for; these would cause underestimation and overestimation of the actual depth respectively. The shallowest implants (2.5 keV $^{16}$O and 4 keV $^{31}$P, represented by the burgundy points), with peak vacancy production predicted below 3 nm from the diamond surface, were measured to have depths between 6.5 and 8 nm, consistent with high-quality N implants of a similar energy [15,33]. This result illustrates two things. Firstly, vacancy diffusion into the diamond is much less than the order of 100 nm observed in the bulk [29,30], which would result in a much deeper mean ensemble depth and preclude detection of the hydrogen signal. Instead, these depths are consistent with the distribution predicted by the SRIM simulation, up to a cut-off introduced by band bending (the same interpretation as for N implants [15]). Secondly, however, the fact that these ensembles are at best only as shallow as N-implanted ensembles (i.e. not shallower) suggests that the high bulk N density does not significantly alter the band bending.
The 4 keV $^{16}$O and 9 keV $^{31}$P implants (vacancy distribution peaking at 4 nm; green points) have deeper depth distributions, with most fit depths ranging from 8 to 9 nm. The deepest set of implants, 6 keV $^{16}$O and 11 keV $^{31}$P (lavender points), had depths measured to be similar to the 4 keV implants, between 8 and 11 nm. As the deeper implants resulted in higher yields on average, these depths may still be in a useful regime, and in practice both parameters should be considered alongside one another in determining which implant is appropriate for a particular application.
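The error-bar rule described above can be sketched as follows. The fit values are hypothetical, and "spread" is taken here to mean half the range of fitted depths across pulse numbers, which is an assumption since the text does not define it precisely.

```python
def depth_with_error(fit_depths_nm, fit_errors_nm, quoted_pulses=64):
    """Return (depth, error) for the quoted sequence length.

    fit_depths_nm and fit_errors_nm map pulse number -> fitted depth and
    standard error. The error is the larger of the fit error at the quoted
    pulse number and the spread across pulse numbers (assumed: half-range).
    """
    depths = list(fit_depths_nm.values())
    spread = (max(depths) - min(depths)) / 2
    return fit_depths_nm[quoted_pulses], max(fit_errors_nm[quoted_pulses], spread)

# Hypothetical fit results for one sample (nm)
fit_depths = {48: 7.1, 64: 7.4, 96: 7.8, 128: 7.9}
fit_errors = {48: 0.3, 64: 0.2, 96: 0.2, 128: 0.3}
depth, err = depth_with_error(fit_depths, fit_errors)
print(f"d_NV = {depth:.1f} +/- {err:.1f} nm")
```

In this example the spread across pulse numbers (0.4 nm) exceeds the fit error (0.2 nm) and therefore sets the error bar.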

D. Ensemble sensitivity
Although the suitability of a sample for a given application will ultimately depend heavily on the precise nature of the measurement to take place, we can consider some general figures of merit to gauge the success of the approach. Since most applications of shallow NV ensembles will be concerned with AC signals whose detection can in principle be enhanced through dynamical decoupling, we first measure the Hahn echo $T_2$ of the shallow ensembles, summarised in Fig. 3(a). We see some evidence for surface-induced decoherence in the shallower implants, with the low-N sector $T_2$ values being longer for deeper implants. At the highest N densities, the various samples are more tightly grouped, implying a $T_2$ close to the N-limited value. The N-limited $T_2$ curve determined by Bauch et al. [31] is included as the black line in Fig. 3(a) to highlight the apparent impact of the surface; however, we again stress that the determination of sector [N] may be imperfect and that we assume all sectors in a "group" have the same N density.
To gauge the overall sensing performance of an NV ensemble (crucially also taking into account the fluorescence of the ensemble, scaling with [NV]), a common figure of merit is the photon shot-noise-limited magnetic sensitivity, which for AC fields depends on $T_2$ [36-38]. As we are concerned here with the detection of rapidly decaying signals scaling as the inverse cube of the distance between NV and target (e.g. a magnetic noise $B_{\mathrm{RMS}}^2 \propto d_{\mathrm{NV}}^{-3}$ [33]) and our ensembles feature different mean depths, we consider instead the minimal figure of merit $T_2\, d_{\mathrm{NV}}^{-3} \sqrt{\alpha R}$, which is proportional to the signal-to-noise ratio of a measurement for a given acquisition time. Here $R \propto [\mathrm{NV}]$ is the photon count rate under continuous laser illumination and $\alpha$ is the laser duty cycle for a measurement of the optimal duration $T_2$, both setup-dependent quantities (in this case we use a widefield microscope optimised to measure NV ensembles as a benchmark, as in Ref. [28]). We plot this quantity versus [N] in Fig. 3(b), finding that the spread is partly within error but with an overall tendency for lower-N sectors to perform better. The good performance of low-N sectors, buoyed by their longer $T_2$, is partly a consequence of considering a widefield measurement requiring a long laser pulse duration (5 µs here), in contrast to confocal microscopy, which will have $\alpha \approx 1/T_2$ [28]; however, it also reflects the low yields obtained in higher-N sectors and confirms that high bulk N concentration does not appear to aid near-surface NV properties by compensating for electron traps at the surface. The shallowest implants perform the best on average despite being the most affected by imperfections in the surface preparation, which further motivates the pursuit of shallower, stable NV ensembles. We note also that the motivating application of NV-based hyperpolarisation does not rely on shot-noise-limited readout, so the relevant figure of merit scales with [NV] rather than $\sqrt{[\mathrm{NV}]}$ [15] and hence favours the use of denser ensembles.
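The figure of merit $T_2\, d_{\mathrm{NV}}^{-3} \sqrt{\alpha R}$ can be evaluated with a few lines of code. This sketch assumes a specific form for the duty cycle, $\alpha = t_{\mathrm{laser}}/(t_{\mathrm{laser}} + T_2)$, which is not stated explicitly in the text but reduces to $\alpha \propto 1/T_2$ for short laser pulses; the sample parameters are hypothetical and the count rate is in arbitrary relative units.

```python
import math

def duty_cycle(t_laser_us, t2_us):
    """Assumed form: laser-on fraction of a cycle of length T2 + laser pulse."""
    return t_laser_us / (t_laser_us + t2_us)

def figure_of_merit(t2_us, depth_nm, count_rate, t_laser_us=5.0):
    """T2 * d^-3 * sqrt(alpha * R), in arbitrary units."""
    alpha = duty_cycle(t_laser_us, t2_us)
    return t2_us * depth_nm ** -3 * math.sqrt(alpha * count_rate)

# Hypothetical sectors at the same depth: low-N (long T2, dim) vs high-N (short T2, bright)
fom_low_n = figure_of_merit(t2_us=20.0, depth_nm=8.0, count_rate=1.0)
fom_high_n = figure_of_merit(t2_us=3.2, depth_nm=8.0, count_rate=5.0)
print(f"low-N: {fom_low_n:.4f}, high-N: {fom_high_n:.4f}")
```

With these illustrative numbers the low-N sector comes out ahead even though it is five times dimmer, because the figure of merit is linear in $T_2$ but grows only as the square root of the count rate. The $d_{\mathrm{NV}}^{-3}$ factor also means that halving the depth improves the figure of merit eightfold.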

III. DISCUSSION
The main limitations of N-implantation for the creation of thick sensing layers are the vacancy overproduction (e.g. peak vacancy production for a 100 keV N implant exceeds the number of implanted ions by a factor of up to 100 [28]) compared to lower-dose implants into (for example) N-rich HPHT diamond, and the inability to create layers of arbitrary thickness with a single implantation stage. In the shallow regime, neither of these issues is relevant, as the nitrogen depth profiles optimal for the applications discussed are easily attainable with N implantation and the vacancy yield is generally low. Indeed, the localisation of vacancy production to the implanted ions may be beneficial for the purpose of curbing diffusion to the surface by converting vacancies to NV centres most efficiently, although the formation of multi-vacancy clusters may still be problematic [32]. In view of the above, it would appear that N-implantation is the most suitable technique for creating near-surface NV ensembles. Beginning with a high-quality CVD diamond also carries the benefit of allowing the use of refined doping of the crystal so as to promote NV formation through mechanisms such as vacancy charging as well as Fermi level control [39], although these techniques have yet to be convincingly applied in the high-nitrogen, near-surface regime.
Nevertheless, in this work we have demonstrated that well-confined sensing layers can be produced within 15 nm of the diamond surface via implantation of HPHT diamond. This result shows that vacancy diffusion into the diamond bulk is not the limiting factor for N-to-NV yield near the diamond surface. The low yields may then instead be understood as resulting from vacancy diffusion to the nearby surface boundary, which acts as a sink. If the surface can be engineered to be vacancy-reflecting, in line with the simulations of Räcke et al. [24], N-to-NV yields approaching the bulk values of near 10% may be achievable.
We note that this surface needs to be maintained throughout the annealing process, with maximum temperatures typically ranging from 800 to 1100 °C [32]. These temperatures overlap the removal temperatures for common termination species, with oxygen being removed above 600 °C [40] and hydrogen above 900 °C [41]. We chose a maximum temperature of 800 °C to mitigate these effects; however, even at this temperature and under high vacuum conditions of ∼$1\times10^{-6}$ hPa, small amounts of oxygen present in the chamber could disrupt the surface termination. Annealing at higher temperatures comes with the benefit of improving spin properties [32]; however, the surface termination will be even less well controlled and we can expect vacancy diffusion into the diamond to be more significant in this regime, motivating further studies in this area.
The depths and yields measured in this work are broadly similar to typical values measured for shallow ensembles created by N-implantation, indicating that implantation of type Ib diamond could be a cost-effective method of creating shallow NV layers. However, the high bulk nitrogen density present in some sectors does not appear to significantly combat surface-induced band bending, meaning that the method does not provide any advantages over N-implantation in this regime. N-implantation of electronic-grade diamond is naturally well suited to creating well-defined shallow NV layers and will not suffer from vacancy diffusion into the diamond, even though our results suggest this is not a major concern regardless.

IV. CONCLUSION
This work has shown that it is possible to create dense, well-confined, shallow NV ensembles via the ion implantation of type Ib HPHT diamond, with yields in the range of those typically achieved using N-implantation. Although we did not find strong evidence for high bulk nitrogen density improving near-surface NV$^-$ charge stability, these results do show that economical production of shallow ensembles is possible using this method. Along with near-surface band bending, vacancy diffusion to the surface is likely limiting the yield by reducing NV formation efficiency, and we speculate that the large spread in measured yields is due to variable surface termination in the diamond samples. A simple oxygen RIE process prior to implantation was not found to dramatically change results by itself, and so focussing on achieving high-quality surface termination during annealing is a logical next step.
The annealing processes conducted here are not expected to be optimal, and the relatively unknown role they have played motivates more systematic studies that could allow improved near-surface ensemble properties. For instance, the use of a higher-temperature anneal has previously been shown to improve the spin properties of shallow ensembles [32], and the N-to-NV yield could be improved through greater control over the diamond surface termination. Annealing during the implantation process is also an appealing option that has been shown to improve NV yields in the bulk [42]. Charging vacancies during annealing by introducing shallow electron donors to the diamond crystal may also improve the vacancy yield by limiting the formation of multi-vacancy clusters and perhaps out-diffusion to the surface on electrostatic grounds [39]. The areas for improvement identified in this work will hopefully make the creation of shallow ensembles with yields approaching bulk values feasible through ion implantation of both electronic-grade and type Ib diamond in the future.

FIG. 1. Creating shallow NV layers in type Ib diamond. (a) SRIM simulations of the oxygen implants conducted, taking a 7° angle of incidence. Phosphorus implants were also conducted at energies chosen to approximately match the expected vacancy depth profile, but creating an order of magnitude more vacancies. (b) Confocal xz scan of one diamond sample, showing an NV layer localised at the surface. Two sectors are visible, with the left- and right-hand regions' nitrogen content estimated at 8 and 50 ppm respectively. (c) NV yield (N-to-NV) estimated as described in the text. Colouring shown in the legend matches the implants depicted in (a). (d) Plot of NVs created per vacancy, taking the NV yield as in (c) and comparing against the total vacancy production predicted by SRIM, again plotted versus nitrogen concentration.

FIG. 2. NV ensemble depth measurements. (a) Example spin decoherence data obtained with an XY16-64 sequence (black dots) and fit (blue line) showing the hydrogen resonance. Inset: FFT of a correlation spectroscopy signal taken on-resonance, showing that the hydrogen signal is dominant over the $^{13}$C harmonic. (b) Plot of mean ensemble NV depth $d_{\mathrm{NV}}$ versus nitrogen concentration, using the same colour coding as in Fig. 1. Depths quoted are measured using XY16-64 sequences and the error bars denote either the standard error from the fits or the spread in fit depths given by sequences ranging from 48 to 128 pulses, whichever is larger for a given data point. Note that most but not all samples studied were able to detect a hydrogen signal, and those that could not are not included on this plot.

FIG. 3. Assessing NV ensemble quality. (a) Plot of Hahn echo $T_2$ values for the shallow ensembles measured versus the nitrogen content of the sectors. Error bars are the standard errors from the fits to measured decay curves. The solid black line gives the N-limited $T_2$ value given by the equation of Bauch et al. [31]. (b) Plot of the figure of merit $T_2\, d_{\mathrm{NV}}^{-3} \sqrt{\alpha R}$ (see text) versus nitrogen content. Error bars are dominantly given by the uncertainty in $T_2$ and $d_{\mathrm{NV}}$.