Introduction

Although there were a few predecessors [1, 2], the invention of laser annealing [3] and flash lamp annealing (FLA) [4] is usually dated to 1975 and 1978, respectively. Both techniques originated in semiconductor research, and their further development was driven by the needs of microelectronics. With the ongoing miniaturization of microelectronic devices, the tolerable diffusion lengths became shorter and shorter, which required a corresponding reduction of annealing times [5]. Since the 1980s, rapid thermal processing (RTP) has been established as the dominant annealing technique in microelectronics, with ca. 0.5 s as the shortest annealing time in the case of spike annealing [6], but around 2000 FLA was gradually introduced for cases in which even this limit was no longer sufficient. Along with this development, the technology of FLA has advanced, and there is now much better control over the annealing process and a much better adaptation to industrial processes. Thus, FLA has moved into new areas of application such as photovoltaics [7], flexible electronics [8], printed electronics [9], and energy materials [10, 11].

Today, FLA is a modern annealing technique that offers several advantages. Similar to ion implantation, it is a thermal non-equilibrium process, which is why FLA can achieve material properties that cannot be achieved in thermal equilibrium. Examples are hyperdoping (doping above the solubility limit) and the synthesis of metastable phases. Metastable in this context means that the material is stable at room temperature (RT) and under operating conditions but is vulnerable to further annealing steps. Because of the short time scale, the range of temperature-sensitive materials that can nevertheless be exposed to high temperatures is larger for FLA than for RTP and furnace annealing. Finally, FLA offers energy and process time savings, which makes this technique suitable for roll-to-roll and conveyor belt applications.

However, these advantages come at a price in the form of additional issues to be considered and challenges to be met. The temperature is much more difficult to estimate because the temperature profile within a sample depends on the material properties, thermal stress has to be managed, and additional measures have to be taken to ensure high reproducibility and homogeneity. Thus, the first part of this review gives a short overview of FLA technology with respect to these issues. The second part discusses some specific semiconductor applications in which FLA was used to achieve particular material properties.

The technology of flash lamp annealing

In general, an FLA system consists of an energy storage system and a flash chamber. The energy storage system comprises a capacitance (made of one or several capacitors) to store the energy, an inductance to form the pulse, a charging device, and the required electronics including high-power switches for control. The flash chamber is mainly composed of one or several flash lamps, a reflector to harvest a maximum of light, and protection windows made of quartz to prevent the deposition of material onto the flash lamps and other harmful processes. Many tools are also equipped with a preheating system in order to minimize thermal stress or to achieve even higher temperatures. The preheating system itself can be a bank of flash lamps for longer pulses [12], a bank of halogen lamps similar to RTP, or a hot plate. The basic circuitry of an FLA system is given in Fig. 1a, and more details about the setup can be found in [13,14,15,16].

Fig. 1

Basic scheme of an FLA circuit comprising a capacitance C, an inductance L, the flash lamp F, and a switch S (a), and the working scheme (b)

Figure 1b displays the working scheme of an FLA tool, which roughly follows the flow of energy through the system and the sample. The charging voltage V0 and the capacitance C determine the stored energy, and a switch S usually separates the charging device. In the case of active pulse shaping, i.e., for direct control of the lamp current, additional high-power, controllable switches such as gate turn-off thyristors, metal-oxide-semiconductor field-effect transistors (MOSFETs), and insulated-gate bipolar transistors can be used [17,18,19,20]. When the capacitors are discharged, the circuit of L, C, and F, called the pulse forming network, acts as a strongly damped oscillating circuit that shapes the current pulse flowing through the flash lamp. Depending on the flash lamp characteristics, the current pulse is transformed into a light pulse of similar shape.
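The role of V0, C, and L can be illustrated with a few lines of Python. The sketch below only evaluates textbook relations for an ideal LC circuit: the stored energy ½CV0², the characteristic time √(LC) with which the pulse length scales, and the characteristic impedance √(L/C). The component values are hypothetical and are not taken from a specific FLA tool, and the nonlinear behavior of the flash lamp is not modeled.

```python
import math


def stored_energy_j(capacitance_f, charging_voltage_v):
    """Energy stored in the capacitor bank, E = 1/2 * C * V0^2."""
    return 0.5 * capacitance_f * charging_voltage_v ** 2


def lc_time_constant_s(inductance_h, capacitance_f):
    """Characteristic time sqrt(L*C) of the pulse forming network."""
    return math.sqrt(inductance_h * capacitance_f)


def characteristic_impedance_ohm(inductance_h, capacitance_f):
    """Characteristic impedance sqrt(L/C) of the pulse forming network."""
    return math.sqrt(inductance_h / capacitance_f)


# Hypothetical component values, chosen only for illustration.
C_FARAD = 1.0e-3   # capacitance
L_HENRY = 1.0e-3   # inductance
V0_VOLT = 2.0e3    # charging voltage

energy = stored_energy_j(C_FARAD, V0_VOLT)
tau = lc_time_constant_s(L_HENRY, C_FARAD)
z0 = characteristic_impedance_ohm(L_HENRY, C_FARAD)

print(f"stored energy: {energy:.0f} J")
print(f"sqrt(LC):      {tau * 1e3:.2f} ms")
print(f"sqrt(L/C):     {z0:.2f} ohm")
```

With these assumed values, the characteristic time lies in the ms range, which is the time scale of the flash pulses discussed here. Increasing L or C lengthens the pulse, whereas the stored energy depends only on C and V0.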

Figure 2a compares the voltage (black), the current (blue), the electrical power (green), and the light pulse (red) for a 3 ms flash pulse. The current and the light intensity were measured with a Rogowski coil and a photodiode, respectively, whereas the voltage was measured with a 1:1000 oscilloscope probe. In this specific example, one relative unit corresponds to 1 kV, 0.5 kA, and 1.2 MW for the voltage, the current, and the electrical power, respectively. First, the pulses are asymmetric, with a shorter rising and a longer falling edge. For a small inductance and a large capacitance, the rising edge becomes steeper and the pulse length somewhat shorter. A steep rising edge, however, is disadvantageous for the lamp lifetime, which is why L and C should be chosen in such a way that the pulse becomes more symmetric. Second, the light pulse is somewhat shorter than the current pulse but follows the profile of the electrical power, which indicates a constant conversion efficiency. Only at the end of the falling edge does the light intensity fade out faster than the electrical power.

Fig. 2

a Time profile of the voltage (black), the current (blue), the electrical power (green), and the light intensity (red) for a 3 ms flash pulse. b Time-integrated light spectrum of a Kr and a Xe lamp with a filling pressure of 300 Torr during a 3 ms flash performed with an applied voltage of 2 kV. After [15, 21]. More details are given in the text

The optical spectrum of the light pulse extends from the ultraviolet (UV) to the near-infrared (Fig. 2b) and is composed of the broad thermal emission of the hot plasma and several discrete spectral lines originating from bound-to-bound emissions of the noble gas. As seen in Fig. 2b, a flash lamp filled with Kr gives a somewhat higher output in the UV (3–5 eV), whereas the Xe lamp has a higher output in the visible spectral region. In fact, lower atomic masses of the filling noble gas lead to higher plasma temperatures and, thus, to a blue shift of the thermal radiation, but at the expense of power conversion [13]. Furthermore, an increase in the applied voltage also leads to higher plasma temperatures and a corresponding blue shift of the spectrum. The enhancement of the UV part is important if FLA is applied to materials that are nearly or fully transparent in the visible, such as TiO2 or transparent conducting oxides. Nevertheless, most flash lamps are filled with Xe because of its high conversion efficiency.

Finally, the design of the flash chamber and the reflector determines the light energy density ED, typically given in Jcm−2, which is delivered to the surface of the sample. From that point on, the energy input into the sample and the temperature profile within it strongly depend on the sample properties. Many opaque samples consist of a thick substrate with thin films on top. In a rough approximation, the thin films mainly determine the reflectivity and, thus, the amount of energy that is absorbed, whereas the thickness and thermal conductivity of the substrate govern the distribution of heat within the sample and, thus, the temperature profile. Figure 3 displays a typical temperature profile on the front and the backside of a sample as simulated by the COMSOL Multiphysics® software for a 3 ms flash with ED = 100 Jcm−2 applied to a 525-µm-thick Si wafer. The temperature at the surface starts at T0 (RT or the preheating temperature), strongly increases to a maximum Tmax, and decays to a quasi-equilibrium temperature Tequ. In contrast, the temperature on the backside increases monotonically to Tequ. During the flash, the surface temperature rises as long as the energy input by absorption exceeds the outflow of energy from the surface region via heat conduction, which is why the rise time of the temperature peak and, thus, the effective annealing time can be longer than the pulse length. The temperature dependencies are discussed in detail in [13, 14]. In summary, an increase in sample thickness will usually lead to a decrease in Tmax and Tequ, whereas an increase in thermal conductivity causes a decrease in Tmax and a flattening of the temperature profile.
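The dependence of Tequ on the sample thickness can be made plausible with a simple energy balance: once the heat has spread through the wafer, the absorbed energy per unit area is shared by the whole thickness. The following Python sketch illustrates this estimate and is not a substitute for the COMSOL simulation of Fig. 3. It assumes a temperature-independent heat capacity and no heat losses, uses approximate room-temperature values for Si, and takes a hypothetical absorbed energy density of 40 Jcm−2.

```python
def equilibrium_temperature_rise_k(e_abs_j_per_cm2, thickness_um,
                                   density_kg_m3, heat_capacity_j_kg_k):
    """Temperature rise after the absorbed energy has spread evenly
    through the wafer: dT = E_abs / (rho * c_p * d).

    Assumes constant material properties and no heat losses."""
    e_abs_j_per_m2 = e_abs_j_per_cm2 * 1.0e4   # J/cm^2 -> J/m^2
    thickness_m = thickness_um * 1.0e-6        # um -> m
    return e_abs_j_per_m2 / (density_kg_m3 * heat_capacity_j_kg_k * thickness_m)


# Approximate room-temperature values for Si (assumption; c_p rises with T).
SI_DENSITY_KG_M3 = 2329.0
SI_HEAT_CAPACITY_J_KG_K = 700.0

# Hypothetical absorbed energy density, not a value from the simulation.
E_ABS_J_PER_CM2 = 40.0

rise_525 = equilibrium_temperature_rise_k(
    E_ABS_J_PER_CM2, 525.0, SI_DENSITY_KG_M3, SI_HEAT_CAPACITY_J_KG_K)
rise_1050 = equilibrium_temperature_rise_k(
    E_ABS_J_PER_CM2, 1050.0, SI_DENSITY_KG_M3, SI_HEAT_CAPACITY_J_KG_K)

print(f"525 um wafer:  Tequ - T0 = {rise_525:.0f} K")
print(f"1050 um wafer: Tequ - T0 = {rise_1050:.0f} K")
```

Within this simplified picture, doubling the thickness halves the rise of Tequ above T0, which is consistent with the trend stated above.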

Fig. 3

Simulated temperature profile of a 525 µm thick Si wafer when a 3 ms flash pulse with ED = 100 Jcm−2 is applied

Finally, there is the question of whether the temperature must be known and, if so, how to determine it. From an application point of view, knowledge of the temperature is not required as long as the FLA process can be optimized with respect to certain material properties, e.g., to achieve a certain electrical conductivity or a certain degree of dopant activation without diffusion. However, if the temperature is of interest, it can be either measured by pyrometry or determined by a combination of calibration and simulation [22]. Temperature measurement during FLA by pyrometry is possible but has to meet several challenges, namely the fast time scale, the change of the emissivity during FLA, and the fact that the flash light outshines the thermal emission by far. The latter problem can be solved by filtering out the flash light in the main water absorption band, usually at 2.73 μm, and placing the diagnostic wavelength of the pyrometer there. To do so, the flash light has to pass a quartz plate enriched with OH groups or a water film, which concurrently cools the lamp. This concept has been realized in several tools [19, 20, 23] but remains a complex issue. Further details about temperature measurements during FLA can be found in [24].

Alternatively, the temperature can be estimated by a combination of calibration and simulation. In order to perform a simulation in the case of thin films on a substrate, the temperature-dependent heat capacity and thermal conductivity of the substrate as well as the absorbed energy density Eabs have to be known. The latter can be estimated by

$$E_{\mathrm{abs}}=\left(1-R_{\mathrm{S}}-T_{\mathrm{S}}\right)\Gamma E_{\mathrm{D}}, \tag{1}$$

where RS is the effective reflectance of the sample, TS is the effective transmittance of the sample, and Γ is a correction factor slightly higher than 1 that accounts for the contribution of multiple wall reflections. RS and TS can be determined by measuring the reflectance and the transmittance as a function of photon energy and weighting them with the flash lamp spectrum. If the latter is not known, it can be approximated by a standard spectrum as given in [25]. Γ can be roughly estimated by a melting point comparison. In this case, a thin piece of Si or Ge (or another suitable material) is flashed with increasing intensity up to the point where the first melt seeds appear at the surface. Γ is then deduced from the difference between the two values of ED at which the melting temperature is reached in the experiment and in the simulation. Further details about this type of calibration can be found in [22].
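A minimal Python sketch of this procedure is given below. It assumes that "effective" means a spectrum-weighted average, i.e., the measured reflectance (or transmittance) is multiplied by the lamp spectrum, integrated over photon energy, and normalized by the integrated spectrum; Eq. (1) is then evaluated directly. All numerical data, including the value of Γ, are hypothetical placeholders and not measured values.

```python
def trapezoid(x, y):
    """Trapezoidal integration of samples y over grid x."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i])
               for i in range(len(x) - 1))


def spectrum_weighted(values, spectrum, energies):
    """Average of `values` weighted by the lamp spectrum."""
    weighted = [v * s for v, s in zip(values, spectrum)]
    return trapezoid(energies, weighted) / trapezoid(energies, spectrum)


def absorbed_energy_density(e_d, r_s, t_s, gamma):
    """Eq. (1): E_abs = (1 - R_S - T_S) * Gamma * E_D."""
    return (1.0 - r_s - t_s) * gamma * e_d


# Hypothetical data on a coarse photon-energy grid (eV).
energies = [1.0, 2.0, 3.0, 4.0]
lamp_spectrum = [0.5, 1.0, 0.8, 0.3]       # relative spectral intensity
reflectance = [0.30, 0.35, 0.45, 0.60]     # measured R(E), placeholder
transmittance = [0.0, 0.0, 0.0, 0.0]       # opaque sample, placeholder

r_s = spectrum_weighted(reflectance, lamp_spectrum, energies)
t_s = spectrum_weighted(transmittance, lamp_spectrum, energies)
gamma = 1.1                                # assumed correction factor
e_abs = absorbed_energy_density(100.0, r_s, t_s, gamma)

print(f"R_S = {r_s:.3f}, T_S = {t_s:.3f}, E_abs = {e_abs:.1f} J/cm^2")
```

In practice, the grid would be as fine as the measured spectra allow, and Γ would come from the melting point comparison rather than from an assumed value.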

Semiconductor applications

The development of FLA is intrinsically linked with the advancements in microelectronics, and for many years, semiconductors were the only materials that were seriously treated with FLA. Although today there are more and more applications beyond semiconductors, the latter are still the main field of application. There is a long list of physical processes that can be promoted or suppressed. Thus, FLA is used to anneal defects, to activate dopants, to prevent dopant deactivation if further annealing steps are applied, to minimize diffusion, to suppress segregation, to crystallize amorphous materials, to trigger the formation of specific phases such as silicides and germanides, to avoid thermal decomposition, etc. The following snapshots are examples that may not be the most prominent applications but illustrate the wide range of semiconductor applications in which FLA is utilized.

Hyperdoping of Si

Hyperdoping is a semiconductor application where non-equilibrium processing like FLA is needed. As seen in Fig. 4a, conventional doping covers the range of dopant concentrations below the corresponding equilibrium solid solubility limit, whereas hyperdoping describes dopant concentrations significantly above this limit. Most shallow dopants like Ga or P in Ge or B and P in Si have solubility limits in the 10¹⁹–10²⁰ cm−3 range, and the term hyperdoping is used for concentrations at least one order of magnitude higher [26, 27]. In the case of deep dopants, this limit is often much lower, and hyperdoping then implies exceeding it by several orders of magnitude. To realize hyperdoping, strong non-equilibrium processing is needed, which can be performed either in situ during the growth of semiconductors [26] or ex situ by using ion implantation and thermal annealing. Ion implantation itself is a strong non-equilibrium process that allows the incorporation of dopants with concentrations exceeding the solubility limit by far. In most cases, the implanted layer is amorphized, and the post-implantation annealing step has to recrystallize the semiconductor material, to anneal out defects, to activate dopants, to suppress diffusion, and to avoid segregation of the dopants. Avoiding segregation in particular requires non-equilibrium annealing such as FLA or pulsed laser annealing (PLA) if the dopant concentration is above the solubility limit.

Fig. 4

Basic scheme of the different concentration ranges for doping (a), and the model for impurity band formation in the case of Si:Se (b)

These two annealing techniques have been compared in the case of Si hyperdoped with Se [28], which is a deep donor creating an impurity level ca. 0.34 eV below the conduction band (CB) of Si. In this work, Se was implanted into Si, forming a 100–150 nm thick amorphous layer with a maximum Se concentration of 2.3 at.%. It was found that the implanted layer is well crystallized after annealing, but that recrystallization takes place from the liquid phase for PLA and from the solid phase for FLA. As a consequence, FLA preserves the Gaussian-like implantation profile, whereas PLA leads to a homogeneous, box-like profile of Se within the implanted layer. According to Rutherford backscattering spectrometry measurements, the substitutional fraction of Se is approximately 0.4 and 0.6 for PLA and FLA, respectively [29]. Both methods achieve RT carrier concentrations in the 10¹⁹ cm−3 range, with somewhat better values for PLA. However, in the case of FLA, the carrier mobility was 2–3 times higher than in the case of PLA.

With increasing Se concentration, the impurity level broadens into an impurity band and finally merges with the CB of Si, enabling a significant sub-bandgap photoresponse. This property was utilized to fabricate an RT short-wavelength infrared Si photodetector whose design is given in Fig. 5a [29]. The devices were fabricated by Se implantation into p-type Si, followed by FLA, which resulted in the formation of p-n junctions with Se concentrations between 1.1 and 3.1 at.% on the n-type side. Under these conditions, the impurity band has not yet merged with the Si CB, but the gap has decreased to values between 30 and 38 meV. The spectral photoresponse of these devices at a reverse bias of 1 V, using light from a 5 W tungsten halogen lamp that is filtered by a monochromator, is displayed in Fig. 5b. For short wavelengths, electrons are excited under irradiation from the valence band (VB) to the CB, as known from standard Si detectors. If the energy of the photons falls below that of the Si bandgap, the photoresponse decreases and reaches a minimum when the photon energy is roughly equal to the energetic distance between the VB and the gap between the impurity band and the CB. For even lower photon energies, electrons are excited from the VB to the impurity band, followed by thermal activation into the CB of Si, which manifests itself in an increase of the photoresponse for longer wavelengths in Fig. 5b. Finally, the lower limit of absorption is reached for photons with an energy smaller than the distance between the VB and the lower edge of the impurity band. Although not optimized, the devices show a responsivity of ca. 73 µAW−1 and an external quantum efficiency of 6 × 10⁻⁵ at a wavelength of 1550 nm, which is comparable to values reported for Au-supersaturated Si photodiodes [30]. More details can be found in [29].
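The quoted responsivity and external quantum efficiency (EQE) are linked by the standard relation EQE = R·hc/(eλ), i.e., the responsivity in AW−1 multiplied by the photon energy in eV. The short Python check below is only a consistency calculation based on the two numbers given above; the function name is chosen for illustration.

```python
# h*c/e expressed in eV*nm (from the exact SI values of h, c, and e).
HC_OVER_E_EV_NM = 1239.84198


def external_quantum_efficiency(responsivity_a_per_w, wavelength_nm):
    """EQE = R * h*c / (e * lambda): collected electrons per incident photon."""
    photon_energy_ev = HC_OVER_E_EV_NM / wavelength_nm
    return responsivity_a_per_w * photon_energy_ev


# Values quoted in the text: 73 uA/W at a wavelength of 1550 nm.
eqe = external_quantum_efficiency(73e-6, 1550.0)
print(f"EQE = {eqe:.2e}")
```

The result rounds to 6 × 10⁻⁵, in agreement with the value stated for the devices.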

Fig. 5

Scheme of the Se-hyperdoped Si p–n junction (a), and corresponding normalized photocurrents at a reverse bias of 1 V (b). Reproduced from [29], which is licensed under a Creative Commons license (CC BY 4.0)

Crystallization of Ge and NiGe formation

Ge is compatible with Si and can be quite easily integrated into complementary metal–oxide–semiconductor (CMOS) fabrication processes. Ge offers a much higher hole mobility than Si and features promising optoelectronic properties, which is why Ge and SiGe already replace Si in the fields of high-speed optoelectronics and telecommunication [31]. Amorphous Ge has to be recrystallized after ion implantation or the deposition of thin amorphous Ge layers, and the tolerable thermal budget is often limited. In this context, FLA has been used for the formation of ultra-shallow junctions [32,33,34,35], for tri-gate poly-Ge junction-less p- and n-MOSFETs [36], GeSn n-MOSFETs [37], or for the crystallization of thin amorphous Ge films [38, 39].

However, FLA can also trigger other crystallization modes, namely solid- or liquid-phase explosive crystallization, and the crystallization dynamics can be controlled via the FLA parameters. The latter point is demonstrated in the following example [40] of P-implanted Ge, which is flashed either from the front (f-FLA, Fig. 6a) or from the rear side (r-FLA, Fig. 6b). After P implantation, there is a ca. 200 nm thick amorphous Ge (a-Ge) layer with a P peak concentration of up to 6 × 10²⁰ cm−3. In the case of f-FLA (3 ms, 62 Jcm−2), the temperature within the amorphous layer rises rapidly, reaches a maximum of 700 °C after 2–3 ms, and decreases to 500 °C and below after a few ms (Fig. 6c). If a certain temperature threshold is exceeded during the rising edge, crystallization starts from the amorphous-crystalline interface, and a mono-crystalline crystallization front moves towards the surface. However, the high peak temperature also promotes the formation of seeds at the surface, and after some delay, called the incubation time, a second crystallization front starts at the surface and moves downwards. This second crystallization results in a poly-crystalline (poly-Ge) film, and the final result is a recrystallized film that is mono-crystalline in its lower part and poly-crystalline in its upper part. In the case of r-FLA, performed with 3 ms, 62 Jcm−2, and a preheating of 180 °C, the temperature within the amorphous layer rises more slowly and reaches its maximum after 8–10 ms, followed by a long cooling phase of several hundred ms. Thus, crystallization from the amorphous-crystalline interface starts later, and the crystallization speed is lower due to the lower temperature. However, seed formation at the surface is delayed even more, and the first crystallization front can reach the surface before the incubation time is over. As a consequence, the entire implanted layer recrystallizes in a mono-crystalline way. The final result depends sensitively on the balance between the incubation time and the crystallization speed, and for thinner amorphous layers, mono-crystalline regrowth can also be achieved with f-FLA. In the present case, this threshold thickness was found to be approximately 80 nm, but it will probably change with the FLA conditions. Further details can be found in [40].
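The competition between the incubation time and the crystallization speed can be expressed as a simple race condition: the layer regrows mono-crystalline if the front starting at the amorphous-crystalline interface reaches the surface before seeds form there. The Python sketch below is a strongly simplified toy model that assumes a constant crystallization speed and a fixed incubation time, although both depend on the time-dependent temperature in reality. The two parameter values are hypothetical and were chosen only so that the resulting threshold equals the ca. 80 nm mentioned above; they are not measured quantities from [40].

```python
def threshold_thickness_nm(front_speed_nm_per_ms, incubation_time_ms):
    """Largest amorphous-layer thickness the interface front can cross
    before surface seeds form (constant-speed toy model)."""
    return front_speed_nm_per_ms * incubation_time_ms


def regrowth_mode(layer_thickness_nm, front_speed_nm_per_ms, incubation_time_ms):
    """Classify the outcome of the race between the two processes."""
    arrival_time_ms = layer_thickness_nm / front_speed_nm_per_ms
    if arrival_time_ms <= incubation_time_ms:
        return "mono-crystalline"
    return "mono-crystalline (lower part) + poly-crystalline (upper part)"


# Hypothetical parameters, chosen for illustration only.
FRONT_SPEED_NM_PER_MS = 40.0
INCUBATION_TIME_MS = 2.0

limit = threshold_thickness_nm(FRONT_SPEED_NM_PER_MS, INCUBATION_TIME_MS)
print(f"threshold thickness: {limit:.0f} nm")
for thickness_nm in (60.0, 200.0):
    mode = regrowth_mode(thickness_nm, FRONT_SPEED_NM_PER_MS, INCUBATION_TIME_MS)
    print(f"{thickness_nm:.0f} nm layer -> {mode}")
```

In this picture, r-FLA favors mono-crystalline regrowth because it lengthens the incubation time more strongly than it slows down the front, which raises the threshold thickness above the thickness of the implanted layer.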

Fig. 6

Scheme of the recrystallization dynamics of ion-implanted Ge during f-FLA (a) and r-FLA (b), and the corresponding temperature profiles (c). Adapted from [40]

Another example of the use of FLA is the formation of NiGe in order to provide ohmic contacts with sufficiently low resistivity. In the present case [41], 10 nm of Ni was deposited by magnetron sputtering on a-Ge, poly-Ge, and mono-crystalline Ge (mc-Ge). In the first case, a 130-nm-thick SiO2 layer was thermally grown on a 200-µm-thick Si substrate, followed by the deposition of 160 nm a-Ge using magnetron sputtering. The fabrication process for poly-Ge was similar, except for an additional FLA step prior to Ni deposition in order to recrystallize the a-Ge layer. For mc-Ge, an Sb-doped n-type (100) Ge wafer with a thickness of 520 µm and a specific resistivity on the order of 1 Ωcm was used. The sheet resistance RS obtained after the application of a 3 ms flash is given in Fig. 7 as a function of the FLA peak temperature. In general, RS first increases with increasing peak temperature, reaches a maximum, and then decreases to a final low value. RS increases again only at very high temperatures. Based on Raman, X-ray diffraction, and transmission electron microscopy investigations, the formation process of NiGe was traced [41]. In the beginning, Ni diffuses into Ge and forms NiGe and Ni-rich germanides such as Ni3Ge or Ni2.74Ge2, which are characterized by a high specific resistivity. As RS is composed of contributions from the Ni layer, which is continuously consumed, NiGe, and the Ni-rich germanides, RS increases and reaches a maximum when the Ni layer is fully consumed. If the temperature continues to increase, the Ni-rich germanides decompose into NiGe and Ni, and the excess Ni diffuses deeper into the Ge and completes the NiGe formation there. As a result, RS decreases to a minimum. Based on the determined NiGe thicknesses, specific resistivities of about 20 µΩcm (a-Ge) and 13–15 µΩcm (poly-Ge, mc-Ge) have been achieved. In the latter case, these values are comparable to literature values [42,43,44] for NiGe made by conventional annealing. See [41] for more details.
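The conversion from sheet resistance to specific resistivity uses the standard thin-film relation ρ = RS·t, where t is the thickness of the conducting layer. The Python sketch below shows the unit handling; the sheet resistance and NiGe thickness in the example are hypothetical numbers picked to land in the reported resistivity range and are not values read from Fig. 7 or from [41]. Note that RS here denotes the sheet resistance, not the effective reflectance of Eq. (1).

```python
def resistivity_uohm_cm(sheet_resistance_ohm_sq, thickness_nm):
    """Specific resistivity rho = R_S * t of a homogeneous thin film.

    Input: sheet resistance in ohm/sq and film thickness in nm.
    Output: resistivity in micro-ohm cm."""
    thickness_cm = thickness_nm * 1.0e-7          # nm -> cm
    resistivity_ohm_cm = sheet_resistance_ohm_sq * thickness_cm
    return resistivity_ohm_cm * 1.0e6             # ohm cm -> micro-ohm cm


# Hypothetical example values (assumption, for illustration only).
rho = resistivity_uohm_cm(sheet_resistance_ohm_sq=6.0, thickness_nm=25.0)
print(f"rho = {rho:.1f} micro-ohm cm")
```

The relation holds only if a single homogeneous layer dominates the conduction, which is why it is applied after the NiGe formation is complete and not in the intermediate regime where Ni, NiGe, and Ni-rich germanides coexist.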

Fig. 7

Sheet resistance of Ni/Ge structures with 10 nm Ni after 3 ms FLA. Reproduced from [41], which is licensed under a Creative Commons license (CC BY 4.0)

p-type doping of GaN by Mg ion implantation

The last example discusses the case of GaN, where short-time annealing, and especially FLA, can help to provide high annealing temperatures without the decomposition of GaN. Ion implantation is used to fabricate selective areas of n- and p-type doping. However, p-type doping faces several challenges. Mg is a suitable candidate but requires temperatures in excess of 1300 °C for successful activation [45]. Concurrently, diffusion must be suppressed, and the defects that are caused by implantation and preferably act as donors must be annealed out. The main problem, however, is the fact that GaN under normal conditions already decomposes at temperatures as low as 800 °C. Several solutions have been proposed in the literature, including ultra-high pressure annealing in N2 (UHPA), AlN capping layers, co-implantation with N, and implantation at high temperature. Sakurai et al. [46] implanted Mg and N with different concentration ratios, followed by either RTP at 1300 °C or UHPA at 1 GPa, with annealing temperatures between 1300 and 1400 °C and annealing times between 0.5 and 15 min. The best results were obtained for an Mg/N ratio between 1 and 0.5 and for UHPA. The co-implantation of Mg and N, followed by RTP at 1300 °C for 5 min, was also successfully applied to fabricate a vertical GaN planar MOSFET [47]. Finally, Takahashi et al. [48] implanted Mg at temperatures from RT to 1000 °C, followed by RTP at 1250 °C for 20 s, and showed that the best results were achieved for an implantation temperature of 1000 °C.

However, UHPA and high implantation temperatures are difficult to realize in industrial applications, and there seems to be a tendency toward shorter annealing times, probably in combination with the repeated application of such annealing steps. Thus, multicycle RTP at 2 MPa N2 was used, in which the temperature starts at a base value of 1300 °C and is repeatedly ramped up to 1400 °C as a spike anneal [45]. It was found that the surface was heavily damaged already by the first spike if the effective anneal time of the spike was 31 s or longer. In contrast, the surface remained undamaged if the spike time was only 8.4 s, even if 40 cycles were applied. Further investigations revealed that a further reduction of the spike time to 3.87 s and 0.24 s leads to an increase of the degree of electrical activation to 8.2% [49]. Recently, laser annealing was used to improve the conductivity of Mg-doped GaN grown by metalorganic chemical vapor deposition [50]. Although reliable experimental results are not yet known, FLA with annealing times in the ms range is a promising alternative.

Summary

FLA is a short-time annealing method that can provide high temperatures on a time scale of ms at the surface or in the near-surface region of a substrate, whereas the temperature in deeper regions and at the backside is significantly lower. Thus, FLA is especially suitable for thin-film and semiconductor applications. As a thermal non-equilibrium process, it can improve specific material properties or synthesize new materials, which might not be possible with methods operating in thermal equilibrium. In addition, the use of FLA in industrial applications, namely in roll-to-roll or conveyor belt processing lines, can provide significant savings in energy and process time. However, there are some challenges to be considered in order to take full advantage of short-time annealing. These include the fact that the temperature depends on both the FLA and the material parameters, that thermal stress has to be managed, and that good homogeneity sometimes requires additional measures.