Determining Boundary-Layer Height from Aircraft Measurements

The height of the atmospheric boundary layer (ABL) is an important variable in both observational studies and model simulations. The most commonly used measurement for obtaining ABL height is a rawinsonde profile. Mesoscale or regional scale models use a bulk Richardson number based on profiles of the forecast variables. Here we evaluate the limitations of several frequently-used approaches for defining ABL height from a single profile, and identify the optimal threshold value for each method if profiles are the only available measurements. Aircraft measurements from five field projects are used, representing a variety of ABL conditions including stable, convective, and cloud-topped boundary layers over different underlying surfaces. ABL heights detected from these methods were validated against the ‘true’ value determined from aircraft soundings, where ABL height is defined as the top of the layer with significant turbulence. A detection rate was defined to denote how often the ABL height was correctly diagnosed with a particular method. The results suggest that the temperature gradient method provides the most reasonable estimates, although the detection rate and suitable detection criteria vary for different types of ABL. The Richardson number method, on the other hand, is in most cases inadequate or inferior to the other methods that were tried. The optimal range of the detection criteria is given for all ABL types examined in this study.


Introduction
The atmospheric boundary layer (ABL) is defined as the lowest layer of the troposphere that is affected by the presence of the underlying surface and responds to surface forcing on a time scale of 1 h or less (Stull 1988). The depth of this layer, h, is a key variable in many applications such as air-pollution prediction and weather forecasting (Beyrich 1997). In environmental applications, h is one of the variables that defines the volume of air within which the pollution is mixed (Collier et al. 2005). Therefore, accurate specification of h is essential in numerical modelling of air quality, including pollutant transport, dispersion and removal. The ABL height is also an important scaling length for normalizing ABL variables, such as fluxes and vertical gradients of wind, potential temperature, and moisture for model and observational analyses. It is also used as a turbulence length scale in boundary-layer parametrizations involving turbulent kinetic energy (TKE; Therry and Lacarrere 1983), and for non-local turbulence closures in climate and weather forecasting models, e.g., the National Center for Atmospheric Research (NCAR) Community Climate Model (CCM3; Holtslag and Boville 1993) and the National Centers for Environmental Prediction (NCEP) medium-range forecasting model (Hong and Pan 1996).
The ABL is characterized by turbulence generated by shear and buoyancy forces (Stull 1988). Ideally, h can be identified as the depth of the turbulent layer adjacent to the surface. However, measurements at a sampling rate that is sufficient to resolve turbulence are often not available, particularly from rawinsondes, the most commonly used instrument for obtaining thermodynamic and wind profiles in the lower atmosphere. Thus, h is usually obtained from rawinsonde temperature, humidity or wind profiles based on the characteristics of their variations in the vicinity of z = h, z being height. For example, a temperature inversion and a moisture lapse often cap the ABL. However, the separation between the ABL and the free troposphere is not always clear from the available profiles. Hence, the choice of variable and the detection criteria may introduce large uncertainty in the estimate of h.
During the last decade, remote sensing systems (e.g. lidars, sodars, and wind profilers) that can operate continuously have been used for estimating h (Beyrich and Görsdorf 1995; Beyrich 1997; Emeis et al. 2004; Nielsen-Gammon et al. 2008). However, the interpretation of remotely-sensed structures used to estimate h can sometimes be ambiguous, and a comparison with sounding data can help to identify remotely-sensed structures in the ABL (Emeis et al. 2004). Therefore, rawinsonde-based h estimates are often used as a standard for evaluating ABL heights obtained from remote sensing data (Coulter 1979; Van Pul et al. 1994; Beyrich and Görsdorf 1995; Marsik et al. 1995; Beyrich 1997; Dupont et al. 1999; Bianco and Wilczak 2002; Lokoshchenko 2002; Hennemuth and Lammert 2006; Sicard et al. 2006; Martucci et al. 2007; Nielsen-Gammon et al. 2008). In order to improve ABL height detection from profiles, we have carried out a systematic evaluation of profile methods to understand their limitations and identify appropriate criteria for their application.
To evaluate the various ABL height detection schemes from the thermodynamic and wind profiles, it is desirable to use simultaneous measurements of both turbulence and thermodynamic/wind profiles. Turbulence measurements can be used to clearly identify the ABL top, and thus yield the 'true' ABL height, and are used as a reference to determine the optimal criteria for using vertical profiles of thermodynamic variables to estimate h. High-sample-rate aircraft observations of turbulence are ideal for this analysis. The turbulent fluctuations can be used to determine h with an accuracy of about 10 m or less as the aircraft penetrates the boundary-layer top on ascents and descents (Wang et al. 1999). We use the smoothed temperature, humidity, and wind profiles from the same soundings as surrogates for the rawinsonde profiles, from which h can be obtained by various detection schemes. We provide a range of optimal h detection criteria for several commonly-used gradient-based ABL height-detection schemes based on relatively large datasets of temperature, wind, and cloud-water profiles and validate the criteria with the 'true' ABL height from the turbulence measurements. We then estimate the associated error statistics.
As the vertical structure in the vicinity of h varies for different types of ABL and under different large-scale conditions, we examine a few typical types of ABL in different regions. Although the results are not intended to be universally applicable, they should be valid for similar regions and surface conditions. The organization of the paper is as follows: Sect. 2 describes the dataset and analysis procedure, and Sect. 3 presents the methods to be used for ABL height detection. Section 4 discusses the optimal detection method with suggested criterion range for each ABL type, and also the effect of vertical resolution on the selection of optimal detection criterion. Concluding remarks are presented in Sect. 5.

Datasets and Analysis Procedure
The data used here are spiral or slant-path soundings taken by research aircraft during five field campaigns conducted in the lower troposphere. The data sampling rates were fast enough to resolve turbulent fluctuations, and the data were also smoothed to obtain profiles of temperature, humidity, and horizontal wind components similar to those from rawinsondes. These intensive field campaigns provided aircraft soundings in different types of boundary layer and over different underlying surfaces: the stable boundary layer (SBL) over land, the convective boundary layer (CBL) over land, the CBL over the ocean, and the cloud-topped boundary layer (CTBL) over the ocean. A brief description of each dataset is given here:
(1) The Cooperative Atmosphere-Surface Exchange Study in 1999 (CASES-99), the second field campaign of the Cooperative Atmosphere-Surface Exchange Study, was conducted in Kansas, U.S.A., and generated an extensive dataset for the SBL (Poulos et al. 2002). The NOAA Long-EZ (LEZ) and the Wyoming King Air (WKA) made extensive measurements at 50 and 25 Hz sample rates, respectively, from 6 to 27 October 1999. A total of 44 soundings from 24 flights were selected from the original dataset.
(2) The Boreal Ecosystem-Atmosphere Study (BOREAS) was a large-scale international interdisciplinary experiment in the boreal forests of Canada (Sellers et al. 1997).
For all the datasets used in this study, high-rate or turbulence sampling refers to a sampling frequency of 20 Hz or above. The true airspeed of the various aircraft varied from 60 m s−1 (CIRPAS Twin Otter) to 120 m s−1 (NCAR C-130), resulting in a horizontal resolution of 3 to 6 m that is small enough to capture the small-scale turbulence variability. The average ascent/descent rate of the airplane was about 2.5 m s−1. The smoothed thermodynamic and wind profiles use 1-Hz data sub-sampled from the high-rate data. A running mean over a total of 7 data points (or 6 s) is applied to the 1-Hz data to further smooth the profiles for estimating a mean vertical gradient that is required for some of the profile methods examined in this study. The running mean results in an average over an interval of about 20 m in height. The corresponding horizontal averaging distance for slant-path soundings is between 360 and 720 m, and much less for spiral soundings. The resultant smoothed profile is thus a composite profile over nominally a horizontal distance of tens of km. We further assume that the observed gradient during the slant ascent/descent results solely from vertical variations. This assumption is justified by examining repeated soundings from the same flight. Although differences are seen among soundings, such differences are much smaller than the vertical gradient near the ABL top.
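The smoothing step described above can be sketched as follows. This is a minimal illustration, not the processing code used in the study: the function names `running_mean` and `averaging_distance`, the sample values, and the treatment of the profile ends (a truncated window) are all assumptions made for the example.

```python
def running_mean(values, window=7):
    """Centered running mean over `window` points.

    Near the ends of the profile the window is truncated to the points
    that exist (an assumption; the text does not say how edges were treated).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        segment = values[lo:hi]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


def averaging_distance(speed, window=7, dt=1.0):
    """Distance travelled between the first and last sample of one window."""
    return (window - 1) * dt * speed


# Hypothetical 1-Hz virtual potential temperature samples (K), not real data.
theta_1hz = [300.0, 300.2, 299.9, 300.1, 300.0, 300.3,
             299.8, 300.1, 302.0, 304.1, 304.0]
theta_smooth = running_mean(theta_1hz, window=7)

# Horizontal averaging distance along a slant path for the slowest
# (60 m/s) and fastest (120 m/s) aircraft quoted in the text.
print(averaging_distance(60.0), averaging_distance(120.0))
```

With a 7-point window on 1-Hz data, the two airspeeds give 360 m and 720 m, which matches the range of horizontal averaging distances quoted for slant-path soundings.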
The performance of each profile method is evaluated by comparing h from this method (h_Est) with that from the turbulence method, which is considered the true ABL height (h_Tur). The difference between the two, Δh = |h_Est − h_Tur|, will be referred to as the detection error. Values of Δh from all the profile methods are examined using various statistics. One important comparison is the cumulative frequency, which we refer to as the detection rate, η, defined as η = N_Est/N, where N_Est is the number of cases for which h_Est meets the requirement Δh ≤ σ, σ being the error range and N the total number of soundings. Thus, η is the percentage of cases for which h_Est falls within ±σ of h_Tur. The results for each ABL type will be discussed in Sect. 4.
The ABL height identified as the depth of the lowest layer of continuous turbulence is considered the true ABL height (Stull 1988). Visual inspection of the fluctuation plots clearly separates the ABL from the free troposphere. The ABL height is automatically detected by applying a continuous wavelet transform to the absolute perturbations of the velocity components to find the level at which the magnitude of the turbulence fluctuations shows the most rapid decrease with height.
Fig. 1 The sounding was obtained during POST on 29 July 2008, from 0404 to 0413 UTC. The horizontal red lines in h-j denote the ABL height from the turbulence method using a high-pass wavelet filter to remove the slow variations, similar to Wang et al. (1999) and Wang and Wang (2004)
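The detection rate η defined above can be computed with a few lines of code. The sketch below is illustrative only: the function name `detection_rate` and the sample heights are hypothetical, and treating a sounding for which a method returns no height (`None`) as a miss is an assumption rather than something stated in the text.

```python
def detection_rate(h_est, h_tur, sigma):
    """Percentage of soundings whose detection error |h_est - h_tur| <= sigma.

    A sounding with h_est of None (method found no height) is counted as a
    miss; this convention is an assumption made for the example.
    """
    if len(h_est) != len(h_tur) or not h_tur:
        raise ValueError("need two non-empty lists of equal length")
    n_est = sum(
        1 for est, tur in zip(h_est, h_tur)
        if est is not None and abs(est - tur) <= sigma
    )
    return 100.0 * n_est / len(h_tur)


# Hypothetical heights in metres (not values from the datasets).
h_tur = [205.0, 150.0, 320.0, 95.0]   # turbulence method ("true" height)
h_est = [229.0, 145.0, 390.0, 94.0]   # one profile method

for sigma in (10.0, 30.0, 50.0):
    print(sigma, detection_rate(h_est, h_tur, sigma))
```

Evaluating the function over a range of σ values gives the cumulative curve of η against error range that is used to compare detection criteria.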

Temperature Gradient Method
The gradient method for identifying h is empirically based on the typical characteristics of temperature profiles at the ABL top. The CTBL and CBL are generally capped by a well-defined temperature inversion with a substantial maximum in the lapse rate of potential temperature (Figs. 1 and 2). This property is used to identify h as the base of this enhanced inversion layer from a single sounding. We will refer to this method as the temperature gradient (TGRD) method. This method has been widely used to determine the CBL height; however, the magnitude of the gradient used as a detection criterion varies rather significantly (Bianco and Wilczak 2002; Zeng et al. 2004; Martucci et al. 2007).
Fig. 2 Same as Fig. 1, except for CBL1 from BOREAS; g is specific humidity (q in g kg−1). The measurements of CBL1 were made on an ascent on 6 June 1994, from 1505 to 1507 UTC
In this study, the TGRD method will be evaluated for all four types of ABL using various detection criteria. However, the application of the detection criteria is slightly different for different boundary-layer types. For the CBL and CTBL, h is the level at which the virtual potential temperature gradient, ∂θ_v/∂z, first exceeds the detection criterion at the base of the capping inversion. The sensitivity of the results to the choice of the gradient criterion will be examined and optimal detection criteria will be identified. As seen in Fig. 2, the capping inversion for the CBL can be relatively weak, especially compared to the CTBL (Fig. 1). Consequently, the detection criterion set for the CBL using the TGRD method needs to be smaller. At the same time, the threshold needs to be large enough to prevent random perturbations from exceeding the detection criterion. For the CTBL, a sharp inversion usually occurs at the boundary-layer top due to surface cooling and subsidence warming, resulting in a distinct peak in the profile of ∂θ_v/∂z (Fig. 1). Hence, for the CTBL, in addition to the potential temperature gradient method described above, we also define h as the height at which ∂θ_v/∂z reaches its maximum. This method will be referred to as the maximum temperature gradient method, or TGRDM. Note that the CTBL cases we discuss refer only to the stratocumulus-topped boundary layer. We had several CBL cases from PASE that were capped by scattered fair-weather cumuli; the TGRDM method did not work well for these cases, but the TGRD method used for the CBL did seem to work well.
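The TGRD and TGRDM definitions for the CBL and CTBL can be sketched as below. This is a hedged illustration, not the authors' implementation: the function names, the sample profile, the use of simple layer-by-layer finite differences, and the choice to report the lower level of the first qualifying layer (as the inversion base) are assumptions made for the example.

```python
def layer_gradients(z, theta_v):
    """Finite-difference d(theta_v)/dz for each layer between adjacent levels."""
    return [
        (theta_v[i + 1] - theta_v[i]) / (z[i + 1] - z[i])
        for i in range(len(z) - 1)
    ]


def tgrd_height(z, theta_v, criterion):
    """TGRD: base of the first layer whose gradient exceeds `criterion`.

    Returns None when the criterion is never met.
    """
    for i, grad in enumerate(layer_gradients(z, theta_v)):
        if grad > criterion:
            return z[i]  # lower level of the layer, taken as the inversion base
    return None


def tgrdm_height(z, theta_v):
    """TGRDM: midpoint of the layer with the largest gradient."""
    grads = layer_gradients(z, theta_v)
    i_max = grads.index(max(grads))
    return 0.5 * (z[i_max] + z[i_max + 1])


# Hypothetical smoothed profile: heights (m) and theta_v (K), not real data.
z = [100.0, 200.0, 300.0, 400.0, 500.0, 600.0, 700.0]
theta_v = [300.0, 300.1, 300.1, 300.2, 301.8, 305.0, 305.4]

print(tgrd_height(z, theta_v, criterion=0.013))  # criterion in K per metre
print(tgrdm_height(z, theta_v))
print(tgrd_height(z, theta_v, criterion=0.05))   # too strict: no height found
```

In this example the threshold method places h at the base of the inversion and the maximum-gradient method places it inside the inversion, so the two estimates differ by roughly the depth of the lower part of the inversion layer. The last call shows how an overly large criterion leaves the height undefined.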
For the SBL height, a variety of criteria and methods based on the temperature profile have been proposed, including the top of the surface temperature inversion (Yu 1978), the lowest (significant) discontinuity in the temperature profile (Hanna 1969; Beyrich 1997), the first significant variation in ∂θ_v/∂z (Dupont et al. 1999), and the positive maximum in the temperature gradient (Martucci et al. 2007). Wetzel (1982) defined the SBL height using a combination of the potential temperature gradient and wind shear, i.e. the height at which θ_v begins to stray significantly from a linear profile or the lowest level at which the wind speed shear approaches zero, whichever is the smaller. Most methods are based on the parameters that were available in the specific datasets, with a variety of vertical resolutions and measurement accuracies.
Fig. 3 Same as Fig. 1, except for SBL1 from CASES-99. g is the wind shear (s−1). The measurements of SBL1 were made from an ascent on 20 October 1999, from 0951 to 0952 UTC
In this study, the TGRD method applied to the SBL is more complicated than those for the CBL or the CTBL because of the variable nature of ∂θ_v/∂z profiles at the top of the SBL. Figure 4 compares three typical profiles for the SBL. Here, all three soundings indicate small and less variable ∂θ_v/∂z above a certain height, with the ∂θ_v/∂z profiles varying greatly in the lower levels. Sounding SBL1 was made in a very stable ABL with a ground-based inversion, where the strongest θ_v gradient is near the surface. We found that ∂θ_v/∂z decreased monotonically with height, eventually reached a minimum, and remained small above. For this case, it is possible to identify a minimum ∂θ_v/∂z value corresponding to the true ABL height. Soundings SBL2 and SBL3 represent another type of stable ABL where the near-surface thermal stratification is weakened. These soundings were generally made in the early morning after sunrise. Although the stable stratification remained, the temperature near the surface had increased significantly, reducing the magnitude of ∂θ_v/∂z close to the surface and thus resulting in a local maximum aloft at 95 and 210 m in SBL2 and SBL3, respectively. This surface heating effect is also evident in the larger magnitude of turbulence seen in the corresponding fluctuation profiles.
Fig. 4 Three examples of soundings in the SBL from CASES-99: SBL1 and SBL2 are both from ascents by the Long-EZ from 0951 to 0952 UTC on 20 October 1999 and from 1308 to 1309 UTC on 12 October 1999, respectively; SBL3 is from a descent by the Wyoming King Air from 1338 to 1340 UTC on 11 October 1999. The red line on each profile denotes the ABL height detected from different methods using the corresponding variable. Here, the detection criteria used for TGRD, WDS, and Ri methods are 6.5 K (100 m)−1, 0.065 s−1, and 0.5, respectively
In these two cases, ABL heights are at or above the level of maximum ∂θ_v/∂z, as seen in the comparison with the turbulence layer in the right column of Fig. 4. For the ∂θ_v/∂z profile of sounding SBL2, h is obtained from the chosen threshold value but above the level of maximum ∂θ_v/∂z. However, for sounding SBL3, where the maximum ∂θ_v/∂z is in fact less than the chosen detection criterion, h is then defined to be at the height of maximum ∂θ_v/∂z. To generalize from the three SBL cases, the SBL top is defined as the height where ∂θ_v/∂z first becomes smaller than the specified threshold and above the local maximum, if it exists. Following this definition, h will be at the level of maximum ∂θ_v/∂z when the observed ∂θ_v/∂z at all levels is less than the given detection threshold.
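The SBL rule generalized above lends itself to a short sketch. The code below is an illustration under stated assumptions, not the authors' algorithm: the function name `sbl_height` and the three synthetic gradient profiles are hypothetical, and the "local maximum" is taken to be the strongest gradient in the profile, which coincides with the surface for an SBL1-like case.

```python
def sbl_height(z, grad, criterion):
    """SBL top from a profile of d(theta_v)/dz.

    Search upward from the level of the strongest gradient (an assumed
    reading of "above the local maximum") for the first level where the
    gradient drops below `criterion`. If the gradient is below the
    criterion everywhere, return the level of the maximum itself.
    Returns None if the gradient never falls below the criterion.
    """
    i_max = max(range(len(grad)), key=lambda i: grad[i])
    if grad[i_max] < criterion:
        return z[i_max]
    for i in range(i_max + 1, len(grad)):
        if grad[i] < criterion:
            return z[i]
    return None


# Hypothetical levels (m) and gradients (K per metre), not real soundings.
z = [10.0, 30.0, 50.0, 70.0, 90.0, 110.0, 130.0, 150.0]
sbl1_like = [0.20, 0.15, 0.11, 0.08, 0.06, 0.03, 0.01, 0.01]  # monotonic decrease
sbl2_like = [0.02, 0.04, 0.09, 0.12, 0.08, 0.05, 0.02, 0.01]  # strong maximum aloft
sbl3_like = [0.01, 0.02, 0.04, 0.05, 0.03, 0.02, 0.01, 0.01]  # weak maximum aloft

for profile in (sbl1_like, sbl2_like, sbl3_like):
    print(sbl_height(z, profile, criterion=0.065))
```

The three synthetic profiles exercise the three branches of the definition: a ground-based inversion, an elevated maximum that exceeds the criterion, and an elevated maximum that stays below it.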

Richardson Number Method
The gradient Richardson number is an important parameter for diagnosing flow dynamic stability (e.g., Stull 1988; Garratt 1994), and is defined as

$$Ri = \frac{(g/\theta_v)\,\partial\theta_v/\partial z}{(\partial\bar{U}/\partial z)^2 + (\partial\bar{V}/\partial z)^2}, \qquad (1)$$

where g is the acceleration due to gravity, and $\bar{U}$ and $\bar{V}$ are the mean wind components in the east-west and north-south directions, respectively. The Ri method is a direct approach for estimating h in practical applications and is widely used in diagnosing h from mesoscale forecast models (Straume et al. 1998; Zilitinkevich and Baklanov 2002; Batchvarova and Gryning 2003; Jeričević and Grisogono 2006). It was also proposed as a method to obtain h from sounding data based on the assumption that continuous turbulence vanishes beyond the critical Richardson number (Joffre et al. 2001; Zeng et al. 2004; Balsley et al. 2006; Hennemuth and Lammert 2006; Sicard et al. 2006). One of the disadvantages of this method is the considerable uncertainty in the choice of an appropriate threshold value, ranging from 0.15 to 0.55 (and even larger: 1.3 to 7.2 derived from coarse-resolution models, Zilitinkevich and Baklanov 2002). Although the Ri method is physically based, the wide range of suggested critical values makes it difficult to use in practical applications. The vertical gradients in (1) can be approximated with finite differences using adjacent values of the smoothed profiles. It then becomes a bulk Richardson number,

$$Ri_b = \frac{(g/\bar{\theta}_v)\,[\theta_v(z_2) - \theta_v(z_1)]\,(z_2 - z_1)}{[\bar{U}(z_2) - \bar{U}(z_1)]^2 + [\bar{V}(z_2) - \bar{V}(z_1)]^2}, \qquad (2)$$

where $\bar{\theta}_v$ is the average virtual potential temperature between the two levels z_1 and z_2. Ri_b is sensitive to the vertical resolution of the variables, which is one of the reasons for the wide range of Ri_bc in the literature. Therefore, we do not expect the critical bulk Richardson number Ri_bc to be the same as the theoretical critical gradient Richardson number Ri_c. In this paper, the Ri method for estimating h is defined as the height (starting from near the surface) where Ri first becomes greater than a given threshold (hereafter a detection criterion for Ri_c).
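A finite-difference Richardson number between adjacent levels, and the search for the first level exceeding a threshold, can be sketched as follows. This is an illustration of the standard bulk form under stated assumptions: the function names, the sample profile, the value of g, reporting h at the layer midpoint, and the handling of zero shear are choices made for the example and are not taken from the text.

```python
import math

G = 9.81  # acceleration due to gravity (m s^-2)


def bulk_richardson(z1, z2, thv1, thv2, u1, u2, v1, v2):
    """Bulk Richardson number for the layer between levels z1 and z2."""
    d_thv = thv2 - thv1
    shear_sq = (u2 - u1) ** 2 + (v2 - v1) ** 2
    if shear_sq == 0.0:
        # No shear: treated as non-turbulent unless unstable (assumed convention).
        return math.inf if d_thv >= 0 else -math.inf
    thv_mean = 0.5 * (thv1 + thv2)
    return (G / thv_mean) * d_thv * (z2 - z1) / shear_sq


def ri_height(z, theta_v, u, v, ri_c):
    """Midpoint of the lowest layer whose bulk Ri exceeds ri_c, else None."""
    for i in range(len(z) - 1):
        ri_b = bulk_richardson(z[i], z[i + 1], theta_v[i], theta_v[i + 1],
                               u[i], u[i + 1], v[i], v[i + 1])
        if ri_b > ri_c:
            return 0.5 * (z[i] + z[i + 1])
    return None


# Hypothetical smoothed profile (not real data).
z = [50.0, 100.0, 150.0, 200.0, 250.0]            # height (m)
theta_v = [290.0, 290.5, 291.0, 292.5, 294.0]     # K
u = [2.0, 5.0, 8.0, 9.0, 9.5]                     # m/s
v = [0.0, 0.0, 0.0, 0.0, 0.0]                     # m/s

for ri_c in (0.05, 0.5, 3.0):
    print(ri_c, ri_height(z, theta_v, u, v, ri_c))
```

Because the layer depth appears in the numerator, the same profile sampled at a coarser vertical spacing generally yields different values of the bulk number, which is one way to see why a single critical value does not transfer between datasets of different resolution.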

Wind and Wind Shear Profile Methods
The two wind-profile based methods discussed here are applicable to the SBL. The SBL height is sometimes estimated from the vertical wind profile, using e.g. the height of the low-level wind-speed maximum, often referred to as the low-level jet (LLJ) (Melgarejo and Deardorff 1974; Mahrt et al. 1979). However, as pointed out by Balsley et al. (2006) and Meillier et al. (2008), using the jet maximum to define the SBL top sometimes gives conflicting results. In comparing different methods to define the SBL height, Hyun et al. (2005) found estimates of SBL height based on the LLJ to be inconsistent with those from other methods, such as those using ∂θ_v/∂z. For example, when SBL heights from other methods indicate a deepening SBL with time, Hyun et al. (2005) found that the height of the LLJ may not show the same trend. For the SBL, turbulence near the surface is generated solely by mean wind shear,

$$S = \sqrt{(\partial\bar{U}/\partial z)^2 + (\partial\bar{V}/\partial z)^2}$$

(Sun 2011). In Fig. 3 the largest wind shear is found near the surface, although moderate wind shear may also be present at the SBL top. Hence, Hyun et al. (2005) used a wind-shear method (WDS) to estimate the SBL height, and following Hyun et al. (2005), the criterion for wind shear is

$$\sqrt{(\partial\bar{U}/\partial z)^2 + (\partial\bar{V}/\partial z)^2} < S_c,$$

where S_c is the detection criterion for wind shear. Similar to the definition for the TGRD method, the SBL top is defined as the height where the wind shear first becomes less than a detection criterion, S_c, and above the local maximum if it exists. Thus, h will be the height of maximum wind shear when the observed value at all levels is less than S_c. The study of Hyun et al. (2005) was limited to nine SBL rawinsonde wind profiles using S_c = 0.04 s−1 based on measured vertical shear from a previous study (Bader and McKee 1985). In our study, the WDS method is further analyzed by comparison with the turbulence method from aircraft measurements. Column d in Fig. 4 shows that for these cases the wind shear varies with height rather significantly up to a level just above regions of strong turbulence.

CTBL Height Using Cloud Top and Relative Humidity
The stratocumulus-topped boundary layer, CTBL, covers extended regions off the subtropical west coast of major continents where large-scale subsidence of warm dry air meets the cool moist boundary-layer air over cool upwelling water. Figure 1 depicts typical vertical profiles of a stratus/stratocumulus-topped ABL where the variation of liquid water content, q_c, with height in the cloud layer is nearly adiabatic. For the CTBL, turbulence is mainly generated by radiative cooling at cloud top, with some contribution from buoyancy and shear forcing from the surface. The cloud top provides a good estimate of the top of the turbulent CTBL, where a rapid decrease of q_c defines the cloud boundary. Here, we use the highest level at which q_c > 0.04 g m−3 as the cloud top. This criterion was used to denote the cloud boundary by Lenschow et al. (2000) using aircraft measurements. The value of 0.04 g m−3 instead of zero is used to avoid the possibility that a systematic sensor error might give values > 0 in cloud-free regions. The cloud layer has also been defined as the layer where the relative humidity (RH) > 97 % (Zeng et al. 2004), which is also observed in broken cloud layers (Albrecht et al. 1985; Betts et al. 1995). These two methods are referred to as the CLTP and RH methods, respectively. An alternative method involving relative humidity for identifying the ABL top for this type of boundary layer is to identify the level at which relative humidity has the maximum vertical gradient (RHGRDM). This method will also be evaluated in Sect. 4.
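The three CTBL criteria described above (CLTP, RH, and RHGRDM) can be sketched in a few lines. This is a hedged illustration, not the study's code: the function names and the sample profile are hypothetical, and interpreting the "maximum vertical gradient" of relative humidity as the largest gradient in magnitude (RH drops sharply above cloud top) is an assumption.

```python
def highest_level_exceeding(z, values, threshold):
    """Highest level at which `values` exceeds `threshold`, or None."""
    levels = [zi for zi, val in zip(z, values) if val > threshold]
    return max(levels) if levels else None


def cltp_height(z, q_c, q_c_min=0.04):
    """CLTP: highest level with liquid water content above q_c_min (g m^-3)."""
    return highest_level_exceeding(z, q_c, q_c_min)


def rh_height(z, rh, rh_min=97.0):
    """RH: highest level with relative humidity above rh_min (%)."""
    return highest_level_exceeding(z, rh, rh_min)


def rhgrdm_height(z, rh):
    """RHGRDM: midpoint of the layer with the largest |d(RH)/dz| (assumed reading)."""
    grads = [
        abs((rh[i + 1] - rh[i]) / (z[i + 1] - z[i]))
        for i in range(len(z) - 1)
    ]
    i_max = grads.index(max(grads))
    return 0.5 * (z[i_max] + z[i_max + 1])


# Hypothetical stratocumulus sounding (not real data).
z = [300.0, 350.0, 400.0, 450.0, 500.0, 550.0]     # height (m)
q_c = [0.10, 0.25, 0.40, 0.52, 0.02, 0.00]         # liquid water (g m^-3)
rh = [99.0, 100.0, 100.0, 100.0, 60.0, 30.0]       # relative humidity (%)

print(cltp_height(z, q_c), rh_height(z, rh), rhgrdm_height(z, rh))
```

For a solid cloud deck like this synthetic one the three estimates agree to within one layer. They would be expected to diverge when the cloud is broken or when moist air extends above the cloud top.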

Evaluation of Boundary-Layer Height Detection Criteria
As mentioned in Sect. 3, several h detection criteria have been reported in the literature using variables derived from the wind-speed and/or potential temperature and humidity profiles. Some of the criteria used by other researchers are listed in Table 1; most of these are based on visual inspection of a few soundings. Here, we systematically test these criteria by varying the critical values for each method and for the most frequently observed boundary layers (SBL-land, CBL-land, CBL-ocean, and CTBL). The results for each ABL type, including the impact of vertical resolution, are discussed in the following subsections.
Compared to other ABL types, the SBL is most problematic when determining h from measured or modelled profiles. Here we tested three methods for estimating SBL height (TGRD, WDS, and Ri); the resulting detection rates are shown in Fig. 5, where a higher detection rate means that more h_Est values are detected with errors < σ. For the TGRD method (Fig. 5a), we used gradient criteria ranging from 0.04 to 0.08 K m−1 with an increment of 0.005 K m−1. We see that the gradient criterion that performs best is 0.065 K m−1. With this detection criterion we obtain the highest detection rate for an error range from 5 to 50 m, with the detection rate increasing somewhat faster for a smaller error range. It identifies about 43 % of h_Est with errors <10 m, and about 84 % of h_Est with errors <30 m. We find similar results for gradient criteria from 0.05 to 0.08 K m−1. If we use an error range of 30 m as a benchmark, we expect to be able to detect about 77 % of h_Est values within an error of 30 m when using detection criteria within this range.
Results using the WDS method are shown in Fig. 5b with the wind-shear criteria ranging from 0.040 to 0.075 s−1. At an error range <30 m, the threshold value of 0.065 s−1 seems to perform best with about an 81 % detection rate. Similar results are obtained for wind-shear criteria of 0.055, 0.06, 0.07 and 0.075 s−1. The optimal range of the wind-shear criterion is thus determined to be from 0.055 to 0.075 s−1.
Although Ri_b is directly related to the generation/consumption of turbulence in the ABL, detecting h using Ri_b (Fig. 5c) does not give satisfactory results compared to the TGRD and WDS methods. Figure 5c shows that the detection rate is low and insensitive to the selection of the detection criteria, since results with Ri_c ranging from 0.25 to 3 are similar. Here, the detection rate increases almost linearly with the error range and only about 69 % are detected with an error of 30 m or less, much lower than for the TGRD and WDS methods. Figure 6 shows a more detailed comparison between h_Est and h_Tur for the SBL using all three methods. Here, the detection criteria used for the TGRD, WDS, and Ri methods are 0.065 K m−1, 0.065 s−1, and 0.5, respectively, all within the optimal range determined from Fig. 5. The top panel shows the differences among all four methods for each of the 44 soundings from CASES-99 (sounding numbers 1 to 44). The soundings to the right side of Fig. 6a (sounding numbers 47 to 52) are from the stable cases in BOREAS. Visual inspection of Fig. 6a suggests h_Est from the TGRD method (h_TGRD) is closest to h_Tur, while the Ri method has the largest deviations. This is clearly seen in the corresponding histograms in Fig. 6e-g, where the number of soundings (scaled by the total number of soundings) in each 20-m wide error bin is plotted. Apparently, very large errors may result from all three methods for certain soundings, especially with the Ri method. If we consider large errors as those exceeding a 100-m difference from h_Tur, the percentages of h_Est estimates with large errors are 2, 6, and 4 % for the TGRD, WDS, and Ri methods, respectively.
The middle row of Fig. 6 shows scatter plots of h_Est versus h_Tur; some basic statistical descriptions of the detection error are also given in these figures. In order to eliminate the dominant contributions from outliers, the statistical values listed in Fig. 6b-d are calculated using results with errors <100 m. The means, medians, and standard deviations of the errors for the TGRD and WDS methods are similar, although the TGRD method yields a higher correlation coefficient (0.91) than the WDS method (0.88). On the other hand, the Ri method (h_Ri) has a smaller correlation coefficient (0.85) and a higher standard deviation (32 m). Hence, among all three methods, TGRD gives the best overall results, followed closely by the WDS method, while the Ri method is clearly the worst.
We further examined the soundings that resulted in large errors for the TGRD and WDS methods. The SBL3 sounding in Fig. 4 is a good example of the conditions that result in large variations of h_Est. For this sounding, the TGRD and Ri methods give more reasonable results (229 and 210 m, respectively, compared to 205 m from the turbulence method). However, we find a large error in h_Est using the WDS method (h_WDS = 94 m). We see that the wind shear varies significantly in the lowest 500 m, where its lowest local maximum is located at about 65 m; above this the minimum shear is 0.045 s−1 at 145-m height. There is also a second local maximum at 263 m that has a smaller magnitude than that at 65-m height. If the detection criterion is set at 0.04 s−1, h_Est is detected as 390 m. If the detection criterion is set slightly larger at 0.045 s−1, h_Est is detected as 145 m. In this case, a change of 0.005 s−1 in the detection criterion results in h_Est differing by about 245 m. Similar sensitivity of the results to detection criteria can also be found for a few soundings using the Ri method (not shown).
Fig. 6 Evaluation of ABL heights from the three profile methods (TGRD, WDS and Ri) for SBL cases. a ABL heights detected from all three methods and from the turbulence method for each sounding in CASES-99 (sounding number 1-44) and in BOREAS (sounding number 47-52 to the right of the vertical dashed line); b-d Scatter plots of ABL height from each profile method compared to the 'true' ABL height from the turbulence method. Some statistics of the error in ABL height are given in the respective scatter plot; e-g Histograms of the errors in the detected ABL height for all three profile methods. All data are from CASES-99 except sounding numbers 47-52 in a
Another sounding that has a large error in h_Est is sounding 22 shown in Fig. 6a, where h_Tur is much higher than the estimates from all three profile methods. This sounding was taken on 21 October around 0615 UTC by the King Air aircraft. The sounding profiles (not shown) reveal a layer of constant potential temperature and constant specific humidity above the near-surface stable layer. Weak turbulence is also present in this layer. Judging from the time of the measurements, it is likely that the constant potential temperature layer above the SBL is the residual daytime mixed layer with dissipating turbulence. In this case, the turbulence method gives the top of the residual layer, while the TGRD and WDS methods give h_Est associated with the SBL. This indicates some arbitrariness in the determination of ABL height.
We tested the criteria derived from CASES-99 soundings to identify h in some aircraft soundings made by the NCAR Electra in the BOREAS SBL. The results are shown in Fig. 6a to the right of the vertical dashed line, where it is clear that the TGRD method results in consistently good estimates of h that are closest to h_Tur. For the BOREAS SBL cases, the WDS method gives rather inconsistent results, while the Ri method significantly underestimates h. Testing of the detection criteria for the BOREAS SBL again suggests that the TGRD method is the most reliable among the three methods, consistent with the results from the CASES-99 soundings.

CBL Over Land
Measurements from the BOREAS experiment were mostly made in clear or partly cloudy CBLs over forest. Figure 7a, b shows the detection rates for different values of the detection criterion for both the TGRD and Ri methods. The tested detection criteria range from 0.006 to 0.03 K m−1 for the TGRD method and 0.25 to 3 for the Ri method. Because of the large range of h in the cases considered here (500 to 3000 m), the error is expressed as a percentage difference,

$$\varepsilon = \frac{h_{Est} - h_{Tur}}{h_{Tur}} \times 100.$$

The best performing detection criterion for the TGRD method is from 0.01 to 0.018 K m−1. Using a detection criterion of 0.013 K m−1 for the TGRD method, we found that about 58 % of h_Est values had <10 % error and about 76 % had <20 % error. Results in Fig. 7a, b also suggest that the Ri method fails to detect an h value, as the detection rate is very low at small ε for all tested detection criteria. The Ri method is thus not recommended for application to the CBL. Figure 7c-e gives more information on the comparison between h_Est and h_Tur for the TGRD method only, where the estimates were obtained using a gradient criterion of 0.013 K m−1. In general, h_TGRD follows h_Tur fairly well, with a correlation coefficient of 0.93 for all 73 sounding profiles used in this study. Both the scatter plot and the relative error histogram (Fig. 7d, e) suggest h may be underestimated, with a median difference of about 31 m, although there are a few soundings that substantially overestimate h, resulting in significant standard deviations. The mean absolute relative error, ε, is about 0.14 (14 %). Given a 1-km deep CBL, this translates to an estimated error of 140 m.
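The relative-error statistics used for the CBL can be illustrated with a short sketch. It assumes the relative error is the signed difference between h_Est and h_Tur normalized by h_Tur and expressed in percent. The function names and the sample heights are hypothetical and are not values from BOREAS.

```python
def relative_error_percent(h_est, h_tur):
    """Signed relative error of one height estimate, in percent."""
    return (h_est - h_tur) / h_tur * 100.0


def mean_abs_relative_error(h_est_list, h_tur_list):
    """Mean of |relative error| over all soundings, in percent."""
    errors = [
        abs(relative_error_percent(est, tur))
        for est, tur in zip(h_est_list, h_tur_list)
    ]
    return sum(errors) / len(errors)


# Hypothetical CBL heights in metres (not real data).
h_tur = [1000.0, 1500.0, 2500.0]
h_est = [900.0, 1650.0, 2000.0]

print([relative_error_percent(e, t) for e, t in zip(h_est, h_tur)])
print(mean_abs_relative_error(h_est, h_tur))

# Converting a fractional error back to metres for a given CBL depth:
# a mean absolute relative error of 0.14 on a 1000-m deep CBL.
print(0.14 * 1000.0)
```

Normalizing by h_Tur keeps a deep CBL from dominating the statistics, which is the reason given in the text for using a percentage error when h ranges from 500 to 3000 m.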
We note that the TGRD method failed to detect h for three soundings (Fig. 7c) because the detection criterion was not met. If we use a larger detection criterion of 0.015 K m −1 , the TGRD method fails to find an h value for six soundings (about 8 % of the soundings). This is a direct result of the small gradient at the CBL top and perhaps the CBL not being well-mixed. Hence, the TGRD method may not always be able to define h.
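The percentage error and detection rate used above can be illustrated with a minimal sketch. The function names, the choice to count a sounding with no detected height as a miss, and the sample heights are assumptions made for illustration; they are not taken from the study.

```python
def relative_error_percent(h_est, h_tur):
    """Signed relative error (%) of an estimated ABL height against h_Tur."""
    return (h_est - h_tur) / h_tur * 100.0


def detection_rate(h_est_list, h_tur_list, max_error_percent):
    """Percentage of soundings whose |relative error| is within the tolerance.

    A sounding for which the method found no height is passed as None and
    is counted as a miss (an assumption of this sketch).
    """
    hits = 0
    for h_est, h_tur in zip(h_est_list, h_tur_list):
        if h_est is None:
            continue  # criterion never met: no estimate for this sounding
        if abs(relative_error_percent(h_est, h_tur)) <= max_error_percent:
            hits += 1
    return 100.0 * hits / len(h_tur_list)


# Made-up heights in metres, for illustration only (not data from the study).
h_tur = [1000.0, 1500.0, 800.0, 2000.0]
h_est = [950.0, 1750.0, None, 2100.0]

rate_10 = detection_rate(h_est, h_tur, 10.0)  # share of soundings within 10 %
rate_20 = detection_rate(h_est, h_tur, 20.0)  # share of soundings within 20 %
print(rate_10, rate_20)
```

Whether failed detections belong in the denominator is a design choice; excluding them would raise the rate for criteria that often fail to trigger.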

CBL Over Tropical Oceans
Detecting h in the CBL over the tropical ocean has more uncertainty than over land because the characteristics of the capping inversion are often even less well-defined. Figure 8 shows a typical sounding from PASE to illustrate the issues. Here, the turbulent boundary layer extends to about 685 m above the surface. The magnitude of the turbulent fluctuations is similar to the BOREAS CBL (Fig. 2). The air in most of this turbulent layer is well mixed as seen in θ v except in the top 100 m; there is a substantial gradient in specific humidity (q) throughout the lower layer. This was also observed in the CBL over land as well as over the ocean because of entrainment of dry air aloft into a moist CBL (Mahrt 1976; Betts 1982). The gradient of θ v in the upper part of the mixed layer is only slightly > 0 with only a few values as large as about 0.02 K m −1 , suggesting weak stable stratification. This weak stable layer may contain layers of intermittent turbulence as seen between 2,500 and 3,000 m in Fig. 8, which is likely a result of the mean wind shear at this level. This shear-generated turbulence is consistent with the small Ri in the layer between 2,500 and 3,000 m. Other soundings from PASE show thin layers of turbulence in the weakly stably stratified layer.

Fig. 7 Evaluation of ABL heights from both the TGRD and Ri methods for the CBL cases in BOREAS. a-b ABL height detection rate, η, as a function of the relative error, ε, for different values of the detection criterion shown in the figure legend in units of K(100 m) −1 ; c ABL heights detected from TGRD using the optimal gradient criterion of 1.3 K(100 m) −1 and turbulence methods; d Scatter plots of ABL height from the TGRD method versus those from turbulence method; and e Histogram of the errors in the detected ABL height using TGRD method. The detection criterion used in c-e is 1.3 K(100 m) −1

Fig. 8 Same as Fig. 1, except for a CBL sounding from PASE. g q, specific humidity in g kg −1 ; this is a descent on 25 August 2007 from 2343 to 2355 UTC
Detection criteria ranging from 0.003 to 0.01 K m −1 were used for all 68 soundings from 16 flights during PASE, and the results are shown in Fig. 9a. Again, the θ v data were smoothed with a running mean of 20 m in order to obtain a reasonable gradient, although the smoothed gradient still varies significantly with height as shown in Fig. 8b. Results in Fig. 9a suggest the optimal range for the detection criterion is from 0.006 to 0.008 K m −1 . Detection criteria within this range gave similar results, with a detection rate of about 43 % at a relative error of less than 10 %. At 20 % error, 0.007 and 0.008 K m −1 seem to give better results with a 60 % detection rate. Compared to the CBL over land and the SBL, h is detected rather poorly for the CBL over the tropical ocean with a correlation coefficient of only 0.37 for the TGRD method. This poor correlation is clearly shown in Fig. 9c for each individual sounding.
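The smoothing and thresholding steps described above can be sketched as follows. This is only an illustration under stated assumptions: the rule "take the lowest level at which the smoothed ∂θ v /∂z first reaches the criterion" is assumed here (the method itself is defined in Sect. 3), and the function names and the idealized profile are hypothetical.

```python
def running_mean(values, heights, window_m):
    """Centred running mean over a height window of window_m metres."""
    half = window_m / 2.0
    smoothed = []
    for z in heights:
        inside = [v for v, zz in zip(values, heights) if abs(zz - z) <= half]
        smoothed.append(sum(inside) / len(inside))
    return smoothed


def tgrd_height(heights, theta_v, criterion, window_m=20.0):
    """Lowest height (m) where the smoothed d(theta_v)/dz reaches `criterion`.

    Returns None when the criterion is never met. The 'lowest level first
    reaching the threshold' rule is an assumption of this sketch.
    """
    smooth = running_mean(theta_v, heights, window_m)
    for i in range(len(heights) - 1):
        dz = heights[i + 1] - heights[i]
        gradient = (smooth[i + 1] - smooth[i]) / dz  # K per metre
        if gradient >= criterion:
            return 0.5 * (heights[i] + heights[i + 1])
    return None


# Idealized profile (not study data): well mixed below 600 m,
# then theta_v increasing at 0.01 K per metre above.
heights = [5.0 * i for i in range(201)]
theta_v = [300.0 + 0.01 * max(0.0, z - 600.0) for z in heights]

h_found = tgrd_height(heights, theta_v, criterion=0.007)
h_none = tgrd_height(heights, theta_v, criterion=0.02)  # never reached
print(h_found, h_none)
```

The second call shows the failure mode noted for the CBL over land: when the capping gradient is weaker than the chosen criterion, no height is returned.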
Similar to the CBL over land, the Ri method performs poorly for the PASE cases. Figure 9b shows that with a relative error of 10 or 20 % the detection rate is less than 10 % for all values of the detection criterion. We, therefore, do not recommend this method for the CBL over the tropical ocean.

Fig. 9 Same as in Fig. 7, except for the CBL over the ocean from PASE
The scatter plots and histograms in Fig. 9d, e show significant underestimates for some PASE cases where h Tur is nearly twice as large as h Est . We examined a few soundings where the TGRD method results for h are very different from h Tur . The discrepancy may have resulted from several possible factors, including the presence of significant ∂θ v /∂z values in the CBL due to its not being well-mixed. Because a small detection gradient is used for this type of CBL, any slight stratification in the lower part can trigger the h criterion with the TGRD method. The presence of clouds in the lower atmosphere also complicates the detection of h in the PASE cases. We found that about 40 % of the PASE soundings penetrated shallow layers of fair-weather cumuli below 2 km. Although their liquid water content was small (< 0.2 g m −3 ), these cloud layers are normally turbulent and ∂θ v /∂z within the cloud layer becomes significant (but not large enough to be moist adiabatic). The detection criterion used to define h for this type of CBL is small enough that the TGRD method, based on ∂θ v /∂z, occasionally detects h close to the cloud base or where the gradient has small perturbations. The true ABL height h Tur is also somewhat arbitrary in a few cases due to turbulence within the cloud layer.
For all cases, we tried to identify the top of the lowest layer with continuous turbulence as h. This is not an issue if the cloud layer is significantly higher than the surface-based turbulence layer, since in that case the turbulence associated with the cloud is uncoupled from the turbulent boundary layer below. The top of the surface-based turbulence layer becomes ambiguous when a cloud layer is low and merges with the surface-based turbulence layer. In these cases, it is likely that the TGRD method detects h at cloud base while the turbulence method detects h close to the cloud top. It might be preferable to use variables that are conserved in a moist adiabatic process to detect h in the CTBL, such as the equivalent potential temperature or the liquid water potential temperature, as the vertical gradients in these conserved variables are expected to be close to zero in a well-mixed CTBL. We tested the use of the equivalent potential temperature gradient to detect h for all PASE soundings. The maximum detection rate is about the same as in Fig. 9a, except that the best performing detection criterion for the gradient of equivalent potential temperature is 0.01 K m −1 . The errors resulting from the presence of cloud should be on the order of 200 m or less since most of the cloud layers are less than 200 m deep. It should also be noted that the aircraft soundings are not strictly vertical soundings. Because of the horizontal traverse during the sounding, the aircraft soundings may at times introduce horizontal variation into the vertical variation. This is especially true in the presence of scattered cumulus clouds capping the CBL, which also contributes to part of the uncertainty seen in the PASE results.
In order to validate the applicability of the optimal range (0.006 to 0.008 K m −1 ) of the detection criterion identified from the PASE dataset for other CBL types in a similar environment, we applied the detection criterion to identify h in 26 aircraft soundings over the ocean from the NCAR Electra during TOGA COARE. It is clear from Fig. 10 that for the TGRD method, 0.007 K m −1 gives the best h estimates compared to h Tur (58 % detection rate at 10 % error and 85 % detection rate at 20 % error), with a correlation coefficient of 0.93. These results are comparable to those for the CBL over land; however, h is not detected well in the PASE CBL. This implies that intermittent turbulence in shallow layers of fair-weather cumulus clouds complicates the h detection.

CTBL Over Subtropical Ocean
All the methods of ABL height detection discussed in Sect. 3 can be tested for the CTBL, with the exception of the WDS method where the empirical threshold values were derived for the SBL. The results are shown in Fig. 11 and the detection rates at 5 % and 10 % error are summarized in Table 3; this indicates that among all the ABL types discussed in this study, the highest detection rate is for the TGRDM method, where h Est is estimated in about 90 % of the cases with an error of only 5 %. The CLTP and TGRD methods give similar results that are not as satisfactory as the TGRDM method. Methods involving the maximum gradient of RH (RHGRDM) or RH itself give similar results, while the Ri method yields the least accurate h estimates and is not recommended for use in the subtropical CTBL.

Fig. 10 Evaluation of ABL heights from the TGRD method for the CBL cases in TOGA COARE. a ABL height detection rate, η, as a function of the relative error, ε, for the detection criteria derived from PASE soundings; b ABL heights detected from TGRD and turbulence methods; c Scatter plots of ABL height from the TGRD method vs. those from turbulence method; and d Histogram of the errors in the detected ABL height using TGRD method. The detection criterion used in b-d is 0.7 K(100 m) −1
It should be noted that only minimum smoothing (5 m in height) of the data was done for the CTBL when the vertical gradients were calculated to obtain the results in Fig. 11 and Table 3. The CTBL top has very distinctive characteristics, such as sharp temperature and moisture gradients and an abrupt decrease of the cloud liquid water. This is a major reason for the relative success of h estimates for this type of boundary layer. The correlation coefficients between h Est and h Tur are ≈ 1 for all three selected methods. We also find that the mean differences between h Est and h Tur are < 2.5 m for the TGRDM and RHGRDM methods; however, the CLTP method consistently underestimates h with a mean difference of about 9 m (not shown). This result suggests that turbulence may extend into a thin layer overlying the cloud top, which is likely the result of local wind shear. We point out that measurements or model results at a lower vertical resolution may not be able to resolve the abrupt changes in thermodynamic and cloud properties at the cloud top and hence significantly modify the results in Table 3.

Fig. 11 ABL height detection rate, η, as a function of the relative error, ε, for different values of the detection criterion for the CTBL from POST. a TGRD method, including the TGRDM method; b Methods using cloud boundary or relative humidity; and c Ri method. The q c criterion is in units of g m −3 ; the detection criterion for TGRD is in units of K (100 m) −1
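A maximum-gradient detection such as TGRDM needs no threshold, which is one reason it suits the sharp CTBL top. The following is a minimal sketch, assuming the method simply takes the height of the largest vertical temperature gradient; the function name and the idealized profile are hypothetical and are not data from POST.

```python
def max_gradient_height(heights, values):
    """Return (mid-level height in m, gradient) of the largest vertical gradient."""
    best_height = None
    best_gradient = float("-inf")
    for i in range(len(heights) - 1):
        gradient = (values[i + 1] - values[i]) / (heights[i + 1] - heights[i])
        if gradient > best_gradient:
            best_gradient = gradient
            best_height = 0.5 * (heights[i] + heights[i + 1])
    return best_height, best_gradient


# Idealized cloud-topped profile (illustrative only): constant temperature
# variable below 700 m, an 8 K jump across one 5-m layer, weak stability above.
heights = [5.0 * i for i in range(201)]
theta = [290.0 if z <= 700.0 else 298.0 + 0.005 * (z - 705.0) for z in heights]

h_top, g_top = max_gradient_height(heights, theta)
print(h_top, g_top)
```

With a well-defined inversion the result does not depend on a tuned criterion, but in a profile without a clear jump the largest gradient may sit well away from the turbulent layer top.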

Effects of Vertical Resolution on the Performance of Detection Methods
The analyses in Sects. 4.1-4.4 were done after smoothing the original aircraft soundings using a window length of 20 m (or 5 m for CTBL). These results should also be applicable to profile data with equivalent vertical resolution such as rawinsonde soundings. We also tested the effects of vertical resolution on the selection of the optimal detection criterion. Two other window lengths (50 and 100 m) were chosen for the same analyses as above to find the corresponding optimal detection criteria for these coarse resolution profiles. These two data window lengths were selected to be similar to mesoscale and regional scale model resolutions. Some remote sensing profiling instruments for the ABL, such as wind profiler/RASS systems, also sample with a similar vertical resolution. Results from these tests are given in the last two columns of Table 2.
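The effect of coarser vertical resolution on a fixed gradient criterion can be illustrated with a short sketch. Block averaging is used here as a simple stand-in for the window smoothing applied in the study, and the function names and the idealized inversion are assumptions made for illustration.

```python
def block_average(heights, values, bin_m):
    """Average a profile into consecutive height bins of bin_m metres."""
    bins = {}
    for z, v in zip(heights, values):
        bins.setdefault(int(z // bin_m), []).append((z, v))
    coarse_z, coarse_v = [], []
    for k in sorted(bins):
        zs = [p[0] for p in bins[k]]
        vs = [p[1] for p in bins[k]]
        coarse_z.append(sum(zs) / len(zs))
        coarse_v.append(sum(vs) / len(vs))
    return coarse_z, coarse_v


def max_gradient(heights, values):
    """Largest vertical gradient between adjacent levels (per metre)."""
    return max(
        (values[i + 1] - values[i]) / (heights[i + 1] - heights[i])
        for i in range(len(heights) - 1)
    )


# Idealized 5-m profile (not study data): a 2 K inversion between 700 and 720 m.
heights = [5.0 * i for i in range(200)]
theta = [300.0 + min(2.0, max(0.0, 0.1 * (z - 700.0))) for z in heights]

g20 = max_gradient(*block_average(heights, theta, 20.0))
g50 = max_gradient(*block_average(heights, theta, 50.0))
g100 = max_gradient(*block_average(heights, theta, 100.0))
print(g20, g50, g100)
```

In this idealized case the largest resolvable gradient shrinks as the bins widen, so a criterion tuned at 20-m resolution may no longer be reached at 100 m, which is why the optimal criterion has to be re-derived for each resolution.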
For the SBL, the range of detection criteria changes slightly as the vertical grid resolution becomes coarser. Going from 20-m vertical resolution to 50-m resolution, the detection rates of TGRD and WDS methods decrease significantly, while that of the Ri method decreases only slightly. Therefore, at 50-m resolution, all three methods give similar results and the detection rate is between 66 and 70 %. At 100-m resolution, the detection rate of WDS method falls substantially, followed by the Ri method, while the TGRD method performs somewhat better than the others.
For the mid-latitude CBL over land, the TGRD method gives good ABL height estimations at all vertical resolutions. As the vertical resolution varies from 20 m to 100 m, the detection rate remains approximately the same (88, 87 and 86 %), although the suggested critical value range tends to be narrower. The h estimates are more reliable for the CBL over the tropical ocean at coarser vertical resolution. At 20-m resolution, the TGRD method detects h values 66 % of the time with <30 % error, and this detection rate increases to 78-79 % at 50 m and 100 m because of the larger variation of the vertical gradient of temperature at fine (20 m) vertical resolution.
For the CTBL, on the other hand, multiple methods can provide reasonable results even at low vertical resolution. Table 3 shows a detailed comparison at 5, 20, 50, and 100 m vertical resolution at 5 and 10 % detection error. At high resolutions (5 and 20 m), the TGRDM method detects h values 94 to 95 % of the time with less than 10 % error. This method can still detect h values 71 % of the time at 100-m resolution. The cloud top method fails at low resolution (100 m), although it maintains a relatively high detection rate up to 50-m resolution. A similar fall in performance occurs with the TGRD method while the RH (<90 %) method performs only slightly better (Table 3). Again, the Ri method cannot detect the ABL height with reasonable accuracy at any vertical resolution. The recommended method and the suggested range of detection criteria for the CTBL are summarized in the last two rows of Table 2.

Summary and Conclusion
Four commonly-used methods for estimating ABL height (h), based on temperature, wind, and humidity profiles, were evaluated against the 'true' ABL height (h Tur ) determined from aircraft sounding profiles, where h Tur is obtained using high-rate measurements of turbulent fluctuations that help identify the boundary-layer top as the top of the layer with significant and continuous turbulence. We used aircraft soundings from five major field experiments: CASES-99, BOREAS, PASE, TOGA COARE and POST. These projects provided measurements of the mid-latitude SBL and CBL over land, the CBL over the tropical ocean, and the sub-tropical CTBL over the ocean. The soundings obtained under stable conditions over land from BOREAS and those for well-mixed conditions over the ocean during TOGA COARE were used to validate the applicability of the results derived from the CASES-99 SBL and PASE CBL.
Table footnote: All detection methods are defined in the text. The numbers in parentheses (%) are the best detection rates using the values in the given detection criterion range. The benchmarks used are an error range of 30 m for SBL, a relative error of 30 % for CBL, and 10 % for CTBL

A detection rate, defined as the percentage of correctly-detected h within a given error range, was used to evaluate the performance of each of the h detection methods systematically. Based on the highest detection rate, we identified the optimal detection criteria for all considered ABL types, and examined their dependence on the vertical resolution of the sounding data. These analyses were done after smoothing the original 2- to 5-m resolution aircraft soundings with windows of 20-, 50-, and 100-m resolution. The suggested range of the detection criteria and the resultant detection rate for all three resolutions are summarized in Tables 2 and 3.
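Selecting the optimal criterion from the highest detection rate can be sketched as a simple sweep. The helper names, the treatment of undetected soundings as misses, and all numbers below are hypothetical; the estimates per criterion are assumed to have been computed beforehand by whichever detection method is being evaluated.

```python
def detection_rate(h_est_list, h_tur_list, max_error_percent):
    """Percentage of soundings detected within the given relative error."""
    hits = sum(
        1
        for h_est, h_tur in zip(h_est_list, h_tur_list)
        if h_est is not None
        and abs(h_est - h_tur) / h_tur * 100.0 <= max_error_percent
    )
    return 100.0 * hits / len(h_tur_list)


def best_criterion(estimates_by_criterion, h_tur_list, max_error_percent):
    """Return the criterion with the highest detection rate and all rates."""
    rates = {
        criterion: detection_rate(h_est, h_tur_list, max_error_percent)
        for criterion, h_est in estimates_by_criterion.items()
    }
    return max(rates, key=rates.get), rates


# Made-up values for illustration: turbulence-based heights (m) and the
# heights detected with three candidate gradient criteria (K per metre).
h_tur = [600.0, 700.0, 800.0]
estimates = {
    0.003: [300.0, 350.0, 780.0],  # too small: triggers low in the layer
    0.007: [590.0, 720.0, 790.0],
    0.010: [None, 730.0, None],    # too large: often never reached
}

best, rates = best_criterion(estimates, h_tur, max_error_percent=10.0)
print(best, rates)
```

Reporting a range of criteria rather than a single value, as done in Tables 2 and 3, reflects that neighbouring criteria often give nearly the same detection rate.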
We note that several mesoscale models, such as the Coupled Ocean-Atmospheric Prediction System (COAMPS) and Weather Research and Forecasting (WRF) model, use the Ri method to diagnose h for all boundary-layer types. Our results suggest that the Ri method yields comparable results to other methods only at 100-m vertical resolution with a detection rate of about 57 % for the SBL. However, this method performs poorly for the CBL and the CTBL and thus is not recommended for these ABL types. In particular, for the CTBL, the Ri method has a much smaller detection rate compared to methods using the maximum gradient of temperature or relative humidity even at the coarsest vertical resolution of 100 m.
We examined several types of ABL over several surface types. The vertical thermodynamic and dynamic structure of the layer near the ABL top is closely related to turbulent mixing in the boundary layer, as well as entrainment and the characteristics of the air mass aloft, which, in turn, are related to a variety of factors including surface temperature, synoptic conditions, and local differential advection. Criteria identified from this study may not be applicable for conditions that are significantly different from the cases analyzed here. However, the results from this study provide a general guidance for a few frequently observed ABL types typically over the tropical ocean, mid-latitude land surfaces, and sub-tropical ocean on the west coast of major continents. Using independent datasets in similar regions provides further support for applying our results to a rather broad set of conditions.

Research Program under Grant No. XDA05110104. The aircraft data provided by NCAR/EOL under sponsorship of NSF (http://data.eol.ucar.edu) are gratefully acknowledged. The National Center for Atmospheric Research is sponsored by the National Science Foundation. The authors appreciate the contributions of the anonymous reviewers whose comments and suggestions led to substantial improvements. The first author is grateful to Dr. Robert Bornstein for his interest and encouragement.
Open Access This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.