1 Introduction

The diurnal evolution of the urban boundary layer (UBL) is critical for a complete air-quality assessment. The height of the UBL determines the extent of vertical mixing and, together with the emission of toxic gases and particles and their chemical processes, regulates air-quality variations within the urban dome. Such UBL dynamics are controlled by the surface roughness, land cover and evapotranspiration of the city, thermal mixing and the anthropogenic heat emission (Stull 2012; Oke et al. 2017).

During daytime, convection and turbulence created by surface heating lead to the gradual growth of the UBL, mixing pollutants within the convective mixed layer. It is well known that, within the mixed layer, vertical profiles of potential temperature, water vapour, pollutant concentration, and wind speed and direction tend to be uniform with height (Stull 2012; Oke et al. 2017). At the top of the mixed layer is the entrainment zone, where exchange with the free atmosphere brings cleaner or dirtier air into the UBL depending on regional atmospheric pollution. At night the UBL shrinks because cooling at the surface inhibits vertical mixing, leading to the formation of the nocturnal boundary layer (NBL). Above the NBL lies the residual layer comprising remnants of the UBL from the previous afternoon (Stull 2012; Oke et al. 2017).

As the mixed layer evolves during daytime, primary pollutants from emission sources such as vehicular traffic and industry are diluted within a larger volume of air, leading to cleaner air when photochemical production and the advection of polluted plumes have minor contributions. In contrast, after sunset, air quality deteriorates in the NBL, particularly with strong emissions of pollutants.

Despite the need for accurate knowledge of the diurnal evolution and dynamics of the UBL, continuous monitoring of the UBL is rarely performed. Radiosondes usually launched once or twice per day by meteorological services are often used to retrieve the UBL height (ZΔT). Recent improvements to remote sensing instruments, such as thermodynamic microwave radiometers, radar wind profilers and lidars have allowed continuous monitoring of boundary-layer evolution. Among these instruments, automatic lidars and their commercial version, ceilometers, offer a low-maintenance and low-cost solution to the continuous monitoring of the mixed layer using aerosol backscatter during daytime, as well as the NBL and residual layer at night-time (e.g., Kim et al. 2007; Haman et al. 2012; Pandolfi et al. 2013; García-Franco et al. 2018; Kotthaus and Grimmond 2018b).

Ceilometers provide vertical profiles of aerosol backscatter at high spatial resolution (≈ 15 m) up to 5–8 km and at temporal resolutions of 1–30 s. These continuous observations require reliable post-processing methods to estimate the height of the mixed layer (ZML). Methods of varying complexity have been proposed and evaluated for the urban atmosphere, e.g., Hayden et al. (1997), Menut et al. (1999), Steyn et al. (1999), Seibert et al. (2000), de Haij et al. (2006), Eresmaa et al. (2006, 2012), Sicard et al. (2006), Münkel et al. (2007), Emeis et al. (2007), Compton et al. (2013), Sokół et al. (2014), Lotteraner and Piringer (2016), Kotthaus et al. (2016), Tang et al. (2016), Caicedo et al. (2017) and Kotthaus and Grimmond (2018a). Among the available methods used to retrieve ZML, the gradient method has been widely used because of its relative ease of application. Here, we evaluate five variants of this method and propose an improved simple scheme of the same method based on parameter optimization of empirical coefficients. The parameters need to be adjusted to local environmental characteristics with respect to noise filtering and the relationship between the aerosol mixing layer and temperature inversion in the boundary layer, which has not been documented clearly in previous studies. The performance of these six retrieval methods is tested using a dataset of attenuated aerosol backscatter profiles based on a Vaisala CL-31 ceilometer, and using high temporal resolution radiosondes launched simultaneously in the metropolitan area of Seoul, Korea.

To investigate the relationship between the evolution of the UBL and air pollution, we apply inverse regressions of first order between the retrieved values of ZML and air-quality data from the local monitoring network. Even though Korea has implemented stringent control measures aimed at reducing the domestic emission of pollutants, the rapid industrialization and urbanization of East Asia have to some extent offset such measures, e.g., Lee et al. (2011, 2013), Vellingiri et al. (2015), Chambers et al. (2017) and Tang et al. (2018). To investigate the impact of regional air pollution over Seoul, we use data from a background air-quality monitoring station to determine the height at which local pollution reaches its regional levels for the cases of sulfur dioxide (SO2), carbon monoxide (CO), nitrogen dioxide (NO2), ozone (O3), and particulate matter of diameter smaller than 10 µm (PM10).

2 Methods and Aerosol Backscatter Retrieval Evaluation

The study is based on a 2-month campaign (1 May–28 June 2016) of continuous backscatter profile measurements using a ceilometer installed on the rooftop of a tall building at Yonsei University within the urban core of Seoul (37.57°N, 128.98°E, building height: 30 m, 80 m above sea level). On 29 December 2016, eighteen radiosondes were launched from the same building to test the performance of the retrieval algorithms based on the gradient method described above.

We firstly provide a complete description of the ceilometer measurements and ZML retrieval procedure. Then, the proposed parametrization of empirical coefficients for an improved retrieval algorithm based on the gradient method is introduced step by step, followed by details of the radiosonde measurements used for its evaluation. Results of this evaluation and comparisons against the other five algorithms are discussed, and finally, information on the air-quality data used to investigate the relationship between UBL dynamics and air pollution is provided.

2.1 Ceilometer Measurements

As already indicated, a laser ceilometer (CL-31, Vaisala Inc., Finland) was used to investigate UBL evolution over Seoul. This commercial ceilometer is equipped with an eye-safe 910-nm wavelength indium–gallium–arsenide diode laser and a single coaxial-design lens used for both an emitter and a receiver (firmware version 2.027). It has a maximum detection height of 7.5 km, with spatial and temporal resolutions of 10 m and 16 s, respectively. The backscatter readings were recorded and post-processed using an averaging period of 1 h to obtain ZML data.
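The hourly averaging of the raw profiles can be illustrated with a minimal Python sketch. The function name, the use of seconds since the start of the record as the time coordinate, and the plain-list data layout are assumptions made for illustration only; they do not describe the processing software actually used.

```python
def hourly_mean_profiles(times_s, profiles):
    """Average backscatter profiles level by level within each hour.

    times_s  : time of each profile in seconds since the start of the record
    profiles : list of profiles, each a list of B(z) values on the same levels
    Returns a dict mapping the hour index to the mean profile of that hour.
    """
    sums, counts = {}, {}
    for t, profile in zip(times_s, profiles):
        hour = int(t // 3600)
        if hour not in sums:
            sums[hour] = [0.0] * len(profile)
            counts[hour] = 0
        sums[hour] = [s + v for s, v in zip(sums[hour], profile)]
        counts[hour] += 1
    return {h: [s / counts[h] for s in sums[h]] for h in sums}
```

With a 16-s sampling interval, each hourly mean would combine about 225 raw profiles.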

2.2 Boundary-Layer Height Retrieval Procedure

A ceilometer measures the attenuated backscatter (B(z)) of light due to aerosols, including cloud and fog droplets, and particulate matter where z indicates the zth level of ceilometer data from the bottom (i.e., the actual height is 10z m). Accordingly, higher aerosol concentrations within the UBL create a sharp variation of B(z) across the top of the UBL, enabling a measurement of ZML to be made. Among the available methods used to retrieve ZML from attenuated aerosol backscatter vertical profiles, the gradient method has been widely used because of its ease of application compared to other methods, and accurate results when it is properly adjusted to the conditions of the site and hardware characteristics.

The gradient method defines ZML as the height of the minimum vertical gradient of B(z), which can be determined as the largest negative peak of the first derivative of B(z) (Hayden et al. 1997; Flamant et al. 1997) or by the second derivative (Menut et al. 1999). Similarly, the largest negative gradient in the logarithm of B(z) can also be used to detect the top of the mixed layer (Senff et al. 1996). Emeis et al. (2007) refined the gradient method to detect the inflection point using the derivative of B(z) and five adjustable parameters. Recently, based on Emeis et al. (2007), Kotthaus and Grimmond (2018a) proposed an algorithm to retrieve ZML from a ceilometer by considering cloud cover, type and height to reduce false-layer selection. Here, we extend the gradient method of Emeis et al. (2007) by focusing on parameter optimization for better filtering of ceilometer noise. It is also based on the first and second derivatives, but only three parameters need to be adjusted.
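As an illustration of the simplest variants, the following Python sketch locates the most negative gradient of B(z) and of ln B(z) on a regular height grid. It is a minimal sketch only: the function names are hypothetical, no smoothing or thresholds are applied, and the logarithmic variant assumes strictly positive backscatter.

```python
import math


def central_gradient(values, dz):
    """Centred first derivative at interior levels; end levels are left as None."""
    grad = [None] * len(values)
    for k in range(1, len(values) - 1):
        grad[k] = (values[k + 1] - values[k - 1]) / (2.0 * dz)
    return grad


def height_of_minimum(series, heights):
    """Height at which the series takes its most negative defined value."""
    return min((v, h) for v, h in zip(series, heights) if v is not None)[1]


def zml_first_derivative(backscatter, heights, dz):
    """Height of the largest negative peak of dB/dz."""
    return height_of_minimum(central_gradient(backscatter, dz), heights)


def zml_log_derivative(backscatter, heights, dz):
    """Height of the largest negative peak of d(ln B)/dz; needs B > 0."""
    log_b = [math.log(b) for b in backscatter]
    return height_of_minimum(central_gradient(log_b, dz), heights)
```

For a smooth decrease of B(z) across the layer top, the logarithmic variant tends to place the minimum slightly higher than the first-derivative variant, because the same absolute gradient weighs more where B(z) is smaller.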

In brief, the algorithm developed by Emeis et al. (2007) is as follows. Prior to the determination of the inflection point, the overlap- and range-corrected B(z) is averaged over time and height to suppress noise artefacts. The noise artefacts are produced by ceilometer hardware (e.g., transmitter, electronic noise, and optical noise contaminated by the Sun), firmware and specific processes such as the cosmetic shift that shifts the background signal systematically to help cloud detection (Kotthaus et al. 2016), and so must be filtered effectively to reduce noise contamination of B(z). Then, ZML values are determined, firstly within a vertical-moving average window (wv) from 140 m to 500 m at vertical intervals (∆z) of 80 m over averaging periods (wt) of 15 min, and then, in a layer between 500 m and 2000 m using ∆z = 160 m. The minimum threshold value of B(z) (Bmin) and maximum threshold value of the first derivative (∂B/∂z)max below a lifted inversion must be 200 × 10−9 m−1 sr−1 and < − 0.30 × 10−9 m−2 sr−1 in the lower layer, and 250 × 10−9 m−1 sr−1 and < − 0.60 × 10−9 m−2 sr−1 in the upper layer. The first and second derivatives are calculated as:

$$ \left. {\frac{\partial B}{\partial z}} \right|_{z} = \frac{{B\left( {z + \frac{\Delta z}{2}} \right) - B\left( {z - \frac{\Delta z}{2}} \right)}}{\Delta z}, $$
(1)

and

$$ \left. {\frac{{\partial^{2} B}}{{\partial z^{2} }}} \right|_{z} = \frac{{\left. {\frac{\partial B}{\partial z}} \right|_{{z + \frac{\Delta z}{2}}} - \left. {\frac{\partial B}{\partial z}} \right|_{{z - \frac{\Delta z}{2}}} }}{\Delta z}, $$
(2)

noting that Bmin occurs when ∂2B/∂z2 passes from positive to negative values. If the conditions regarding Bmin and ∂B/∂z stated above are met, the height of Bmin will correspond to ZML. For details, see Emeis et al. (2007, 2008).
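Equations 1 and 2 can be written compactly in Python. The sketch below assumes a profile stored as a list on equally spaced levels and an interval ∆z spanning an even number of levels, so that z ± ∆z/2 falls on a level; the function and argument names are illustrative only.

```python
def first_derivative(b, k, half, level_spacing):
    """Eq. 1 at level index k, with dz = 2 * half * level_spacing."""
    dz = 2 * half * level_spacing
    return (b[k + half] - b[k - half]) / dz


def second_derivative(b, k, half, level_spacing):
    """Eq. 2 at level index k, built from Eq. 1 evaluated at k + half and k - half."""
    dz = 2 * half * level_spacing
    upper = first_derivative(b, k + half, half, level_spacing)
    lower = first_derivative(b, k - half, half, level_spacing)
    return (upper - lower) / dz
```

For the 10-m levels of the ceilometer, half = 4 corresponds to ∆z = 80 m and half = 8 to ∆z = 160 m.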

We propose an optimized version of the Emeis et al. (2007) method (hereafter denoted as the optimized Emeis method) aiming to reduce potential overestimation in ZML and the number of parameters that need to be adjusted together. Adjustable parameters are related to spatial and temporal averages to filter out ceilometer-reported noise and to preserve meaningful signals. Through comparisons against ZML retrieved from radiosonde data during the period 0500–2300 local standard time (LST) on 29 December 2016 (see Sect. 2.3 for details), we optimized the parameters wv, wt and ∆z. The parameters were optimized for daytime and night-time separately based on times of sunrise and sunset, and these optimized parameters were evaluated using radiosonde observations of May 2016. Emeis et al. (2007) use Bmin and (∂B/∂z)max to determine ZML, while the optimized Emeis method estimates Bmin from the ceilometer itself using the averaged B(z) in the free atmosphere during clear night-time conditions to remove erratic changes in B(z). This background correction is implemented to deal with the signal-to-noise ratio of B(z), which is low at high altitudes (making B(z) < 0), so that the minimum of ∂B/∂z frequently occurs at high altitude, leading to abnormally high ZML values (Kotthaus et al. 2016).

The proposed method that retrieves ZML from the attenuated aerosol backscatter vertical profiles can be summarized in four main steps as follows:

  1. Vertical and temporal moving averages, wv and wt, are applied to minimize noise artefacts when computing the first and second derivatives according to Eqs. 1 and 2. These parameters are optimized for daytime and night-time separately, considering only one layer and not two as in Emeis et al. (2007), thereby reducing requirements to one set of parameters per retrieval.

  2. Inflection points are detected where ∂B/∂z is negative and ∂2B/∂z2 passes from non-positive to positive values with increasing height.

  3. The three lowest heights among the inflection points are selected as ZML potential candidates (\( Z_{\text{ML}}^{{\prime }} \)) by

    $$ Z_{\text{ML}}^{\prime} = z \quad \text{if} \quad \left( \begin{array}{l} \left. \frac{\partial B}{\partial z} \right|_{z} < 0, \\ \left. \frac{\partial^{2} B}{\partial z^{2}} \right|_{z - 1} \le 0, \\ \left. \frac{\partial^{2} B}{\partial z^{2}} \right|_{z} > 0, \\ \text{and}\; B\left( z \right) > B_{min} \end{array} \right) $$
    (3)

    The averaged B(z) should be larger than Bmin, after applying the background correction filter suggested by Kotthaus et al. (2016).

  4. ZML is taken as the height of the smallest value of ∂B/∂z among the three inflection points selected as potential candidates (\( Z_{\text{ML}}^{{\prime }} \)). The maximum ∂B/∂z, which was used in Emeis et al. (2007), is therefore not required.

The performance of the optimized Emeis method is evaluated using a set of ZΔT values retrieved from radiosonde profiles and ZML values obtained from the original algorithm of Emeis et al. (2007) and four other versions of the gradient method. The latter include ZML values derived from the first derivative (FIR), second derivative (SEC), logarithmic derivative (LOG), and CL-31 built-in software (Vaisala sky condition, VAI) (Münkel et al. 2007; Vaisala Oyj 2011). Table 1 lists the main characteristics and the parameters to adjust for each of these versions of the gradient method.

Table 1 Algorithms based on the gradient method used to retrieve ZML from attenuated aerosol backscatter vertical profiles measured by a commercial ceilometer

Through a sensitivity analysis, the appropriate values of wv, wt and ∆z are determined for the FIR, SEC, LOG, original Emeis et al. (2007) and optimized Emeis method algorithms, separating daytime and night-time periods. The optimum values were chosen as those for which the root-mean-square error (RMSE) of the derived ZML values against ZΔT values obtained from radiosonde data presents a minimum. The statistical metrics RMSE, mean bias error (MBE) and squared correlation coefficient (r2) are defined as

$$ RMSE = \sqrt {\frac{1}{N}\mathop \sum \limits_{i = 1}^{N} \left( {Z_{{{\text{ML}}, i}} - Z_{{\Delta {\text{T}}, i}} } \right)^{2} } , $$
(4)
$$ MBE = \frac{1}{N}\mathop \sum \limits_{i = 1}^{N} \left( {Z_{{{\text{ML}}, i}} - Z_{{\Delta {\text{T}}, i}} } \right), $$
(5)

and

$$ r^{2} = \left[ {\frac{{\mathop \sum \nolimits_{i = 1}^{N} \left( {Z_{{{\text{ML}}, i}} - \overline{{Z_{\text{ML}} }} } \right)\left( {Z_{{\Delta {\text{T}}, i}} - \overline{{Z_{{\Delta {\text{T}}}} }} } \right)}}{{\sqrt {\mathop \sum \nolimits_{i = 1}^{N} \left( {Z_{{{\text{ML}}, i}} - \overline{{Z_{\text{ML}} }} } \right)^{2} \mathop \sum \nolimits_{i = 1}^{N} \left( {Z_{{\Delta {\text{T}}, i}} - \overline{{Z_{{\Delta {\text{T}}}} }} } \right)^{2} } }}} \right]^{2} , $$
(6)

where N is the number of data points, subscript i is the ith datum, and an overbar denotes the average of a variable. These parameters could not be optimized for the VAI algorithm, and so the parameters proposed by Münkel et al. (2007), included in the manufacturer’s software, were used instead. It is important to note that the suggested values for these parameters depend not only on boundary-layer properties, such as temperature gradient and aerosol concentration, but also on ceilometer characteristics (e.g., firmware, hardware, and raw data acquisition setting), thus showing spatio–temporal variations. The optimized parameters were tested in another season, May 2016. Section 3.1 discusses the performance of the six different retrieval methods using as reference ZΔT values derived from radiosondes.
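For reference, the three metrics of Eqs. 4–6 can be computed with a few lines of Python. The sketch uses plain lists and hypothetical function names; MBE is taken here as the mean of the differences, in line with its name, and r2 as the squared Pearson correlation coefficient.

```python
import math


def rmse(estimated, reference):
    """Root-mean-square error between retrieved and reference heights (Eq. 4)."""
    n = len(estimated)
    return math.sqrt(sum((e - r) ** 2 for e, r in zip(estimated, reference)) / n)


def mbe(estimated, reference):
    """Mean bias error; positive values indicate overestimation (Eq. 5)."""
    return sum(e - r for e, r in zip(estimated, reference)) / len(estimated)


def r_squared(estimated, reference):
    """Squared Pearson correlation coefficient (Eq. 6)."""
    n = len(estimated)
    mean_e, mean_r = sum(estimated) / n, sum(reference) / n
    cov = sum((e - mean_e) * (r - mean_r) for e, r in zip(estimated, reference))
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    return cov ** 2 / (var_e * var_r)
```

A retrieval with a constant offset from the reference has r2 = 1 but non-zero RMSE and MBE, which is why the three metrics are reported together.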

2.3 Radiosonde Measurements

A series of air temperature and wind profiles derived from radiosondes (RS41-SG, Vaisala, Finland) are used to evaluate the performances of the ZML retrieval algorithms. Radiosonde pressure measurements have a resolution of 0.01 hPa and an accuracy of 0.3–1.0 hPa, temperature has a resolution of 0.1 °C, an accuracy of 0.1 °C to 0.4 °C, and a response time of 0.5 s, humidity has a resolution of 0.1%, an accuracy of 2% to 4%, and a response time of < 0.3 s at 20 °C. Wind speed and direction have resolutions of 0.1 m s−1 and 0.1° and accuracies of 0.15 m s−1 and 2°, respectively. The averaged ascent rate of the radiosonde was ≈ 2 m s−1 giving a high vertical resolution.

Eighteen radiosondes were launched on 29 December 2016 from the rooftop of the building on which the CL-31 ceilometer was located, at intervals of 1 h starting at 0500 LST and ending at 2300 LST. Sky conditions were clear during the radiosonde observations, except for the first launch at 0500 LST when the sky was partially overcast. ZΔT values derived from the radiosonde data were determined as the height of the maximum potential temperature gradient during daytime (Seibert et al. 2000; Seidel et al. 2010) and as the top of the lowest inversion layer during night-time (Seidel et al. 2010). The thermodynamic profiles shown in Fig. 1 depict representative profiles of potential temperature, the vertical gradient of potential temperature and wind speed, during daytime and night-time.
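The daytime criterion can be illustrated with a short Python sketch. It assumes the standard definition of potential temperature with a reference pressure of 1000 hPa and an exponent of about 0.286 for dry air, and it reports the mid-point of the pair of adjacent levels with the largest gradient; the function names are hypothetical and the night-time criterion is not covered.

```python
def potential_temperature(temp_k, pressure_hpa):
    """Potential temperature (K) from temperature (K) and pressure (hPa)."""
    return temp_k * (1000.0 / pressure_hpa) ** 0.286


def daytime_z_delta_t(heights, theta):
    """Mid-level height of the largest potential temperature gradient."""
    best_grad, best_height = None, None
    for k in range(len(heights) - 1):
        grad = (theta[k + 1] - theta[k]) / (heights[k + 1] - heights[k])
        if best_grad is None or grad > best_grad:
            best_grad = grad
            best_height = 0.5 * (heights[k] + heights[k + 1])
    return best_height
```

In practice the sounding would usually be smoothed or interpolated to regular levels first, since sensor noise in a high-resolution profile can produce spurious gradient maxima.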

Fig. 1 Representative thermodynamic profiles at day and night retrieved from radiosondes launched on 29 December 2016 at 1300 LST and 2300 LST, respectively. Profiles of potential temperature, vertical gradient of potential temperature (dϴ/dz), and wind speed are depicted in red, green, and black, respectively. The retrieved ZΔT is indicated by a red dotted line

For further verification, the algorithms including the adjusted parameters were tested against ZΔT values retrieved from radiosonde observations conducted during the Korea–United States Air Quality (KORUS-AQ) field campaign (Tang et al. 2018). This was an intensive campaign in which four radiosondes were launched during the daytime for four days of May 2016 at the Olympic Park in Seoul.

2.4 Air-Quality Data

Air-quality data from 38 monitoring stations across the metropolitan area of Seoul and the national background monitoring station of Baengnyeong Island were used to investigate the relationship between the evolution of the UBL and air pollution at ambient level. The air-quality data included concentrations of SO2, CO, NO2, O3 and PM10. The air-quality monitoring station of Baengnyeong Island is located 200 km to the west of Seoul and 250 km from the Shandong Peninsula, China. No urban settlement or large emission source affects Baengnyeong Island, and therefore we assume its air quality is representative of the regional background.

All monitoring stations have sampling rates of 5 min with averaging periods of 1 h. Measurements of SO2, CO, NO2, O3, and PM10 were based on pulse ultraviolet fluorescence (SA-731, KIMOTO, Japan), non-dispersive infrared (CA-751, KIMOTO, Japan), chemiluminescent (NA-721, KIMOTO, Japan), ultraviolet photometric (OA-781, KIMOTO, Japan) and β-ray absorption (PM-711, KIMOTO, Japan) methods, respectively. Quality assurance was applied based on guidelines for the air-quality monitoring network of the Korean Ministry of Environment (KME 2016).

3 Results

We present diurnal variations of ZML retrieved from the aerosol backscatter measurements during the 2-month campaign and companion air-quality data. The latter are used to investigate the dilution effect on air pollution within the city as the mixed layer evolves during daytime through the application of inverse regressions of first order between ZML and pollutant concentrations.

3.1 Evaluation of Six Gradient Methods Used to Retrieve Ceilometer Mixed-Layer Height

The optimal wv and wt values as a function of ∆z were determined as those giving the smallest RMSE between the ZML values estimated by each adjustable algorithm (the optimized Emeis method and the FIR, SEC, LOG and original Emeis et al. (2007) retrieval algorithms) and the ZΔT values obtained from the set of radiosonde profiles. The parameters in the VAI algorithm provided by the ceilometer manufacturer could not be adjusted. Table 2 shows the values of the adjusted parameters and Table 3 the respective statistical metrics (RMSE, MBE, r2 for ZML) used to evaluate the algorithms’ performance. As expected, the optimized Emeis method showed the best performance during both daytime and night-time. During daytime the adjusted algorithms (i.e., FIR, LOG, and original Emeis et al. 2007) and the VAI algorithm also showed good performance, but not during night-time. The FIR, LOG, and original Emeis et al. (2007) algorithms improved their performance once the adjusted parameters were introduced. The SEC algorithm based on the second derivative showed a consistent underestimation in ZML during daytime and early morning, while the adjusted version of the original Emeis et al. (2007) algorithm showed overestimations during the complete diurnal course, yielding MBE = 372 (1137) m during daytime (night-time).

Table 2 Adjusted parameters obtained for each retrieval algorithm evaluated against the radiosonde observations of 29 December 2016
Table 3 Statistical metrics obtained for each adjusted retrieval algorithm evaluated against the two verification sets of radiosonde observations
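The parameter adjustment amounts to an exhaustive search for the combination with the smallest RMSE against the radiosonde heights. A minimal Python sketch is given here under stated assumptions: `retrieve` stands for any retrieval function that accepts a profile plus keyword parameters, and the parameter names and candidate values passed in `grid` are placeholders, not the values of Table 2.

```python
import itertools
import math


def grid_search(retrieve, profiles, reference, grid):
    """Return (rmse, parameters) for the parameter set with the smallest RMSE.

    retrieve  : callable(profile, **parameters) -> retrieved height
    profiles  : one backscatter profile per radiosonde launch
    reference : radiosonde-derived heights, same order as `profiles`
    grid      : dict mapping parameter names to lists of candidate values
    """
    best = None
    for values in itertools.product(*grid.values()):
        params = dict(zip(grid.keys(), values))
        estimated = [retrieve(p, **params) for p in profiles]
        err = math.sqrt(
            sum((e - r) ** 2 for e, r in zip(estimated, reference)) / len(reference)
        )
        if best is None or err < best[0]:
            best = (err, params)
    return best
```

Running the search separately on the daytime and night-time launches would give one parameter set per period.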

Daytime convective conditions are favorable for slowly-varying large eddies compared to night-time conditions (Stull 2012). The moving average operates as a low-pass filter and its frequency response is a sine cardinal function as

$$ {\text{sinc}}\left( x \right) \equiv \frac{\sin \left( x \right)}{x}, $$
(7)

where x is frequency (see Finnigan 2006). From this perspective, the larger wv values during daytime remove contamination from high-frequency noise. At night, the estimation of ZML becomes more uncertain since smaller wv values are needed to distinguish smaller boundary-layer structures from high-frequency noise. A larger averaging time compensates for uncertainties due to smaller wv and suggests that the boundary layer does not change abruptly through the night. During daytime the entrainment zone was relatively larger and produced larger ∆z and wv values owing to the turbulent mixing above the temperature inversion arising from the presence of strong updrafts and entrainment with the overlying free atmosphere.
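The low-pass behaviour of the moving average can be checked numerically. The sketch assumes a continuous running mean of length `window` applied to a sinusoid of a given wavelength, for which the amplitude response is sinc(x) with x = π × window / wavelength; this interpretation of x in Eq. 7 and the function names are assumptions made for illustration.

```python
import math


def sinc(x):
    """Sine cardinal function of Eq. 7, with the limit value 1 at x = 0."""
    return 1.0 if x == 0 else math.sin(x) / x


def running_mean_response(wavelength, window):
    """Amplitude response of a running mean of length `window` (same units as wavelength)."""
    return sinc(math.pi * window / wavelength)
```

Structures with a wavelength equal to the window are removed entirely, while structures much longer than the window pass almost unchanged, which is why a larger wv suppresses more small-scale noise at the cost of vertical detail.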

Figure 2 shows the evolution of ZΔT obtained from the radiosonde data and six different retrieval algorithms for the evaluated period of 18 h. The radiosonde retrievals exhibit the typical diurnal variation of ZΔT (Stull 2012), with growth of the convective mixed layer commencing after sunrise at 0900 LST, and reaching a maximum of 1270 m at noon, and heights over 1000 m during the afternoon. After 1600 LST the mixed layer starts to collapse, and by 1800 LST a NBL of depth ≈ 200 m has been formed. The intense nocturnal release of heat stored by the urban fabric during daytime (Hong and Hong 2016) is apparently capable of maintaining such a depth throughout the rest of the night.

Fig. 2 Hourly ZΔT and ZML values retrieved on 29 December 2016 from thermodynamic radiosonde profiles (RAD) and attenuated aerosol backscatter vertical profiles measured by a commercial ceilometer in combination with the retrieval algorithms (see Table 1 for the definitions) evaluated in this study based on the gradient method, respectively

Notably, the improved version of the original gradient method of Emeis et al. (2007) with the parameter optimization tuned to the local conditions, the optimized Emeis method, was able to reproduce the observed diurnal variation of the mixed layer over the urban region. During daytime, no major difference was observed against the ZΔT values retrieved from radiosonde profiles. The morning growth and evening collapse of the mixed layer were well reproduced. At night-time, the presence of inflection points in the vertical profile of B(z) caused by leftover constituents from the daytime UBL might yield a minimum gradient in B(z) within the residual layer and mislead the ZML estimation. As pointed out previously, this issue prevents a reliable ZML retrieval during the whole diurnal course.

Indeed, during night-time, especially after 1800 LST, the optimized Emeis method underestimated the NBL height, despite the parameters’ adjustment. In the early morning (i.e., before 0800 LST) this disagreement was not clear since two of three retrievals were overestimated as a probable consequence of the differences in the methods to determine ZML and ZΔT values when using ceilometer and radiosonde data, as discussed in Caicedo et al. (2017). However, the optimized Emeis method provided more reliable ZML estimations, suggesting that the optimization of parameters is fundamental to obtaining accurate ZML retrievals. The radiosondes launched in May 2016 were used for further verification and confirmed that parameter adjustment is necessary to reproduce the daytime boundary-layer height (Table 3).

The improved performance shown by the optimized Emeis method is explained by (i) the combination of first and second derivatives of B(z) to select inflection points in the attenuated aerosol backscatter vertical profile, thus reducing the number of parameters to adjust and lessening the overestimation yielded by the original algorithm of Emeis et al. (2007); (ii) the Bmin value based on the averaged background B(z) during night-time in the free troposphere under clear conditions, which helps remove erratic changes in B(z); and (iii) the adjustment of the parameters using direct measurements of ZΔT obtained from local radiosonde observations.

3.2 Diurnal Variation in the Mixed-Layer Height

Figure 3 shows the time series of ZML retrieved throughout the complete campaign using the optimized Emeis method, and Fig. 4 shows the ensemble diurnal variations of the mixed-layer evolution for each monitored month. The daily profiles during both months are similar to those obtained from the radiosonde thermodynamic profiles shown in Fig. 2. The mean (range) of ZML during daytime (0500–2000 LST) in May and June is 1007 (838–1149) m and 1112 (966–1276) m, respectively. Similarly, maximum heights of 1342 (1194–1628) m and 1391 (1303–1582) m are reached at 1400 LST in May and June. The convective mixed layer starts to collapse consistently after 1700 LST, and by 1900 LST it has completely collapsed, giving place to a shallow NBL of 160 (298–570) m and 390 (233–649) m. The minimum ZML value is 90 (90–166) m and 100 (90–112) m at 2000 LST in May and at 1900 LST in June, respectively.

Fig. 3 Time series of ZML retrieved using the improved gradient method proposed here, the optimized Emeis method, for the complete 2-month campaign (May–June 2016). Precipitation events are indicated by blue bars

Fig. 4 Box plots of ZML retrieved by the improved gradient method proposed here, the optimized Emeis method, for each monitored month. The horizontal lines and black dots within each box are median and mean hourly values, respectively. The top and bottom of the boxes represent the 75th and 25th percentiles, while the whiskers extend to the 10th and 90th percentiles, respectively

On six (four) days the mixed layer reached heights over 1500 m in May (June). Similarly, 45 (79)% of the days registered maximum ZML > 1250 m, 45 (21)% between 1000 and 1250 m, and 10 (0)% < 1000 m. Regarding the NBL height, 41 (21)% of the days presented depths < 200 m, 36 (36)% between 200 and 300 m, and 23 (43)% > 300 m. Although a shallower convective mixed layer and shallower NBL were reported in May than in June, the differences between the two months were not statistically significant (p > 0.05). The slightly smaller ZML values and larger day-to-day variability shown by the vertical bars in Fig. 4 were due to more frequent precipitation events in May.

It is well known that ZML-retrieval algorithms based on the gradient method do not provide reliable values during and after periods affected by rain (e.g., Haman et al. 2012; Caicedo et al. 2017; Kotthaus and Grimmond 2018a). Our retrievals also showed suspicious variations in ZML during rainfall, noting that 12 precipitation events occurred during the campaign and the ceilometer measured very large B(z) values as a consequence of raindrop interference along the optical path. In general, clouds produce significant changes in B(z), but boundary-layer clouds can be resolved in B(z). Our proposed method defines the cloud base as the boundary-layer top. However, in the case of clouds within the boundary layer, typical properties of the boundary layer cannot be clearly defined (Stull 2012). Accordingly, our analysis of air pollutants in the next section focuses on clear daytime conditions.

On two occasions, the convective mixed layer did not collapse rapidly after sunset according to our retrievals; instead, it continued evolving until the early morning. This was a failure of the optimized Emeis method to retrieve ZML at night-time. On those days, an abrupt collapse of the convective mixed layer detached the residual layer from the surface, forming a pool of polluted air rich in particles. This misled the retrieval algorithm, and further work is needed to differentiate the top of the NBL from the residual layer in such conditions.

3.3 Air-Quality Diurnal Variations

Figure 5 shows the monthly ensemble diurnal patterns of ambient concentration for the five criteria pollutants monitored at the background station of Baengnyeong Island and across Seoul’s metropolitan area that are used to investigate the relationship between local air quality and ZML. We can divide these five pollutants into three categories according to their diurnal pattern characteristics. SO2, CO, and NO2 form a first category (Fig. 5) of primary emitted pollutants whose ambient concentrations closely follow the diurnal pattern of anthropogenic activities within urban areas. They show the typical morning and evening rush-hour peaks caused primarily by traffic emissions. The evolution of the convective mixed layer throughout the day and the photochemical activity determine the size of these peaks. As expected, the concentrations of these pollutants at the background site are relatively constant throughout the day and smaller than those observed within Seoul.

Fig. 5 Ensemble diurnal patterns of ambient concentrations of SO2, CO, NO2, O3, and PM10 within the metropolitan area of Seoul (box plot) and at the background monitoring station of Baengnyeong Island (red line) during May (left) and June (right) 2016. Seoul’s concentrations include data from 38 stations across the city. The horizontal lines and triangles within each box are median and mean hourly values, respectively. The top and bottom of the boxes represent the 75th and 25th percentiles, whiskers extend to the 10th and 90th percentiles, while the black dots indicate the 5th and 95th percentiles, respectively

PM10 forms a second category of pollutants, noting that its ambient concentrations do not present a bimodal diurnal profile as with pollutants of the first category. The difference between the background and urban concentrations is small or negligible, but during episodes associated with westerly winds the concentrations at Baengnyeong Island are on occasions significantly higher. A cross-correlation analysis between concentrations at both locations shows a consistent time lag of ≈ 7 h, particularly during high concentration events (not shown). This result suggests that an important fraction of PM10 over Seoul has a long-distance origin, decreasing the relevance of locally-emitted particles (Lee et al. 2011, 2013).
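A lagged cross-correlation of this kind can be sketched in Python as follows. The sketch assumes two gap-free hourly series of equal length and searches only non-negative lags, i.e., the urban series following the background series; the function names are hypothetical and no significance test is included.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def best_lag(background, urban, max_lag):
    """Return (lag, r) maximizing the correlation of urban[t + lag] with background[t]."""
    best = None
    for lag in range(max_lag + 1):
        r = pearson(background[: len(background) - lag], urban[lag:])
        if best is None or r > best[1]:
            best = (lag, r)
    return best
```

With hourly data, the returned lag is directly in hours.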

The third category covers pollutants of secondary origin, such as O3 in our case. The diurnal variations of O3 in Seoul, like in many other cities, are inversely related to those of many primary emitted pollutants. In general, solar radiation triggers its formation in the presence of two major groups of precursor species, volatile organic compounds (VOCs) and nitrogen oxides (NOx = NO + NO2) (see Vellingiri et al. (2015) and Iqbal et al. (2014) for detailed information about Seoul’s O3 pollution). Thus, the buildup of O3 starts after sunrise and reaches its peak at ≈ 1500 LST, decreasing gradually throughout the rest of the day. At night-time the O3 removal by titration is enhanced by fresh emissions of NO that accumulate in the shallow NBL. Although ZML is still large at ≈ 1600 LST and the dilution effect is still strong, the maximum O3 concentration is usually recorded at this time. The hour of the O3 peak can be explained as follows: (i) it is the time when the photochemical O3 formation overwhelms the dilution effect, and (ii) the mixed layer reaches its maximum depth, enhancing the entrainment of polluted air originating elsewhere. Unlike SO2, CO, and NO2, the background concentration of O3 is higher than in Seoul most of the time as shown in Fig. 5 and reported in literature (e.g. Zhang and Rao 1999; Schäfer et al. 2006; Shiu et al. 2007).

3.4 Relationship Between Mixed-Layer Height and Air Pollution

To investigate the dilution effect on air pollution caused by the evolution of the convective mixed layer during daytime over the metropolitan area of Seoul, we applied a simplified two-layer mixing model between the mixed layer and the overlying free atmosphere. This model returns inverse relationships of first order between the observed concentrations of pollutants and ZML, under the assumptions that (i) concentrations reported by the air-quality monitoring network are representative of the pollution within the entire mixed layer across the city, (ii) the entrainment process between the mixed layer and free atmosphere occurs over a relatively short period, such that changes in ZML and pollutant concentration as a function of time are negligible, (iii) the contribution of pollutants by advection can be neglected, (iv) emissions and deposition can also be omitted, and (v) photochemical activity is much less important than the stretching of the mixed layer.
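One way to see why these assumptions lead to an inverse first-order relationship is a simple column mass balance. The derivation below is a sketch rather than the exact formulation used here, and the symbols C_1, Z_1, C_FA, a and b are introduced only for illustration: C_1 is the mixed-layer concentration when the layer has depth Z_1, and C_FA is a free-atmosphere concentration assumed to be constant.

```latex
% Growth of the mixed layer from Z_1 to Z_{ML} entrains a slab of free-atmosphere
% air of thickness (Z_{ML} - Z_1). With no emissions, deposition, advection or
% chemistry, the column burden is conserved:
\begin{align}
  C \, Z_{ML} &= C_1 Z_1 + C_{FA} \left( Z_{ML} - Z_1 \right), \\
  C &= C_{FA} + \frac{\left( C_1 - C_{FA} \right) Z_1}{Z_{ML}}
     \;=\; a + \frac{b}{Z_{ML}},
  \qquad a = C_{FA}, \quad b = \left( C_1 - C_{FA} \right) Z_1 .
\end{align}
```

Under this sketch, b > 0 when the mixed layer is more polluted than the air aloft, so concentrations fall as ZML grows, whereas b < 0 when the entrained air carries the higher concentration, so concentrations rise with ZML. In both cases C tends towards a = C_FA as ZML becomes large.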

Figure 6 shows the inverse regressions of first order for each of the pollutants analyzed here. The two-layer mixing model does indeed capture the variability in concentrations as a function of ZML, with PM10, CO, SO2, and NO2 showing negative correlations, in contrast to O3. These results are consistent with similar studies over urban regions (Schäfer et al. 2006; Tang et al. 2016; Leng et al. 2016). Correlation coefficients varied from 0.31 for O3 to 0.76 for CO, with the higher correlation for CO explained by its low chemical reactivity and main origin in anthropogenic sources within the city. NO2 and PM10 also showed high correlations of 0.60 and 0.66, respectively. The correlations suggest that the evolution of the mixed layer and the entrainment process with the free atmosphere have an important role in Seoul’s air quality.
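A minimal sketch of how an inverse first-order regression of this kind could be fitted is given below, assuming the functional form C = a + b / ZML and ordinary least squares. The function name fit_inverse_first_order, the coefficient names and the synthetic values are assumptions for illustration and are not the fitting procedure or data of this study.

```python
import numpy as np

def fit_inverse_first_order(z_ml, conc):
    """Least-squares fit of conc = a + b / z_ml.

    Returns (a, b, r), where r is the Pearson correlation between conc and
    1 / z_ml. Note that b > 0 (and r > 0 against 1 / z_ml) corresponds to a
    concentration that decreases as the mixed layer deepens.
    """
    z_ml = np.asarray(z_ml, dtype=float)
    conc = np.asarray(conc, dtype=float)
    ok = np.isfinite(z_ml) & np.isfinite(conc) & (z_ml > 0)
    x = 1.0 / z_ml[ok]                  # the model is linear in 1 / z_ml
    b, a = np.polyfit(x, conc[ok], 1)   # polyfit returns [slope, intercept]
    r = np.corrcoef(x, conc[ok])[0, 1]
    return a, b, r

# Synthetic demonstration (hypothetical values, arbitrary concentration units):
# a primary pollutant diluted by mixed-layer growth between 300 m and 2500 m.
rng = np.random.default_rng(42)
z_demo = rng.uniform(300.0, 2500.0, 400)
c_demo = 0.3 + 200.0 / z_demo + rng.normal(0.0, 0.02, z_demo.size)

a_fit, b_fit, r_fit = fit_inverse_first_order(z_demo, c_demo)
print(f"a = {a_fit:.3f}, b = {b_fit:.1f}, r = {r_fit:.2f}")
```

Because the model is linear in 1 / ZML, a straight-line fit suffices and no iterative optimizer is needed. The intercept a can be read as the concentration approached for a very deep mixed layer.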

Fig. 6

Inverse regressions of first order between retrieved ZML and pollutant concentrations during daytime (0900–1800 LST) showing the dilution effect caused by the evolution of the convective mixed layer. Hourly data from May and June 2016 are included. Periods affected by precipitation are not considered and error bars indicate standard errors. Each point represents an ensemble of 20 data points
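The ensemble points and standard errors described in the caption could be produced along the following lines. This is only a sketch: it assumes that the hourly pairs are sorted by ZML and grouped into consecutive sets of 20, which is one plausible reading of the caption and not a documented detail of the processing. The function name ensemble_points and the synthetic data are hypothetical.

```python
import numpy as np

def ensemble_points(z_ml, conc, size=20):
    """Group (z_ml, conc) pairs into ensembles of `size` points ordered by z_ml.

    Returns the mean z_ml, the mean concentration, and the standard error of
    the mean concentration for each ensemble. An incomplete final group is
    discarded (an assumption of this sketch).
    """
    z_ml = np.asarray(z_ml, dtype=float)
    conc = np.asarray(conc, dtype=float)
    order = np.argsort(z_ml)
    n_groups = len(z_ml) // size
    z_groups = z_ml[order][: n_groups * size].reshape(n_groups, size)
    c_groups = conc[order][: n_groups * size].reshape(n_groups, size)
    z_mean = z_groups.mean(axis=1)
    c_mean = c_groups.mean(axis=1)
    c_sem = c_groups.std(axis=1, ddof=1) / np.sqrt(size)  # standard error
    return z_mean, c_mean, c_sem

# Synthetic demonstration (hypothetical values): 105 hourly pairs give
# 5 complete ensembles of 20 points each.
rng = np.random.default_rng(1)
z_hourly = rng.uniform(300.0, 2500.0, 105)
c_hourly = 0.3 + 200.0 / z_hourly + rng.normal(0.0, 0.02, z_hourly.size)

z_mean, c_mean, c_sem = ensemble_points(z_hourly, c_hourly)
print(len(z_mean))
```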

All pollutants reach a point in the inverse regressions beyond which the rate of change (increase or decrease) of their concentrations with respect to ZML is substantially reduced. For O3, this point occurs at ZML = 1100 m and a related concentration of ≈ 50 ppb (see Fig. 6d). At this height, the rate of change of the concentrations of the other pollutants has also significantly reduced (i.e. the curves tend to become asymptotic), suggesting that O3 has reached its background concentration. As already mentioned, the PM10 pollution over Seoul also has a strong trans-boundary component, and therefore this equilibrium point suggests a potential background contribution of up to 50 µg m−3. These results highlight the relevance of regional air pollution to Seoul’s air quality.

4 Summary and Conclusions

This study evaluated the dilution effect related to the evolution of the mixed layer by measuring the temporal variations of ZML and using air-quality data over Seoul’s metropolitan area during two polluted months. A commercial ceilometer was used to continuously measure vertical profiles of attenuated aerosol backscatter, from which ZML values were subsequently retrieved. Five different retrieval algorithms based on the gradient method were adjusted to the local conditions and instrument characteristics using high-temporal-resolution radiosonde observations. A sensitivity analysis showed that all adjusted methods were able to reproduce the observed ZΔT values obtained from daytime radiosonde launches, but not from radiosondes launched at night. Hence, it was found that pre-adjusted gradient methods can be used to retrieve reliable height estimations of the convective mixed layer, but not of the NBL. Based on this result, we proposed an improved algorithm that optimizes the vertical and temporal moving averages for daytime and night-time separately, to minimize noise artefacts when computing the first and second derivatives of the attenuated aerosol backscatter that define ZML following the gradient method developed by Emeis et al. (2007). This new algorithm, the optimized Emeis method, reduces the number of parameters that have to be adjusted and includes a background noise correction. Its application improves the estimates of ZML during both day and night. However, a consistent underestimation of the NBL height, particularly during the early morning, was not resolved and further analysis is therefore needed.

The relationship between the evolution of the convective mixed layer and daytime air pollution was investigated using a two-layer mixing model under the assumption that changes in pollutant concentration depend only on the urban boundary-layer growth and the entrainment of air from the free atmosphere. High negative correlations were found for primary-emitted pollutants such as NO2, CO, SO2 and PM10, and a modest positive correlation for O3. Using air-quality data from a remote background site, the ZML value at which the local O3 and PM10 concentrations reach those at the regional scale was found to be 1100 m. The mixed layer reached this height after midday, when the highest O3 concentrations were also reported. The background concentrations of NO2, CO, and SO2 were consistently lower than those in Seoul, whereas concentrations of PM10 and O3 were similar or higher, respectively.

Our results provide insight into the significant influence of urban boundary-layer evolution and regional air pollution on Seoul’s air quality. Accurate and continuous measurements of the boundary-layer height are therefore needed for a complete air-quality assessment. In this context, ceilometers in combination with retrieval algorithms adjusted to the local conditions represent a means of obtaining reliable information on vertical mixing and atmospheric structure above urban areas.