Advertisement

Probabilistic Nowcasting of Low-Visibility Procedure States at Vienna International Airport During Cold Season

  • Philipp Kneringer
  • Sebastian J. Dietz
  • Georg J. Mayr
  • Achim Zeileis
Open Access
Article

Abstract

Airport operations are sensitive to visibility conditions. Low-visibility events may lead to capacity reduction, delays and economic losses. Different levels of low-visibility procedures (lvp) are enacted to ensure aviation safety. A nowcast of the probabilities for each of the lvp categories helps decision makers to optimally schedule their operations. An ordered logistic regression (OLR) model is used to forecast these probabilities directly. It is applied to cold season forecasts at Vienna International Airport for lead times of 30-min out to 2 h. Model inputs are standard meteorological measurements. The skill of the forecasts is accessed by the ranked probability score. OLR outperforms persistence, which is a strong contender at the shortest lead times. The ranked probability score of the OLR is even better than the one of nowcasts from human forecasters. The OLR-based nowcasting system is computationally fast and can be updated instantaneously when new data become available.

Keywords

Aviation meteorology low visibility probabilistic nowcasting statistical forecasts ordered logistic regression 

1 Introduction

Low-visibility conditions at airports impact air traffic regarding aviation safety and economic efficiency of airports and airlines. The low-visibility procedures (lvp) come into force when horizontal and/or vertical visibility fall below airport-specific thresholds. Additional measures, e.g., increasing spacing between approaching and taxiing aircraft, ensure safe operations but also reduce the capacity of the airport. Consequently, planes might be delayed, diverted to alternative airports or prevented from taking off. Hence, reliable low-visibility forecasts are needed for tactical planning of the aircraft movements within the next few hours.

The two major weather phenomena producing low-visibility conditions are fog and low ceiling. Fog development and dissipation generally depend on temperature, the humidity and the available condensation nuclei of an air mass. Radiative cooling, change of the total water amount due to precipitation or mixing of air masses are only a few processes changing these important parameters (Gultepe et al. 2007). However, despite various field experiments (e.g., Gultepe et al. 2009; Haeffelin et al. 2009; Bergot 2016) the overall effect of some physical processes is still unknown. The complexity of the processes driving low visibility makes predictions challenging.

Physical modeling with numerical weather prediction models and statistical modeling are two general low-visibility forecasting approaches (Gultepe et al. 2007). The statistical approaches are data-driven and computationally faster. Model parameters are estimated on an archive data set and then applied to new data to forecast. The choice of the statistical model depends on the desired form of the forecast variable (continuous or categorical) and the type of the forecast output (deterministic or probabilistic). Regression models were among the first statistical forecasting approaches applied to continuous variables with linear regression (Bocchieri and Glahn 1972) and were extended to binary and multinomial categorical variables (e.g. Hilliker and Fritsch 1999; Herman and Schumacher 2016). Some machine-learning methods have also been used for low-visibility forecasts; for example, tree-based methods (Dutta and Chaudhuri 2015; Bartokov et al. 2015; Dietz et al. 2017) and artificial neural networks (Bremnes and Michaelides 2007; Marzban et al. 2007; Fabbian et al. 2007).

Since risk management and decision making depend heavily on the probabilistic information (Murphy and Winkler 1984). Probabilistic forecasts are essential to make safe and economic decisions especially for an application such as air traffic regulation. One option for airport visibility forecasts is to predict the horizontal and vertical visibility separately and determine lvp afterwards. However, since horizontal and vertical visibility are not statistically independent, there is no obvious way to obtain the combined probability. Instead, the variable of interest to aviation end-users, the lvp categories, should be forecast directly.

In this paper, we present a new way to generate probabilistic lvp state forecasts for the next 2 h with a statistical regression method. The method of choice is ordered logistic regression (OLR) to capture the categorical and ordered nature of the end-user forecast variable, which is based on fine-grained visibility and ceiling thresholds. Due to the interest in short lead times, the nowcasting system is exclusively based on point measurements (see Vislocky and Fritsch 1997). The performance of the nowcasting system is compared to climatology, persistence, and human forecasts. The methodology to develop the nowcasting system is shown in Sect. 2. Section 3 examines the area of investigation and the data used. Section 4 presents the results of the nowcasting system which are discussed in the final section.

2 Methods

2.1 Ordered Logistic Regression

The method used to develop a probabilistic lvp nowcasting system is OLR. This method allows prediction of the probabilities for all categories of an ordered response, such as lvp, within one consistent model. Additionally, OLR has the benefit of providing a fast update cycle with low computational costs.

The OLR model describes an ordered categorical variable by assuming an underlying continuous variable mapped with thresholds to the categories. The threshold coefficients and predictor coefficients are determined during model estimation. An arbitrary number of predictors is possible, similar to multiple linear regression. The occurrence probability of the individual categories can be determined by evaluating the chosen error distribution function at the lower and upper thresholds of a category. In mathematical notation the OLR model is described as follows:

Each observation of the response falls into a ordered category \(j=(1,2,\ldots ,J)\). The ordering is from no impact to highest impact on airport operations in this case. These ordinal response \(y_{i}\) is modeled by assuming a continuous auxiliary variable \(y_{i}^{*}\) capturing visibility. For this variable a linear model
$$\begin{aligned} y_{i}^*= \mathbf {x}_{i}^\top \beta + \varepsilon _{i} \end{aligned}$$
(1)
holds, where \(i=(1,2,\ldots ,n)\) is the index over the observations. But \(y_{i}^{*}\) is not observed directly, therefore, the observation i is modeled to category j by the thresholds
$$\begin{aligned} \alpha _{j-1} > y_i^* \ge \alpha _{j} \end{aligned}$$
(2)
Equation 1 shows the deterministic component as \(\mathbf {x}_{i}^\top \beta\) and the random term \(\varepsilon _{i}\) which is assumed to be i.i.d. with zero mean. The vector \(\mathbf {x}_i=(x_{i,1},\ldots ,x_{i,m})\) includes all the m predictors and \(\beta =(\beta _1, \ldots ,\beta _m)\) includes their coefficients. The thresholds \(\alpha =(\alpha _0, \ldots ,\alpha _J)\) are determined together with the predictor coefficients \(\beta\) when estimating the OLR model. The lowest and highest threshold values are fixed at the values \(\alpha _0 = -\infty\) and \(\alpha _J = \infty\). To access the probabilities of the categories an error distribution needs to be selected. Typical distributions are the standard normal and the logistic distribution. The logistic distribution accounts better for observations in the tails of the distribution (Winkelmann and Boes 2006). We select the logistic distribution with its cumulative distribution function:
$$\begin{aligned} H_\mathrm{logit}(\cdot ) := \frac{\exp (\cdot )}{1+\exp (\cdot )}, \end{aligned}$$
(3)
with \(H(-\infty ) = 0\) and \(H(\infty ) = 1\). The probabilities for the categories are derived with the cumulative probability model:
$$\begin{aligned} \sum _{s=1}^{j}\pi _{is} = H_\mathrm{logit}(\alpha _j-\mathbf {x}_i^\top \beta ) = \frac{\exp (\alpha _j-\mathbf {x}_i^\top \beta )}{1+\exp (\alpha _ j-\mathbf {x}_i^\top \beta )}, \end{aligned}$$
(4)
where \(\sum \nolimits _{s=1}^{j}\pi _{is}\) is the cumulative probability that an observation \(y_i\) falls into category j or lower. The probability for the individual category \(\pi _{ij} = P(y_i = j | x_i) = H(\alpha _j-\mathbf {x}_i^\top \beta ) -H(\alpha _{j-1}-\mathbf {x}_i^\top \beta )\) is the difference between the cumulative probabilities at the associated thresholds (Fig. 1).
A notable advantage of this method is the computational speed of the model estimation. It is almost as fast as a linear regression and can be performed instantaneously using standard software. We use the function clm() from the R package ordinal (Christensen 2015), which implements the OLR model with maximum likelihood optimization.
Fig. 1

Probability density function of an OLR model. The shaded area highlights the probability for the category j

2.2 Predictor Selection

The first task in estimating the models is to decide which of the input variables to select and which to omit. We use stepwise selection and verify the forecasts with the ranked probability score (RPS see Sect. 2.3). The initial step of the algorithm estimates the climatology as a first guess. In the next step the variable that most improves the RPS of the model is added. Subsequently this model is used as the new best guess and all remaining variables are tested again. This procedure is repeated until either the model skill does not improve anymore, or no additional variable is left. The variable configuration of the final best-guess model is used to produce the low-visibility forecasts.

2.3 Verification

To test the model and to cover the uncertainty within the model estimation, we perform tenfold cross validation. Therefore, we split the data set into ten parts, use nine parts for training, and do out-of-sample predictions on the remaining test data set. The test data set is exchanged with one part of the previous training data set and again the model estimation and out-of-sample predictions are performed. This is repeated until we have out-of-sample forecasts for all ten parts of the previously split data set, based on ten different models with slightly different training data sets.

To determine the skill of the probabilistic categorical ordered forecasts a proper scoring rule is required. The ranked probability score (RPS) is such a metric (Wilks 2011). It compares the cumulative distribution function of the forecasts and the observations. The RPS of a forecast i is given by:
$$\begin{aligned} \mathrm{RPS}_i = \frac{1}{J-1}\sum _{s=1}^{J}\left[ \sum _{j=1}^{s}(y_{ij} - o_{ij}) \right] ^2 , \end{aligned}$$
(5)
with \(y_{ij}\) the predicted probabilities and \(o_{ij}\) the observations for each category \(j={1,2,\ldots ,J}\). While the predicted probabilities can have continuous values between 0 and 1, the observation is either 0 or 1. The RPS can be interpreted as the normalized shift in categories between the forecast and the observation. In addition, the RPS is normalized by the number of categories \(J-1\) to obtain scaled values within the interval [0, 1]. A perfect forecast has an RPS of zero, the worst forecast has an RPS of 1.

The quality of the prediction is determined by calculating the RPS of all individual observation-prediction pairs within the out-of-sample prediction data set, and averaging them. Bootstrapping is used to estimate the uncertainty due to the limited sample size. Hence, we take 500 random samples of the RPS values from the out-of-sample prediction data set. Each of these samples is taken with replacement and has the size of the full data set. Now we calculate the mean RPS of each random sample. These 500 RPS values are the basis of the results shown in Sect. 4 including the model uncertainty.

2.4 Reference Forecasts

The OLR models are compared to three reference forecasts. The first one is the climatology, which uses the climatological occurrence probability of each category (Fig. 2b) as forecast. The second reference is the persistence forecast, which assumes that the lvp state at the forecast initialization remains. This state is predicted with a probability of 100% and all other states with 0%. It needs to be mentioned that persistence is already a benchmark at short lead times (Vislocky and Fritsch 1997). As a third reference, we compare the OLR to the operational human forecasts at Vienna International Airport (VIE). Human forecasters use all available information from observations and numerical weather predictions and produce operational forecasts at most airports.

3 Data

VIE is selected to develop the statistical low-visibility nowcasting tool. The airport with its two runways is located in the Vienna Basin 20 km southeast of downtown Vienna. The basin is bounded by the Alps to the west and by the Carpathian Mountains to the east. The Airport is surrounded by many humidity sources. Moisture advection from the southeast (Lake Neusiedl and wetlands) and the north (Danube River) favor low-visibility conditions.

3.1 Definition of the Low-Visibility Procedure States

Low-visibility events occur when horizontal and/or vertical visibility drop below a set of thresholds. The runway visual range (rvr) is used as horizontal visibility measure. It is defined as the range over which the pilot of an aircraft on the center line of a runway can see the runway surface markings or the lights delineating the runway or identifying its center line (World Meteorological Organization 2006). The rvr is closely correlated to the horizontal visibility but is truncated with an upper limit of 2000 m for higher visibilities. Vertical visibility is determined by the ceiling height. It is defined as the height of the lowest cloud layer covering more than half of the sky. The ceiling height is measured by ceilometers but finally determined by human observers while the rvr is measured automatically by transmissiometers.
Fig. 2

a Ceiling and runway visual range (rvr) thresholds for the lvp states. b Climatological occurrence of the three lvp and non-lvp (lvp 0) states at VIE for the whole year (all), and for the cold (September to March) and the warm season, respectively, over the period July 2008 until March 2017

At VIE, lvp states from 1 to 3 depend nonlinearly on rvr and ceiling (see Table 1). The categories are ordered but not equidistant. The state lvp 0 indicates good visibility and ceiling and no restrictions for aviation with a maximum capacity of 40 aircraft per hour. At lvp 3 the airport operates at 40% of its maximum capacity (Table 1).
Table 1

Definition of lvp at VIE and associated capacities relative to a maximum of 40 aircraft per hour

 lvp state

rvr

 

Ceiling

Capacity (%)

 0

   

100

 1

< 1200 m

or

< 300 ft

75

 2

< 600 m

or

< 200 ft

60

 3

< 350 m

  

40

3.2 Climatology of the Low-Visibility Procedure States

A climatology of the lvp states was compiled to be used as one of the reference models. Figure 2b shows the small proportion (about 4.5%) of lvp states 1 or higher for the whole year. Basically no low-visibility events occur during warm season. This seasonal pattern is typical for continental climates with topographical properties similar to the Vienna Basin (Egli et al. 2017). The challenge is to capture these relatively rare lvp events with a statistical approach.
Fig. 3

Contour plot of the annual and diurnal cycle of the climatological probability for poor visibility (lvp states ≥ 1) during the period July 2008 until March 2017

The significant annual cycle is also visible in Fig. 3 showing the occurrence probability of any lvp 1–3. The annual maximum is during December. Furthermore, the diurnal maximum is during the morning hours, which coincide with one of the airports rush hours. Therefore these lvp state events have a high impact on airport operations. The diurnal minimum occurs during afternoon. The correlation of the lvp states with wind and thus advection from humidity sources is shown in Fig. 4. The two primary wind directions are from the southeast and the northwest (Fig. 4a). Low visibility states are associated predominantly with winds from the southeast, with a small peak in the northern sector, both of which are known moisture source regions (Fig. 4b). There is also a shift from higher to lower wind speeds (color shading) between the full data set (Fig. 4a) and the lvp data set (Fig. 4b), as lower wind speed generally favors the development of radiation fog. As a result of this climatology, we focus only on the cold season beginning in September until the end of March.
Fig. 4

Wind rose plots of VIE for September to March a all cases and b lvp states ≥ lvp 1. The percentage of the different wind directions (color shadings) is divided into three wind speed clusters (see legend)

3.3 Measurements and Model Configurations

Modeling and forecasting the lvp state with statistical methods require information related to low-visibility formation and dissipating processes. Since we are interested in lead times up to few hours only, we exclusively use standard meteorological measurements available at all larger airports as basis of this nowcasting system. The data come from the observation system of VIE in a 30-min resolution and are available from September 2008 to Mach 2017.

All potential predictors for the use in this statistical nowcasting system are shown in Table 2. The variable setup includes visibility measures, humidity measures, vertical temperature differences, wind information and climatological information. Most of the predictors are continuous variables except the lvp state, the precipitation variable, and the wind direction sector variables. These three variables are categorical variables with two or more levels (see Table 2). Since no soil moisture measurements are available, we used a precipitation sum of at least 1 mm over the previous 12 h as factor variable and proxy. The wind direction is split into two factor variables to capture the main fog wind directions based on the lvp climatology (Fig. 4b). Three visibility measures are used. The vertical visibility is captured by the ceiling and the horizontal visibility by the rvr and the meteorological visibility. The ceiling variable is set to 25,000 ft for the cases where no ceiling was detected due to the absence of clouds. The rvr and the meteorological visibility provide similar information but rvr is truncated at 2000 m, whereas the meteorological visibility is truncated at 20000 m. Two further variables are the relative humidity and the dew point depression which are non-linearly-related humidity measures. Information about the vertical stratification of the lowest air layers is provided by two vertical temperature differences. One is the difference between the 2 m and surface temperature. The other is the difference between 2 m and the tower temperature roughly 100 m above ground. The only climatological variable used is the solar zenith angle, which represents the diurnal cycle of the lvp events. Dependent on the desired lead time, we use the associated lag of the above-shown predictors. For instance, if the lead time of + 30-min is of interest the 30-min lags of the measurements are used as predictors, respectively, for other lead times.

With measurements from nine cold seasons in 30-min resolution, we obtain a data set with roughly 85,000 observations. Thus, we have a data set which is sufficient to estimate and verify the statistical models despite the rare occurrence of the low-visibility events (about 5000 observations). The forecasts are produced over the whole day with four lead times from + 30 to + 120-min. Four models, one for each lead time, provide the forecasts. The model skill is based on out-of-sample verification using tenfold cross-validation and estimates the model variance with bootstrapping (Sect. 2.3).
Table 2

Predictors used within the statistical nowcasting approach

Name

Description

Unit

lvp

Low-visibility procedure state

Ordered [0, 1, 2, 3]

rvr

Runway visual range

m

cei

Ceiling height

ft

vis

Horizontal visibility

m

rh

Relative humidity

%

dpd

Dew point depression

°C

rr12

Precipitation in the last 12 h

Factor [yes, no]

dt.tow

Temperature difference 2 m: tower

°C

dt.surf

Temperature difference 2 m: surface

°C

ff

Wind speed

m s−1

dd.se

Wind direction from sector southeast

Factor [yes, no]

dd.n

Wind direction from sector north

Factor [yes, no]

sza

Solar zenith angle

°

4 Results

4.1 Predictor Selection and Effects

Fig. 5

RPS model skill due to parameter selection with the forward stepwise selection method for the two lead times a + 30-min and b + 120-min. The model size increases from left to right adding the predictor (see Table 2) which improves the model most. The black line shows the median of the RPS. The last model indicates the best model where the stepwise algorithm stops

The results of the stepwise model selection (Sect. 2.2) are shown in Fig. 5. The box plots are computed by the bootstrapped RPS mean values from the (a) + 30-min and (b) the + 120-min forecasts. Each box plot shows one forecast model. The model size increases from left to right by adding the annotated variable. The initial best guess is the intercept (= climatological) model (not shown in the graphic) which has the same skill for all lead times with an RPS \(\sim\) 0.032. The stepwise algorithm indicates the best input variable combination and also shows the importance of the individual variables. The variable lvp is selected first within all models and is the most important variable. This confirms the findings from (Vislocky and Fritsch 1997) that persistence is already a good benchmark for lvp forecasts. Horizontal visibility and humidity measures are selected next. The ceiling has a minor importance and is only selected in the + 90 and + 120-min model. The wind direction (dd.se) is selected in all models but improves the models only marginally. Of note is the increasing importance of solar zenith angle (sza) with lead time. The vertical temperature stratification (dt.tow) and wind speed (ff) were never selected as predictors.

Looking at the physical significance of the predictors selected by the objective selection method, the first choice of persistence via lvp and the horizontal visibility is explained by the high autocorrelation over such short lead times out to 2 h. Humidity variables are next as clouds and fog require saturation. Moist soil with the proxy of previous precipitation (rr12) and near-surface temperature gradient (dt.surf) influence radiative fog. A third important part within the models is the climatology captured by the solar zenith angle. Least important are the inputs modeling advection fog like the wind direction sectors. The variable selection implies that the information about upcoming low visibility is lead-time dependent and generally shifts from persistence predictors to others for longer lead times.

The model coefficients \(\beta\) from the best models using the step wise algorithm are shown in Table 3. The sign of the coefficients indicates the sign of the effects, i.e. positive coefficients indicate higher probabilities of high lvp states when the predictor values increase. The first three parameters are the threshold coefficients \(\alpha\) (see Eq. 2) between the four different lvp states. They can be interpreted as intercepts for the three thresholds. A further special feature of these models is the predictor lvp, which is an ordered variable with four levels. Within the model, lvp is replaced by three categorical variables which indicate if the latest lvp is higher lvp 0, higher lvp 1 and higher lvp 2. The predictors for the different lead times vary just slightly and remain the same for the lead times + 90-min and longer. As already shown in Fig. 5, the model for the shortest lead time performs best with fewer predictors compared to longer lead times.
Table 3

Coefficients \(\beta\) of the OLR models for the four different lead times

 

+ 30-min

+ 60-min

+ 90-min

+ 120-min

\(\alpha _1\)

10.177

11.006

11.800

12.436

\(\alpha _2\)

12.379

12.479

12.972

13.440

\(\alpha _3\)

16.778

15.972

16.022

16.200

\(lvp_{1|0}\)

2.773

1.870

1.507

1.252

\(lvp_{2|1}\)

1.930

1.134

0.846

0.698

\(lvp_{3|2}\)

2.684

1.749

1.290

0.907

rvr

− 0.668

− 0.523

− 0.426

− 0.372

cei

0.010

0.012

vis

− 0.316

− 0.338

− 0.313

− 0.281

rh

0.134

0.131

0.131

0.132

dpd

− 0.721

− 0.731

− 0.714

− 0.649

rr12

− 0.451

− 0.511

− 0.543

− 0.573

dt.surf

0.063

0.071

0.059

dd.se

0.195

0.065

0.060

0.046

dd.n

0.099

sza

0.003

0.005

0.007

0.009

For the predictor names see Table 2, \(\alpha _{1-3}\) indicate the thresholds between the lvp classes. Contrary from Table 2 units from rvr and vis in km, unit from cei in kft

Fig. 6

Effect of the predictors for the a + 30-min and b + 120-min forecast. The four colors indicate the probability for each lvp state dependent on the particular predictors. The observed distribution of the particular predictors is shown as bar with gray shading below each effect plot. Typical conditions before an lvp event (red markers see Table 4) are chosen as representative values for the marginal effects. The effect plot with the fade out indicate that these predictors have not been selected in the particular model and therefore have constant effects

Table 4

Fixed values for the marginal effects shown in Fig. 6

Parameter

Value

Parameter

Value

lvp

0

rr12

Yes

rvr

1500

dt.surf

1

cei

25,000

dd.se

Yes

vis

1500

dd.n

No

rh

95

sza

100

dpd

1

  

Figure 6 shows the variation in lvp forecasts due to a individual predictor when all other predictors are kept constant at representative values. Representative values (red marks in Fig. 6) are values typical before low-visibility events (see Table 4). Figure 6a shows that high lvp categories become more likely when rvr, visibility (vis) and dew point depression (dpd) decrease or when relative humidity (rh) or solar zenith angle (sza) increase. Decreasing values of sza indicate stronger radiative cooling and, therefore, favor low visibility. The predictor lvp has the strongest effect, which again confirms the importance of the persistence for short lead times. The effects for lvp, visibility and humidity parameters are nonlinear while all the other predictors produce linear effects. The factor variable dd.se with winds from the southeast sector has only a minor effect. However, the probability for high lvp states increases with winds from southeast compared to other directions. Surprisingly, the probability for high lvp states increases with higher ceiling height (Fig. 6b). An interpretation of this negative effect is that a low ceiling increases the long-wave downward radiation and prevents strong radiative cooling of the ground. Thus, it is less likely for a fog layer to develop. The negative effect of the precipitation within the preceding 12 h (rr12) is also unexpected, but again a possible explanation is that the remaining precipitation clouds reduce radiative cooling of the ground.

Comparing the effect plots from the + 30-min (Fig. 6a) and + 120-min (Fig. 6b), forecasts illustrate in general lower probabilities for the higher lvp states at the + 120-min forecasts. This indicates that the OLR model forecasts tend to converge to climatology with increasing lead times. Furthermore, the large effect of the predictor lvp (see Fig. 6a) indicates that the forecast information is concentrated on this predictor at the lead time + 30-min. In contrast at the lead time + 120-min, the effect of lvp becomes weaker while the other effects remain similar. This behavior is consistent with the model coefficients shown in Table 3, where the coefficients for lvp decrease with increasing lead times while the coefficients of several other predictors, e.g. sza, dpd, or dt.surf, increase or at least remain approximately unchanged.

4.2 Model Stability

To estimate the sensitivity of the OLR models to the training data set length, Fig. 7 shows RPS of + 60-min forecasts based on one up to six cold seasons training data. The forecast are evaluated out-of-sample for three cold seasons. The box plots for the OLR models indicate a slight model improvement with an increasing training data set from one cold season to two cold seasons. The OLR models based on two or more cold seasons have similar skill but the model based on four cold seasons has the best RPS. The shortest training data set length of one cold season includes about 5000 observations which are already an appropriate data set to estimate an OLR model with the applied model configuration. Taken together, forecast accuracy improves with increasing training data set length up to two cold seasons and remains similar for longer training data sets.
Fig. 7

Model stability of the OLR due to variation of training data set length from one to six cold seasons (CS). RPS values are out-of-sample on a three-cold season test data set for the + 60-min forecast. Persistence is shown as reference model

4.3 Model Performance

Fig. 8

Model skill based on RPS as function of the lead time for a the full data set and b for the subset where the human observations are available. The dots represent the 50% quantile, the shadings in the lighter color show the 95% confidence interval. The model acronyms stand for climatology (CLIM), persistence (PERS), ordered logistic regression (OLR), and the human forecaster (HUMAN). The human forecasts are available from September 2012 to March 2017 for the lead times + 60-min and + 120-min between 5 and 20 UTC only

The results of the final lvp forecasts are shown in Fig. 8. The models are based on data of nine cold seasons using the best model configuration resulting from the stepwise predictor selection. Figure 8a illustrates the verification for OLR compared to the climatology and the persistence forecasts. The OLR outperforms climatology and persistence for each lead time. The skill of all forecasts decreases with increasing lead time but the benefit of OLR over persistence increases. OLR and persistence converge to climatology but within the lead times considered here, the performance of climatology as a forecast of lvp is clearly worse. The spread of the RPS indicates the uncertainty due to the model estimation determined by bootstrapping (see Sect. 2.3). The uncertainty is smallest for the OLR model compared to persistence and climatology, and this uncertainty does not vary substantially with lead time.
Fig. 9

Model performance evaluated with ROC curves for the two (instead of four) category nowcasts for operations unaffected (lvp 0) or affected (lvp 1–lvp 3) by low-visibility conditions. Models are climatology (CLIM), persistence (PERS) and the ordered logistic regression (OLR) for the lead times + 30-min (solid line) and + 120-min (dashed line)

When the nowcasting performance is evaluated only for two (un/affected by low visibility conditions) instead of four categories, the ROC curves in Fig. 9 also show the OLR to outperform persistence already at the shortest lead time of + 30-min and even more so for + 120-min. A climatological forecast lies on the diagonal of the ROC diagram, which (per definition) represents the forecasting skill of randomly drawing from the climatological distribution.

Human forecasts are available as third reference (Fig. 8b), albeit only over a shorter period (September 2012 to March 2017), part of the day (5–20 UTC) and 60 min lead-time intervals instead of 30 min. The human forecast is comparable to persistence; therefore, OLR also outperforms the human forecasts. Due to the rare occurrence of lvp events, and the larger number of lvp 0 during the day, when considering only the daytime OLR, persistence and climatology perform better compared to the results on the full data set in Fig. 8a.

5 Discussion and Conclusion

Predicting lvp (low-visibility procedure) states is very challenging because small changes in the physical drivers of low-visibility formation and dissipation processes have a large impact on lvp. The OLR (ordered logistic regression) model has been used to develop a probabilistic nowcasting system producing direct airport-specific, low-visibility forecasts tested at Vienna International Airport.

The OLR model leads to promising forecasts and outperforms persistence and climatology, although persistence offers predictions to be reckoned with especially at the very short lead times. This nowcasting approach is also competitive with human forecasts. A benefit of this method is to produce probabilistic forecasts of a variable relevant for to the end user within one consistent model. The probabilistic forecasts of such a model allows different end users, e.g., air traffic controllers or air traffic managers, to extract the information regarding their requirements.

The variable influencing the forecasts most is lvp. Additional important variables are the horizontal visibility and the humidity measures. Contrary to our expectations, wind speed and the temperature stratification of the first 100 m above the ground did not contribute information for future lvp states at Vienna International Airport.

The presented data-driven nowcasting system is transferable to other airports but it needs to consider site-specific predictors. For instance, considering the lvp climatology guides us to focus on the cold season only and justifies omission of the warm season, during which low-visibility events almost never occur. The nowcasting system uses the available measurements on the site and its vicinity and objectively selects the important variables to form the best model. Dependent on the regime of the site (e.g., coastal or continental) different physical effects drive the development and dissipation of low-visibility conditions. These effects need to be adequately represented in the model variables to achieve accurate forecasts. The usage of additional model inputs and increase of the forecast temporal resolution would have an additional gain for the operational use of the nowcasting system (cf. Dietz et al. 2017).

In summary, these findings suggest that the OLR model is a suitable tool for computationally fast probabilistic visibility forecasts which can provide useful guidance to aviation decision makers.

Notes

Acknowledgements

Open access funding provided by University of Innsbruck and Medical University of Innsbruck. This study has been supported by the Austrian Research Promotion Agency FFG 843457 and benefits greatly from Markus Kerschbaum, Andreas Lanzinger and the operational personnel of Vienna International Airport. The authors thank the Austrian Aviation Agency ACG for providing access to the data.

References

  1. Bartokov, I., Bott, A., Bartok, J., & Gera, M. (2015). Fog prediction for road traffic safety in a coastal desert region: Improvement of nowcasting skills by the machine-learning approach. Boundary-Layer Meteorology, 157(3), 501–516.  https://doi.org/10.1007/s10546-015-0069-x.CrossRefGoogle Scholar
  2. Bergot, T. (2016). Large-Eddy simulation study of the dissipation of radiation fog. Quarterly Journal of the Royal Meteorological Society, 142(695), 1029–1040.  https://doi.org/10.1002/qj.2706.CrossRefGoogle Scholar
  3. Bocchieri, J. R., & Glahn, H. R. (1972). Use of model output statistics for predicting ceiling height. Monthly Weather Review, 100(12), 869–879.  https://doi.org/10.1175/1520-0493(1972)100%3c0869:UOMOSF%3e2.3.CO;2.CrossRefGoogle Scholar
  4. Bremnes, J. B., & Michaelides, S. C. (2007). Probabilistic visibility forecasting using neural networks. Pure and Applied Geophysics, 164(6–7), 1365–1381.  https://doi.org/10.1007/s00024-007-0223-6.CrossRefGoogle Scholar
  5. Christensen. R.H.B. (2015). Ordinal—Regression Models for Ordinal Data. R package version 2015.6-28. http://www.cran.r-project.org/package=ordinal/
  6. Dietz, S. J., Kneringer, P., Mayr, G. J., & Zeileis, A. (2017). Forecasting low-visibilit procedure states with tree-based statistical methods. Working papers, Faculty of Economics and Statistics, University of Innsbruck. https://eeecon.uibk.ac.at/wopec2/repec/inn/wpaper/2017-22.pdf. Accessed 15 Sept 2017.
  7. Dutta, D., & Chaudhuri, S. (2015). Nowcasting visibility during wintertime fog over the airport of a metropolis of India: Decision tree algorithm and artificial neural network approach. Natural Hazards, 75(2), 1349–1368.  https://doi.org/10.1007/s11069-014-1388-9.CrossRefGoogle Scholar
  8. Egli, S., Thies, B., Drnner, J., Cermak, J., & Bendix, J. (2017). A 10 year fog and low stratus climatology for europe based on meteosat second generation data. Quarterly Journal of the Royal Meteorological Society, 143(702), 530–541.  https://doi.org/10.1002/qj.2941.CrossRefGoogle Scholar
  9. Fabbian, D., de Dear, R., & Lellyett, S. (2007). Application of artificial neural network forecasts to predict fog at Canberra International Airport. Weather and Forecasting, 22(2), 372–381.  https://doi.org/10.1175/WAF980.1.CrossRefGoogle Scholar
  10. Gultepe, I., Hansen, B., Cober, S. G., Pearson, G., Milbrandt, J. A., Platnick, S., et al. (2009). The fog remote sensing and modeling field project. Bulletin of the American Meteorological Society, 90(3), 341–359.  https://doi.org/10.1175/2008BAMS2354.1.CrossRefGoogle Scholar
  11. Gultepe, I., Tardif, R., Michaelides, S. C., Cermak, J., Bott, A., Bendix, J., et al. (2007). Fog research: A review of past achievements and future perspectives. Pure and Applied Geophysics, 164(6–7), 1121–1159.  https://doi.org/10.1007/s00024-007-0211-x.CrossRefGoogle Scholar
  12. Haeffelin, M., Bergot, T., Elias, T., Tardif, R., Carrer, D., Chazette, P., et al. (2009). PARISFOG: Shedding new light on fog physical processes. Bulletin of the American Meteorological Society, 91(6), 767–783.  https://doi.org/10.1175/2009BAMS2671.1.CrossRefGoogle Scholar
  13. Herman, G. R., & Schumacher, R. S. (2016). Using reforecasts to improve forecasting of fog and visibility for aviation. Weather and Forecasting, 31(2), 467–482.  https://doi.org/10.1175/WAF-D-15-0108.1.CrossRefGoogle Scholar
  14. Hilliker, J. L., & Fritsch, J. M. (1999). An observations-based statistical system for warm-season hourly probabilistic forecasts of low ceiling at the San Francisco International Airport. Journal of Applied Meteorology, 38(12), 1692–1705.  https://doi.org/10.1175/1520-0450(1999)038%3c1692:AOBSSF%3e2.0.CO;2.CrossRefGoogle Scholar
  15. Marzban, C., Leyton, S., & Colman, B. (2007). Ceiling and visibility forecasts via neural networks. Weather and Forecasting, 22(3), 466–479.  https://doi.org/10.1175/WAF994.1.CrossRefGoogle Scholar
  16. Murphy, A. H., & Winkler, R. L. (1984). Probability forecasting in meteorology. Journal of the American Statistical Association, 79(387), 489–500.  https://doi.org/10.1080/01621459.1984.10478075.Google Scholar
  17. Vislocky, R. L., & Fritsch, J. M. (1997). An Automated, observations-based system for short-term prediction of ceiling and visibility. Weather and Forecasting, 12(1), 31–43.  https://doi.org/10.1175/1520-0434(1997)012%3c0031:AAOBSF%3e42.0.CO;2.CrossRefGoogle Scholar
  18. Wilks, D. S. (2011). Statistical methods in the atmospheric sciences. Oxford: Academic Press.Google Scholar
  19. Winkelmann, R., & Boes, S. (2006). Analysis of microdata. Berlin: Springer.Google Scholar
  20. World Meteorological Organization. (2006). Guide to meteorological instruments and methods of observation. Tech. rep, Genf, Schweiz: Secretariat of the WMOGoogle Scholar

Copyright information

© The Author(s) 2018

Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Authors and Affiliations

  1. 1.Department of Atmospheric and Cryospheric SciencesUniversity of InnsbruckInnsbruckAustria
  2. 2.Department of StatisticsUniversity of InnsbruckInnsbruckAustria

Personalised recommendations