1 The rotating RSTD model

1.1 Loss function

Thorarinsdottir and Løland recommend that the probabilistic properties of the RRSTD model be fully utilized along with the statistical properties of the loss function, for example its asymmetry. We concur with this recommendation. Hering and Genton (2010) explicitly defined a loss function based on the power curve of turbines; see also Zhu and Genton (2012) for a review of possible loss functions. As pointed out by Pinson, the optimal point forecast under the economic dispatch model is unlikely to correspond to the median of the predictive distribution. We used the median because it is optimal for the mean absolute error (MAE) that was used in the economic dispatch. It is possible to use the cost associated with the power system economic dispatch model as a loss function and optimize the RRSTD model parameters under that loss function. This appears to be computationally very challenging although it seems worthy of more investigation.
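The link between the loss function and the optimal point forecast can be illustrated numerically. The sketch below is not the RRSTD model: it assumes a hypothetical gamma predictive distribution for wind speed and uses the piecewise-linear (pinball) loss as a stand-in for an asymmetric loss. Under these assumptions, the symmetric case recovers the median, whereas an asymmetric penalty shifts the optimal point forecast to another quantile.

```python
import numpy as np

def pinball_loss(y, forecast, tau):
    """Asymmetric piecewise-linear loss; tau = 0.5 gives half the absolute error."""
    diff = y - forecast
    return np.mean(np.maximum(tau * diff, (tau - 1.0) * diff))

def best_point_forecast(samples, tau, grid):
    """Grid-search the point forecast that minimizes the empirical pinball loss."""
    losses = [pinball_loss(samples, g, tau) for g in grid]
    return grid[int(np.argmin(losses))]

rng = np.random.default_rng(0)
# Hypothetical predictive distribution for wind speed (m/s); an assumption,
# not the predictive distribution of the RRSTD model.
samples = rng.gamma(shape=4.0, scale=2.0, size=20000)
grid = np.linspace(0.0, 30.0, 3001)

median_opt = best_point_forecast(samples, 0.5, grid)  # symmetric loss (MAE)
asym_opt = best_point_forecast(samples, 0.8, grid)    # under-forecasts cost 4x more

print(median_opt, np.median(samples))
print(asym_opt, np.quantile(samples, 0.8))
```

The choice tau = 0.8 is arbitrary. A loss derived from the dispatch cost would generally not be piecewise linear, so the optimal point forecast need not be a fixed quantile.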

1.2 Power curve transformation

We agree with Thorarinsdottir and Løland that, due to the shape of the power curve, the transformed errors are strongly heteroscedastic and that this could be taken into account in our framework. Even more challenging is the fact that we use an “idealized” power curve provided by the turbine’s manufacturer. Moreover, wind power is in fact affected by more than just wind speed. In addition to the type of turbine, environmental factors such as the wind direction, the air density, and the humidity, to name but a few, all affect the conversion of wind speed to power. Lee et al. (2014a) proposed a multivariate density estimation of power output in that setting and demonstrated the improvement of that approach. Lee et al. (2014b) further extended this methodology to quantifying the upgrade of wind turbines.
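The heteroscedasticity induced by the power curve can be seen with a small simulation. The sketch below assumes a hypothetical idealized curve (cubic ramp between cut-in and rated speed, with invented cut-in, rated, and cut-out values) and a homoscedastic Gaussian wind speed error of 1 m/s. Neither the curve nor the error model is taken from our paper.

```python
import numpy as np

# Hypothetical turbine parameters (m/s); not taken from the paper.
CUT_IN, RATED, CUT_OUT = 3.5, 13.5, 25.0

def power_curve(speed):
    """Idealized normalized power: 0 below cut-in and at or above cut-out,
    cubic ramp between cut-in and rated speed, 1 from rated to cut-out."""
    speed = np.asarray(speed, dtype=float)
    ramp = (speed**3 - CUT_IN**3) / (RATED**3 - CUT_IN**3)
    power = np.clip(ramp, 0.0, 1.0)
    return np.where((speed < CUT_IN) | (speed >= CUT_OUT), 0.0, power)

def power_error_spread(forecast_speed, speed_sd=1.0, n=50000, seed=0):
    """Standard deviation of the power error when the speed error is N(0, speed_sd^2)."""
    rng = np.random.default_rng(seed)
    actual = np.maximum(forecast_speed + rng.normal(0.0, speed_sd, n), 0.0)
    return np.std(power_curve(actual) - power_curve(forecast_speed))

# The same speed error yields very different power errors along the curve.
spreads = {v: power_error_spread(v) for v in (2.0, 10.0, 18.0)}
for v, s in spreads.items():
    print(f"forecast speed {v:4.1f} m/s -> power error sd {s:.4f}")
```

With these assumed values, the spread is largest on the steep part of the ramp and close to zero below cut-in and on the rated plateau.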

1.3 Forecasting wind speed versus forecasting wind power

As mentioned by Thorarinsdottir and Løland, the research community has not reached a general consensus in the debate between speed-oriented and power-oriented prediction. In industrial practice, direct power forecasting is used in very short-term horizon applications (e.g., next 5-min dispatches). The power data are collected directly from operating wind farms. For longer-term horizon applications (30 min–24 h ahead), wind speed forecasting plus speed/power conversion is commonly used because unpredicted operating conditions (e.g., curtailment) may affect the quality of direct power prediction. In our paper, we predicted wind power using wind speed forecasting for the following three reasons. First, we think that predicting wind power using wind speed is more flexible and better suited to applications with the time horizon discussed in this paper. Since wind farms have different wind turbine models and designs, wind speed forecasts can be converted to power production forecasts for each farm using the power curves provided by the turbine manufacturers. Second, forecasting based directly on wind power is possible only when historical wind power data have been collected from wind farms. However, due to various operational conditions (e.g., wind curtailment, outage, pitch-angle control, etc.), historical wind power outputs do not equal the wind generation potential. Third, the data we considered in this paper are from a benchmark dataset that was used with the advanced space-time wind speed forecasting models proposed by Gneiting et al. (2006) and Hering and Genton (2010). We thought that it would be interesting to observe how our model performed with the same dataset. However, we have also conducted another study (Xie et al. 2014) with wind data from West Texas and with a wind speed forecasting model that incorporates geostrophic wind speed to further improve the forecasts (Zhu et al. 2014).
We agree that a more refined speed/power conversion (e.g., the conditional dynamic power curves of Jeon and Taylor (2012)) can further improve the quality of the forecast. Further comprehensive studies examining both speed-oriented and power-oriented frameworks are needed.

1.4 Comparisons with RSTD and TDD models

As mentioned in Section 3.4, we compared the RRSTD model with the RSTD and TDD models. Because the results were very similar, we did not report them in the paper. Thorarinsdottir and Løland commented that such a comparison would be interesting, and Hering in fact made that comparison based on a more complete set of data spanning 3 years at the same three locations as used in our paper. Hering found that the performance of the RRSTD model was slightly worse than that of the RSTD and TDD models on her dataset. We believe that this is due to a change in the wind patterns between the training set (2004–2005) and the testing set (2006). This can be seen in Fig. 1 below. For instance, we point to the MAE value for Vansycle in December in Table 1 of Hering’s discussion: the RSTD model reaches 1.38, whereas the RRSTD model reaches 1.41. From Figure 4 in Hering’s discussion, we see that the regimes switch based on the wind direction at Goodnoe Hills. However, when we compare the wind rose in December at Goodnoe Hills in the training set of 2004–2005 (Figure 2 of Hering’s discussion) with the wind rose in December at Goodnoe Hills in the testing set of 2006 (Fig. 1 below), we notice a striking difference. We believe that this and other similar wind pattern changes between the training and testing sets explain why the RSTD model occasionally performs slightly better than the RRSTD model. In general, we still believe that the RRSTD model should improve upon the RSTD model because the latter is a particular case of the former. However, differences between the forecast models (and accordingly in the power dispatch) at different locations and in different periods of the year are expected, as Muñoz also noticed.

Fig. 1 Monthly wind roses for Kennewick in 2006 (courtesy Hering)

1.5 RRSTD model fitting

The implementation of the RRSTD model by Hering exactly follows our implementation. Hering commented that with many unequally spaced regimes, an exhaustive search for the best definition of regimes is computationally very costly and she proposed other approaches for the identification of wind regimes. Although we agree with the aforementioned computational difficulties, we would like to add that the search procedure can be parallelized easily on a multi-processor computer.
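To illustrate why the search parallelizes easily, note that each candidate regime definition can be scored independently of the others. The sketch below makes several simplifying assumptions: the data are synthetic, the candidates are pairs of direction breakpoints on a hypothetical 30-degree grid, and a single least-squares slope per regime, scored in-sample, stands in for the full RRSTD fit and its out-of-sample evaluation. The function names are illustrative only.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations
import numpy as np

def regime_mae(breaks, direction, speed_now, speed_next):
    """Score one candidate set of direction breakpoints (degrees).

    Stand-in for the real model fit: one least-squares slope per regime.
    """
    # Directions at or past the last breakpoint wrap around into regime 0.
    labels = np.digitize(direction, breaks) % len(breaks)
    abs_err = np.empty_like(speed_next)
    for k in range(len(breaks)):
        mask = labels == k
        if mask.sum() < 10:  # reject regimes with too few observations
            return breaks, np.inf
        x, y = speed_now[mask], speed_next[mask]
        slope = (x @ y) / (x @ x)
        abs_err[mask] = np.abs(y - slope * x)
    return breaks, abs_err.mean()

def search_regimes(direction, speed_now, speed_next, n_breaks=2, step=30, workers=4):
    """Exhaustive search over breakpoint combinations, scored concurrently."""
    candidates = list(combinations(range(0, 360, step), n_breaks))
    # Threads keep the sketch portable; for CPU-bound fits a process pool
    # (with a picklable top-level scoring function) is the usual choice.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(
            lambda b: regime_mae(b, direction, speed_now, speed_next), candidates
        )
        return min(results, key=lambda r: r[1])

# Synthetic data with two unequally spaced regimes split at 90 and 240 degrees.
rng = np.random.default_rng(0)
n = 5000
direction = rng.uniform(0.0, 360.0, n)
speed_now = rng.gamma(4.0, 2.0, n)
true_slope = np.where((direction >= 90) & (direction < 240), 0.9, 0.5)
speed_next = true_slope * speed_now + rng.normal(0.0, 0.3, n)

best_breaks, best_mae = search_regimes(direction, speed_now, speed_next)
print(best_breaks, round(best_mae, 3))
```

Because the candidates share no state, the wall-clock cost of the search scales roughly inversely with the number of workers, up to the number of available processors.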

1.6 Combining forecasts

We agree with Hering’s opinion that combining different models to achieve an overall better forecast may be a valuable approach. Here too, we think that by putting different models under a parallel computation framework, with appropriate expert systems or pattern recognition, a hybrid approach of these statistical models could be very promising.
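One simple way to combine point forecasts is a weighted average with weights inversely proportional to each model's MAE on a training window. The sketch below uses synthetic data and two hypothetical models with different error levels. It is only an illustration of the idea, not a scheme proposed by Hering or used in our paper.

```python
import numpy as np

def inverse_mae_weights(forecasts, observed):
    """Weights proportional to 1/MAE of each model (rows) on a training window."""
    mae = np.mean(np.abs(forecasts - observed), axis=1)
    inv = 1.0 / mae
    return inv / inv.sum()

rng = np.random.default_rng(1)
truth = rng.gamma(4.0, 2.0, size=2000)  # synthetic wind speeds (m/s)
# Rows are two hypothetical models with independent errors of different size.
forecasts = np.vstack([
    truth + rng.normal(0.0, 1.0, truth.size),
    truth + rng.normal(0.0, 2.0, truth.size),
])

train, test = slice(0, 1000), slice(1000, 2000)
weights = inverse_mae_weights(forecasts[:, train], truth[train])
combined = weights @ forecasts[:, test]

mae_models = np.mean(np.abs(forecasts[:, test] - truth[test]), axis=1)
mae_combined = np.mean(np.abs(combined - truth[test]))
print(weights, mae_models, mae_combined)
```

Since the weights are non-negative and sum to one, the MAE of the combined forecast cannot exceed that of the worst individual model on the same evaluation set. Whether it beats the best model depends on how correlated the model errors are.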

1.7 Forecasting wind at new locations

Both Hering and Muñoz commented that the RRSTD model (and also the RSTD and TDD models) does not make spatial predictions; hence, it cannot be used to evaluate potential wind power production at new locations before wind farms are built there. We completely agree with this remark. Our intention was to mention that if a wind speed tower is installed at a location of interest for a future wind farm, then improved wind forecasts at that location could be used to evaluate the possibility of building a wind farm around that location. Because wind power projects are designed to last many years, a further question at this timescale is whether climate change may influence wind patterns. Goddard et al. (2014) examined the potential effects of climate change on the average diurnal power production cycles at various locations in North America by combining output from high-resolution regional climate models covering a current and a future period. They found significant changes at several locations of interest to wind farming.

2 Integrating wind power into a power system

2.1 Probabilistic forecast

We agree with the comments of Hering, Thorarinsdottir and Løland, and Pinson about utilizing the full predictive distribution of the forecast rather than only the point forecast. As mentioned in the discussion section of our paper, we are currently working on a stochastic dispatch framework that naturally takes into account the whole predictive distribution. We expect that further economic gains can then be obtained. As Pinson pointed out, there is a subtle distinction between the forecaster’s “loss function” and the forecast user’s “loss function”. Evaluating the relationship between forecast improvements and the operational benefits to the power system, and incorporating the results into decision-making, is a promising direction. If computational power allows, a robust programming-based or a stochastic programming-based approach will likely exhibit better performance.