Abstract
Predictive ability of time series models is easily compromised in the presence of structural breaks, common among financial and economic variables amidst market shocks and policy regime shifts. We address this problem by estimating a semiparametric mixed-frequency model that incorporates high frequency data in either the conditional mean or the conditional variance equation. The inclusion of high frequency data through non-parametric smoothing functions complements the low frequency data to capture possible non-linear relationships triggered by the structural change. Simulation studies indicate that in the presence of structural change, the varying frequency in the mean model provides improved in-sample fit and superior out-of-sample predictive ability relative to low frequency time series models. These results hold across a broad range of simulation settings, such as varying time series lengths, nature of structural break points, and temporal dependencies. We illustrate the relative advantage of the method in predicting stock returns and foreign exchange rates in the case of the Philippines.
1 Introduction
Structural breaks occur as sudden changes in the behavior of the time series, reflected as changes in mean, variance, or the general autocorrelation structure. As time series models are generalizations of the historical behavior of the data generating process, structural breaks can create challenges in estimation and in the subsequent use of the estimated model for forecasting. Localized changes in the pattern of the data can render the general model inappropriate given the significant shift in the model parameter(s) over time (Hansen, 2001). Structural change can also drive fluctuations in the time series to the extent of more volatile or erratic behavior. This is commonly observed in economic and financial time series (e.g., asset prices and returns, gross domestic product growth) in the event of market shocks and policy regime shifts.
Detection and implications of structural change have been an important theme of the literature on time series analysis. Amsler and Lee (1995) noted that a single time series break will not affect the asymptotic behavior of a unit root test. In the absence of prior information about the time series and the occurrence of structural break, Doerr et al. (2017) proposed a genetic algorithm as a flexible modeling strategy applicable in a wide range of scenarios. As global stationarity is more difficult to achieve, Safikhani and Shojaie (2022) proposed a method of simultaneous detection of changepoints and estimation of parameters in high dimensional data by imposing a penalty function based on total variance.
High frequency data is often characterized as a full record of transactions and their associated characteristics. High frequency data resembles a nearly continuous-time process, as opposed to the discrete time points of low frequency data (Engle, 2000). In particular, annual data is considered as very low frequency, with ultra-high frequency data being “tick-by-tick” data recorded in intervals of less than one second. Any periodicity in-between would lie along a spectrum of high and low frequencies (Dacorogna et al., 2001; Huth & Abergel, 2014; Breymann et al., 2003). In this paper, high frequency data is defined as time series data collected at a finer time scale in contrast to low frequency data. Specifically, the high-low frequency data dichotomy considered is relative, e.g., with weekly data as low frequency and daily data as high frequency data (stock prices), or monthly data as low frequency and weekly data as high frequency data (economic indicators).
Models for data with low frequency variables are especially vulnerable to structural change as available information on the structural change may be limited, e.g., in-between two observed time points. Consider for example autoregressive moving average (ARMA) models and their extensions, which emphasize the impact of recent observations rather than distant ones, as evidenced by the rapid decay of the autocorrelation functions (Mahir & Al-Khazaleh, 2008). Likewise, in modeling volatility, the generalized autoregressive conditional heteroskedastic (GARCH) family of models places more importance on recent observations rather than distant ones, as the autocorrelation function of squared residuals shows an exponential rate of decay (Hafner, 2009).
In the presence of structural change, these short-memory low frequency models may perform poorly as behavior in the recent past does not adequately reflect both the general features of the data as well as impact of structural change. This is especially problematic when the parameter shift happens towards the end of the time series or when the time series is relatively short.
To address such problems, a mixed-frequency semiparametric framework is postulated. High frequency variables are incorporated into the mean or variance components of the models through smoothing functions, as this better captures the complex nonlinear relationships arising from the structural change. The proposed models are meant to assist financial traders or economic managers in forecasting time series amidst structural change, allowing decisions to respond proactively to market disruptions rather than through post-hoc policy adjustments. This is especially beneficial to decision makers who would otherwise rely solely on low frequency models in the event of structural change. When information on the structural change is carried by indicators collected at high frequencies, decision-makers can adapt and possibly mitigate the vulnerability of their systems to such disruptions.
Even though changes in individual stock prices are considered to evolve as a random walk, it is still important for investors to use tools that help reduce subjectivity and assist in rational decision-making. This matters more with structural change, where traditional data patterns are upended. Using high frequency data may provide better estimates and forecasts of stock returns and their volatilities, enabling investors to devise better trading strategies such as target price levels for stop-loss orders (Benito et al., 2024).
Initially, structural change is ignored as its effect can be negligible, but as the time series evolves to manifest the lingering impact, strategies to incorporate change episodes into the model or to mitigate their effect on statistical inference have become valuable. Yoon (1998) noted that even a naïve forecast, i.e., one ignoring structural change, will still produce smaller errors even when structural change actually occurred. However, Coelho and Nunes (2011) and Agostino et al. (2013) reported that better forecasts are generated when structural change is incorporated into the model. Campano and Barrios (2011) and Altansukh and Osborn (2022) subsampled the time series to isolate structural change into one or a few segments, with estimates then averaged over all subsamples. With multiple structural breaks, Pesaran et al. (2006) used a Bayesian approach to generate better forecasts.
2 Modeling Time Series with Structural Change
Time-varying parametric models are used to account for structural change in time series data. State-space representations, Kalman filtering and smoothing techniques allow the coefficient vector to vary over time (Durbin & Koopman, 2012). They are further extended to multivariate systems, such as the time-varying parameter vector autoregressive models (TVP-VAR) and time-varying parameter dynamic factor models (TVP-DFM). These models, however, have the potential to be over-parametrized imposing challenges in estimation.
In modeling time series with structural breaks, Davis et al. (2006) used piecewise autoregressive (AR) processes with an unknown number of segments and unknown AR orders, estimated with a genetic algorithm-based method, and reported good empirical results. Chen et al. (2022) noted that a combination of moving sum and cumulative sum can capture important features of multiple breaks in high dimensional time series. Meanwhile, modeling and testing in panel data in the presence of structural breaks was generalized in Karavias et al. (2023) and Lumsdaine et al. (2023).
2.1 Semiparametric Structural Change Models
In the presence of structural change, Campano and Barrios (2011) used a hybrid of block bootstrap and forward search algorithm to generate estimates of the model. Simulation studies indicate robustness of the estimates in the presence of temporary structural change, provided the time series is relatively long. Villejo et al. (2017) estimated a dynamic spatio-temporal model with varying spatial and temporal effects. To mitigate the effect of structural change, the forward search algorithm and bootstrap are embedded into the backfitting algorithm. Esmenda and Barrios (2018) also modeled the outbreak of epidemics with a spatio-temporal multilevel model using a similar algorithm.
These studies focus on robust estimation of time series with temporary structural change. It is imperative, however, in the presence of persistent structural change to incorporate information on the structural change to arrive at better predictive ability for estimated models.
2.2 Mixed Frequency Models
Ghysels et al. (2004) introduced mixed data sampling (MIDAS) regression, which involves time series data collected at different frequencies. MIDAS models specify the conditional expectation of the response variable, recorded at the lower frequency, as a function of distributed lag regressors of the higher frequency variable(s).
Mariano and Ozmucur (2015) combined mixed frequency modeling with dynamic latent factor modeling (DFM) to forecast Gross Domestic Product (GDP) in the Philippines. Monthly economic variables like inflation, money supply, interest rates, and exports were used to forecast GDP (measured quarterly). In comparison to MIDAS models, mixed data DFM requires a smaller data set but is more computationally intensive (Mariano & Ozmucur, 2020).
Benito et al. (2024) considered volatility modeling with mixed-frequency spatiotemporal models. High frequency data were introduced in the mean or variance equation of the volatility models through non-parametric smoothing functions. Simulations revealed that varying frequency data in the mean is more robust in a wider range of scenarios.
3 Mixed-Frequency Model with Structural Change
Following a framework similar to that of Benito et al. (2024), we propose two models: (i) varying frequency in the mean with structural change or VF-MeanS, and (ii) varying frequency in the variance with structural change or VF-VarianceS. The high frequency data are incorporated in the mean or variance components of the models through smoothing functions. While this paper extends the model of Benito et al. (2024), we account for structural change through a high frequency indicator. This helps mitigate the problem decision-makers face with the delayed impact of structural change on the low frequency indicator. Policy adjustments can be made ahead of the next measurement of the low frequency indicator, as long as the high frequency indicator aptly reflects the structural change. The structural change is then equated with the volatility behavior of the time series and is subsequently incorporated into the (mixed frequency) model.
The models accounting for structural change are compared to the baseline models using both simulated and actual data. In particular, the baseline models included ARMA(1,1) with intervention using a step function, and ARMA(1,1)-GARCH(1,1). Note further that the baseline parametric models are assumed to be characterized at lower frequencies.
The ARMA(1,1) with intervention is considered as a baseline model as it accounts for a parameter shift in the mean through a step function, a dummy variable that differentiates the time points pre- and post-structural change. Furthermore, ARMA(1,1)-GARCH(1,1) is also considered as another baseline model as it can account for changes in the conditional variance over time. Both models assume that all time series data are gathered at the same frequencies (low).
We assume that information from high frequency variables captures the dynamics of the structural change, so that sudden shifts in high frequency variables signal to the model that structural change has occurred prior to the next measurement of the low frequency response variable. Thereafter, an appropriate adjustment process will soon be realized in the dynamics of the low frequency system. As high frequency data are collected at shorter intervals, the proposed models contain more information on the structural change, and the use of smoothing functions to integrate high frequency data into the system allows complex non-linearities arising from the structural change to be incorporated into the model.
3.1 Varying Frequency in the Mean with Structural Change (VF-MeanS)
The VF-MeanS model is an extension of the ARMA(1,1)-GARCH(1,1) model that incorporates the high frequency variables into the conditional mean equation. High frequency variables manifest the change in the system resulting from a stimulus at almost real-time. Without aggregation of high frequency variables, sudden shifts can supplement the information from the lower frequency variables, aptly capturing the dynamics of the system. The approach of incorporating high frequency covariates through smoothing splines is not limited to an ARMA-GARCH framework, but is used in this paper to facilitate comparison with the baseline models.
Adding high frequency information through smoothing splines is motivated by the optimality properties of generalized additive models (GAM). Hastie and Tibshirani (1986) proposed GAMs, which depart from traditional likelihood-based models that assume a linear parametric form for the independent variables. In the GAM framework, the linear predictor \(\sum {\beta }_{j}{X}_{j}\) is replaced by a sum of smooth functions \(\sum {s}_{j}\left({X}_{j}\right).\) The smooth functions are then estimated non-parametrically using scatterplot smoothers, e.g., splines and kernels, providing more flexibility to better capture (possibly) nonlinear covariate effects.
The use of smoothing splines for the high frequency data further enhances the model since linearity and distributional assumptions are relaxed yielding more flexible models. The conditional mean equation of the VF-MeanS model is given in Eq. (1).
where \(d\) is the number of time points for high frequency indicators in-between two low frequency time points, \({x}_{i,t-1}\) are the high frequency predictors of \({y}_{t}\) that capture the structural change arising from specific stimuli, and \({a}_{t}\) follows an ARCH process so that \({a}_{t}={\sigma }_{t}{\epsilon }_{t}\), where \({\epsilon }_{t}\sim N(0,1)\) and \({\sigma }_{t}\) is a sequence of volatilities. The conditional variance equation of the VF-MeanS model then follows a GARCH(1,1) process as shown in Eq. (2) to capture volatility clustering.
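For reference, one form of Eqs. (1) and (2) that is consistent with the definitions above is sketched here. The smooth functions \({f}_{i}\) and the exact lag structure are assumptions inferred from the ARMA(1,1)-GARCH(1,1) description and from the parameter names \(({\phi }_{0},\phi ,\theta ,\omega ,\alpha ,\beta )\) used in the estimation algorithms, so this should be read as an illustration rather than a transcription of the original equations.

\[{y}_{t}={\phi }_{0}+\phi {y}_{t-1}+\theta {a}_{t-1}+\sum _{i=1}^{d}{f}_{i}\left({x}_{i,t-1}\right)+{a}_{t}\]

\[{\sigma }_{t}^{2}=\omega +\alpha {a}_{t-1}^{2}+\beta {\sigma }_{t-1}^{2}\]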
3.2 Varying Frequency in the Variance with Structural Change (VF-VarianceS)
The second model includes the higher frequency data in the conditional variance equation while retaining an ARMA(1,1) specification for the conditional mean. This extends a GARCH(1,1) model to include high frequency variables. We assume that the high frequency indicator is associated with the volatility resulting from structural change; hence, inclusion of high frequency variables in the conditional variance equation can address volatility clustering. The VF-VarianceS model is shown in Eq. 3 and Eq. 4. The conditional mean equation follows an ARMA(1,1) process as in Eq. 3, while its conditional variance equation is shown in Eq. 4:
where \(d\) is the number of time points for high frequency indicators in-between two low frequency time points. Note that \({a}_{t}\) follows an ARCH process such that \({a}_{t}={\sigma }_{t}{\epsilon }_{t}\), where \({\epsilon }_{t}\sim N(0,1)\) and \({\sigma }_{t}\) is a sequence of volatilities. \(\alpha\) and \(\beta\) are the usual GARCH(1,1) parameters. For comparability with baseline models, GARCH is considered but other conditional variance models can also be used.
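As an illustration only, a form of Eqs. (3) and (4) consistent with this description places the smooth terms in the variance equation instead of the mean. The functions \({f}_{i}\) and the lag structure are assumptions inferred from the text, not the original equations.

\[{y}_{t}={\phi }_{0}+\phi {y}_{t-1}+\theta {a}_{t-1}+{a}_{t}\]

\[{\sigma }_{t}^{2}=\omega +\alpha {a}_{t-1}^{2}+\beta {\sigma }_{t-1}^{2}+\sum _{i=1}^{d}{f}_{i}\left({x}_{i,t-1}\right)\]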
4 Model Estimation
The models are estimated through the iterative process of the backfitting algorithm. In an additive model, the backfitting algorithm is optimal and guaranteed to converge, see for example Buja et al. (1989) and Opsomer (2000).
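The classical backfitting iteration for an additive model can be sketched as follows. This is a minimal illustration under stated assumptions and not the estimation procedure of this paper: a Gaussian kernel (Nadaraya-Watson) smoother stands in for cubic regression splines, and the function names, bandwidth, tolerance, and iteration cap are hypothetical choices.

```python
import math


def smoother_matrix(x, bandwidth):
    """Row-normalized Gaussian kernel weights (Nadaraya-Watson) for one covariate."""
    rows = []
    for x0 in x:
        w = [math.exp(-0.5 * ((xi - x0) / bandwidth) ** 2) for xi in x]
        total = sum(w)
        rows.append([wi / total for wi in w])
    return rows


def backfit(y, columns, bandwidth=0.3, tol=1e-8, max_iter=50):
    """Fit y = alpha + sum_j f_j(x_j) + error by cycling over partial residuals."""
    n, p = len(y), len(columns)
    alpha = sum(y) / n
    smoothers = [smoother_matrix(x, bandwidth) for x in columns]
    f = [[0.0] * n for _ in range(p)]
    for _ in range(max_iter):
        change = 0.0
        for j in range(p):
            # Partial residuals: remove the intercept and every other smooth term.
            partial = [y[t] - alpha - sum(f[k][t] for k in range(p) if k != j)
                       for t in range(n)]
            new = [sum(w * r for w, r in zip(row, partial)) for row in smoothers[j]]
            center = sum(new) / n  # center each term so the intercept stays identifiable
            new = [v - center for v in new]
            change += sum((a - b) ** 2 for a, b in zip(new, f[j])) / n
            f[j] = new
        if change < tol:  # stop once the smooth terms no longer move
            break
    fitted = [alpha + sum(f[j][t] for j in range(p)) for t in range(n)]
    return alpha, f, fitted
```

Each pass smooths the partial residuals against one covariate at a time while holding the other terms fixed, which is the same cycle-and-update idea that the two estimation algorithms apply to the non-parametric, ARMA, and GARCH components.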
4.1 VF-MeanS
In the VF-MeanS model, the non-parametric components of the mean are first estimated by fitting the dependent variable on the high frequency variables in a generalized additive model (GAM), see Hastie and Tibshirani (1986).
The response variable is then updated with the residuals of the non-parametric model to allow for the estimation of the parametric components of the mean. Finally, the residuals of the parametric components of the mean are used to fit a GARCH model, e.g., Bollerslev (1986), to obtain the parametric components of the variance. Algorithm 1 summarizes the estimation procedure for the varying frequency in the mean model.
Algorithm 1
Step 1: Fit \({y}_{t}\) and \({x}_{i,t-1}\) into a GAM with cubic regression splines. This will yield the non-parametric components of the mean, which we denote as \({\widehat{y}}_{t,GAM}\) obtained from: \({\widehat{y}}_{t,GAM}={y}_{t}- {\epsilon }_{t,GAM}.\)
Step 2: Fit \({\widehat{y}}_{t,GAM}\) into an ARMA(1,1) model to obtain the remaining parametric components \((\widehat{{\phi }_{0}},\widehat{\phi },\widehat{\theta })\) of the mean. Then, obtain its residuals \({\epsilon }_{t,ARMA}\) and the fitted values of the ARMA(1,1) model by subtracting \({\epsilon }_{t,ARMA}\) from the original dependent variable. That is: \({\widehat{y}}_{t,ARMA}={y}_{t}-{\epsilon }_{t,ARMA}.\)
Step 3: By this time, the non-parametric and parametric components of the mean have already been obtained. The remaining step is to extract the parametric components of the variance equation. Hence, \({\epsilon }_{t,ARMA}\) is fitted into a GARCH(1,1) model to obtain the parametric components \(\left(\widehat{\omega },\widehat{\alpha }, \widehat{\beta }\right)\) of the variance equation. That is: \({\epsilon }_{t,ARMA}= {\widehat{y}}_{t,GARCH}+ {\epsilon }_{t,GARCH}.\) The fitted values \({\widehat{y}}_{t,GARCH}\) and its residuals \({\epsilon }_{t,GARCH}\) are also obtained.
Step 4: At this point, the components of the mean (non-parametric and parametric) and the components of the variance (parametric) have already been estimated. The dependent variable is then updated using \({y}_{t}\) and the residuals estimated from step 3 denoted by \({\epsilon }_{t,GARCH}\). That is: \({y}_{t,new}= {y}_{t}- {\epsilon }_{t,GARCH}\).
Step 5: The algorithm iterates from steps 1 to 4 with the new values obtained for the response variable in step 4. That is, \({y}_{t,new}\) will now be the \({y}_{t}\) in the first step of the algorithm. Iteration stops when the change in the Mean Squared Error (MSE) between successive iterations is minimal (MSE < 0.001). Finally, the fitted values of the algorithm are obtained by summing up the individual components separately estimated: \({\widehat{y}}_{t,Model 1}= {\widehat{y}}_{t,GAM}+ {\widehat{y}}_{t,ARMA}+ {\widehat{y}}_{t,GARCH}\). Note that the MSE is taken to be the average of the squared differences between the latest estimate of the model and the estimate from the previous iteration, such that \(MSE=\frac{1}{T}\sum _{t=1}^{T}{\left({y}_{t,new} -{y}_{t} \right)}^{2}.\)
4.2 VF-VarianceS
In the VF-VarianceS model, the parametric components in the mean equation are first estimated by fitting the dependent variable into an ARMA(1,1) model. Afterwards, the residuals of the ARMA(1,1) model are fitted into a GARCH(1,1) model to obtain the parametric components of the variance. The residuals of the GARCH(1,1) model are then obtained and fitted into a GAM, through smoothing functions, to yield the non-parametric components of the variance equation. In particular, estimation of VF-VarianceS model is summarized in Algorithm 2.
Algorithm 2
Step 1: Fit \({y}_{t}\) into an ARMA(1,1) model to obtain the parametric components of the mean \((\widehat{{\phi }_{0}},\widehat{\phi },\widehat{\theta })\), as well as the fitted values \({\widehat{y}}_{t,ARMA}\). That is: \({\widehat{y}}_{t,ARMA}= {y}_{t}-{\epsilon }_{t,ARMA}.\)
Step 2: Fit \({\epsilon }_{t,ARMA}\) into a GARCH(1,1) model to obtain the parametric components \(\left(\widehat{\omega },\widehat{\alpha }, \widehat{\beta }\right)\) of the variance equation. That is: \({\epsilon }_{t,ARMA}= {\widehat{y}}_{t,GARCH}+ {\epsilon }_{t,GARCH}.\) Obtain the residuals \({\epsilon }_{t,GARCH}\) and the fitted values \({\widehat{y}}_{t,GARCH}\).
Step 3: Fit \({\epsilon }_{t,GARCH}\) into a GAM using cubic regression splines with the high frequency variables \({x}_{i,t-1}\). That is: \({\epsilon }_{t,GARCH}= {\widehat{y}}_{t,GAM}+ {\epsilon }_{t,GAM}.\) This will yield the non-parametric components in the variance equation. The fitted values \({\widehat{y}}_{t,GAM}\) and the residuals \({\epsilon }_{t,GAM}\) are then obtained.
Step 4: At this point, we have the components of the mean (parametric), and the components of the variance (parametric and non-parametric). The dependent variable is then updated using \({y}_{t}\) and the residuals estimated from step 3 denoted by \({\epsilon }_{t,GAM}\). That is, the following equation is obtained: \({y}_{t,new}= {y}_{t}- {\epsilon }_{t,GAM}\).
Step 5: The algorithm iterates from steps 1 to 4 using the new values obtained for the response variable in step 4. That is, \({y}_{t,new}\) will now be the \({y}_{t}\) in the first step of the algorithm. The convergence criterion is set until there are minimal changes in the MSE (MSE < 0.001). The fitted values of the algorithm are obtained by summing up the individual components that were estimated separately: \({\widehat{y}}_{t,Model 2}= {\widehat{y}}_{t,ARMA}+{\widehat{y}}_{t,GARCH}+ {\widehat{y}}_{t,GAM}\). The MSE is defined in the same way as the VF-MeanS model.
5 Simulation Studies
The data generating process in the simulations induces a shift in the mean, variance, or autoregressive (AR) parameter at a certain structural break point. The daily data were first simulated to provide the high frequency information. A Geometric Brownian Motion (GBM) process was used to simulate the evolution of stock prices subject to assumed drift and volatility parameters, see for example Reddy and Clinton (2016) and Bratian et al. (2022). The GBM process is given in Eq. 5.
where \({X}_{t}\) is the simulated daily stock price at day \(t\),
\({Z}_{t}\) is generated from \(N(0,1)\),
\(t=1,\dots ,T\) in relation to the length of the time series,
\(j=1\) or \(2\) denotes the periods before and after the structural change,
\({\mu }_{j}\) is the daily drift parameter,
\({\sigma }_{j}\) is the daily volatility parameter,
\({X}_{0}\) is an initial value, and
\({v}_{t}\) is a noise term from \(U(-1,1)\).
Structural change was induced by shifting the mean (\({\mu }_{j}\)) or variance (\({\sigma }_{j}\)) parameter as the sample path evolves. Meanwhile, daily drift and volatility parameters were obtained from assumed annual levels, i.e., \({\mu }_{j}={\mu }_{daily}=\frac{{\mu }_{annual}}{255}\) and \({\sigma }_{j}= {\sigma }_{daily}=\frac{{\sigma }_{annual}}{\sqrt{255}}\).
To simulate a change in the AR parameter, \({Z}_{t}\) was assumed to follow an AR(1) process in Eq. 6. The structural change was then induced by shifting the value of the AR parameter \({\varphi }_{j}\).
The high frequency \(\{{X}_{t}\}\) were then sequentially grouped into sets of five, representing the trading days. Separate covariates were generated for every trading day (from Monday to Friday), which would contain the high frequency information. The lower frequency response variable, which approximates weekly stock returns, was then computed following Eq. 7.
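The simulation design described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the exact data generating process: it uses the textbook exact discretization of GBM with a one-day step, leaves out the uniform noise term \({v}_{t}\) because its placement in Eq. 5 is not stated in the text, takes the weekly return as the log difference of end-of-week prices, and uses the previous week's five daily prices as covariates. All function and parameter names are hypothetical.

```python
import math
import random

TRADING_DAYS = 255  # annualization constant used in the text


def simulate_daily_prices(n_days, break_day, mu_annual, sigma_annual, x0=100.0, seed=0):
    """Simulate daily GBM prices whose drift/volatility switch after `break_day`.

    mu_annual and sigma_annual are (before, after) pairs of annual levels.
    """
    rng = random.Random(seed)
    prices, level = [], x0
    for t in range(1, n_days + 1):
        j = 0 if t <= break_day else 1  # regime index: before or after the break
        mu = mu_annual[j] / TRADING_DAYS
        sigma = sigma_annual[j] / math.sqrt(TRADING_DAYS)
        z = rng.gauss(0.0, 1.0)
        level *= math.exp(mu - 0.5 * sigma ** 2 + sigma * z)
        prices.append(level)
    return prices


def to_weekly(prices, days_per_week=5):
    """Group daily prices into weeks; return weekly log returns and lagged covariates."""
    n_weeks = len(prices) // days_per_week
    weeks = [prices[k * days_per_week:(k + 1) * days_per_week] for k in range(n_weeks)]
    closes = [week[-1] for week in weeks]
    # Weekly return for week k (k >= 1): log difference of end-of-week prices (assumed form).
    returns = [math.log(closes[k] / closes[k - 1]) for k in range(1, n_weeks)]
    # Covariates for week k: the five daily prices of the preceding week (assumed form).
    covariates = weeks[:-1]
    return returns, covariates


# Example: mean shift from 10% to 30% annual drift halfway through 500 trading days,
# with annual volatility held at 5% (the volatility level here is an assumption).
prices = simulate_daily_prices(500, 250, (0.10, 0.30), (0.05, 0.05))
weekly_returns, weekly_covariates = to_weekly(prices)
```

Passing different (before, after) pairs for `sigma_annual` instead of `mu_annual` gives the variance-shift setting; the AR-parameter shift would additionally require replacing the independent draws `z` with an AR(1) recursion.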
Figures 1, 2 and 3 illustrate the simulated high frequency data, representing for example the stock price, as well as the dlog-transformed data, which approximate stock returns. For a change in the mean (Fig. 1), the variability of daily stock returns generally remains the same although the range of values widens in terms of possible gains and losses.
For a change in the variance (Fig. 2), daily stock returns show an obvious increase in volatility as the graph has a more erratic pattern post-structural change.
As for a change in the AR parameter (Fig. 3), stronger temporal dependencies post-structural change are demonstrated by the more persistent patterns over time for the daily stock returns.
Aside from the type of parameter shift, various break point locations and time series lengths were simulated. Temporal dependencies were also varied to evaluate the performance of the proposed models when there is autocorrelation. Different initial values were simulated for robustness of results. Note that 100 replicates were simulated per simulation setting. Table 1 summarizes the simulation settings examined in this study.
6 Results and Discussion
To compare the performance of the proposed models with the baseline models, data was split into a training set and testing set. The first \((T-4)\) observations served as the training set while the last four observations were taken to be the test set where out-of-sample predictive accuracy would be evaluated.
Mean Absolute Deviation (MAD) and Root Mean Square Error (RMSE) were used to assess in-sample model fit. MAD tracks the average of the absolute deviations between actual and fitted data:
Meanwhile, Mean Squared Error (MSE) takes the average of the squared errors between the actual and fitted values. Its square root then corresponds to the RMSE.
For out-of-sample forecast accuracy, Median Absolute Percentage Error (MdAPE) was used. The median was chosen because structural change and volatility clustering are prone to outlying values, which would skew mean-based forecast statistics. In particular, MdAPE is given in Eq. (10).
Model metrics per simulation setting were summarized by getting the mean of the MdAPE of each replicate (given 100 replicates per simulation scenario).
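The three accuracy measures can be computed as in the sketch below. The definitions follow the verbal descriptions above; the MdAPE is written in its usual form, the median of \(100\times |({y}_{t}-{\widehat{y}}_{t})/{y}_{t}|\), which is assumed to match Eq. (10). Function names are illustrative.

```python
import math
import statistics


def mad(actual, fitted):
    """Mean absolute deviation between actual and fitted values."""
    return sum(abs(a - f) for a, f in zip(actual, fitted)) / len(actual)


def rmse(actual, fitted):
    """Root mean squared error: square root of the average squared error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, fitted)) / len(actual))


def mdape(actual, forecast):
    """Median absolute percentage error; actual values must be nonzero."""
    return statistics.median(abs((a - f) / a) * 100 for a, f in zip(actual, forecast))
```

With a series `y` and forecasts for its last four points, the out-of-sample measure would be evaluated on `y[-4:]` while `mad` and `rmse` would be evaluated on the training portion `y[:-4]`.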
The performance of the models was evaluated based on simulated data from 288 scenarios specified in Table 1 and from the comparison with the baseline low frequency models.
6.1 Effect of Change in the Mean (In-Sample Forecast)
Table 2 shows the model performance metrics for a change in the mean from \({\mu }_{1}=10\%\) to \({\mu }_{2}=30\%\). The in-sample fit improves as the length of the time series increases, which is expected given that more data points provide more information on the data generating process. The VF-MeanS model has superior in-sample fit across all simulation settings compared to the baseline ARMA-Intervention model.
We also consider in Table 2 a change in the mean when the simulated data are autocorrelated with \(\varphi =0.5\). Temporal dependencies lead to a decline of in-sample predictive ability in comparison with the independent data, but VF-MeanS once again has superior in-sample predictions compared to the ARMA-Intervention model across all settings.
6.2 Effect of Change in the Variance (In-Sample Forecast)
Table 3 shows the model performance metrics for a change in the variance from \({\sigma }_{1}=5\%\) to \({\sigma }_{2}=15\%\) when the simulated data are independent. This setting simulates a scenario where the structural change leads to greater volatility in the simulated time series.
In-sample predictive ability also improves with longer time series while it deteriorates when the structural break point occurred in a distant past. Nevertheless, VF-VarianceS has lower MAD and RMSE statistics compared to the baseline GARCH model indicating improvement of in-sample predictive ability.
Table 3 also shows model performance when there is a change in the variance and autocorrelation is present (\(\varphi =0.5\)). The results mirror the case of an independent data generating process. In-sample predictions improve with longer time series and VF-VarianceS once again dominates GARCH.
6.3 Effect of Change in the Temporal Dependencies (In-Sample Forecast)
Table 4 shows in-sample predictive ability when there is a change in the AR parameter from \(\varphi =0\) to \(\varphi =0.5\), i.e., from an independent to an autocorrelated process. In-sample prediction improves as the time series increases in length. Similarly, in-sample predictive ability deteriorates when the structural break occurred in the distant past.
Table 5 shows model performance when there is a change in the AR parameter from \(\varphi =0.5\) to \(\varphi =0.9\). As expected, in-sample predictive ability of the models is inferior compared to the case where only a moderate shift in the AR parameter occurred. Still, in-sample predictive ability improves with longer time series. VF-MeanS is again the best model across all simulation settings in terms of in-sample predictive ability with a change in the AR parameter.
The simulations exhibit superior predictions from the proposed models compared to benchmark models regardless of the type of parameter shift, the structural break point, and the length of the series. This goes for the VF-MeanS model compared to the ARMA-Intervention model, as well as the VF-VarianceS model compared to the GARCH model. As for a shift in the AR parameter, the VF-MeanS model particularly dominates all the other models in terms of in-sample predictive ability. Also, the performance of the models improves when the time series is longer, while the presence of autocorrelation reduces in-sample predictive ability.
6.4 Effect of Change in the Mean (Out-of-Sample Forecast)
Consider a change in the mean from \({\mu }_{1}=10\%\) to \({\mu }_{2}=30\%\) when the data are independent. The VF-MeanS model has superior out-of-sample predictive ability relative to the ARMA-Intervention model when the structural change occurred more recently or the time series is short. The instances where ARMA-Intervention edged out VF-MeanS were cases where the structural change occurred in the more distant past, see Table 6 for details.
Even with autocorrelation (\(\varphi =0.5)\), the results mirror that of the independent data generating process when there is a shift in the mean, i.e., the VF-MeanS model has superior out-of-sample predictive performance compared to the ARMA-Intervention model, especially when the structural change is recent or the time series is short, see Table 6 for details.
6.5 Effect of a Change in the Variance (Out-of-Sample Forecast)
Despite its poorer in-sample predictive ability, the baseline GARCH model still has superior out-of-sample forecast accuracy across all simulation scenarios compared to the VF-VarianceS model (Table 7). This is true even with a major shift in variance and considering independent data.
The baseline GARCH model again outperforms the VF-VarianceS model in terms of out-of-sample predictive ability across all simulation scenarios with a major shift in the variance when the data are autocorrelated (Table 7). This confirms the earlier result that though the VF-VarianceS model has improved in-sample predictive ability, out-of-sample forecast is not necessarily enhanced compared to the baseline GARCH model.
6.6 Effect of Change in the AR Parameter (Out-of-Sample Forecast)
As for structural change with the presence of autocorrelation (\({\varphi }_{1}=0.0\) to \({\varphi }_{2}=0.5)\), the VF-MeanS and GARCH have comparable out-of-sample predictive performance. Across all simulation scenarios, VF-MeanS and GARCH alternate as the best and second-best models in terms of out-of-sample predictive ability (Table 8).
On the other hand, Table 9 shows a shift in the AR parameter from \({\varphi }_{1}=0.5\) to \({\varphi }_{2}=0.9\). This scenario is associated with extreme temporal dependencies as the AR parameter is nearly 1 post-structural change (i.e., approaching non-stationarity). Under this condition, the VF-MeanS and GARCH models again yielded the lowest out-of-sample forecast errors. Their performance is comparable as neither model explicitly accounts for a shift in the autoregressive parameter: the VF-MeanS mainly considers a shift in the mean parameter, while the baseline GARCH model allows for the conditional variance to vary over time.
7 Applications
The models are used to forecast economic and financial indicators, which are important inputs that help economic agents arrive at investment and policy decisions.
7.1 Financial Data
The models are used to predict stock returns for financial data with structural change. Weekly and daily levels of stock prices and returns were obtained to demonstrate the performance of the proposed models. A total of 485 daily observations, corresponding to 97 weekly observations, from January 1, 2019 to December 31, 2020 were collected for the Jollibee Foods Corporation (JFC) stock from published data.
The structural change occurred in March 2020 with the imposition of the community quarantines at the start of the COVID-19 pandemic. Figure 4 shows the high frequency data starting with the daily stock price (left panel) and the dlog-transformed data representing the daily stock returns (right panel). From as high as PhP325.2 per share in January 2019, JFC stock prices plunged to as low as PhP91.1 per share with the COVID-19 shock in March 2020. This is punctuated by a one-day loss of 28.4% when the enhanced community quarantine was announced by the Philippine national government. The stock returns graph also shows that the imposition of community quarantines resulted in more erratic behavior post-March 2020.
Figure 5 then shows the weekly price and stock returns of JFC, representing the low frequency information. The structural change is still evident with the rapid drop in the JFC stock price (left panel) and the more volatile behavior of weekly stock returns post-imposition of community quarantines (right panel).
JFC stock data from January 2019 to December 2020 were split into training and test sets. The first 93 weekly observations comprised the training set while the last 4 weekly observations were set aside for out-of-sample forecast evaluation. Note that the structural break occurred at the 60th observation out of 97 total weekly observations, coinciding with the imposition of enhanced community quarantine.
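The split described above can be sketched as follows. The counts (97 weekly observations, 4 held out, break at the 60th observation) come from the text; the function name and the use of a 0/1 step dummy for the intervention are assumptions made for illustration only.

```python
import numpy as np

def split_with_break(n_obs, n_test, break_obs):
    """Return train/test observation numbers and a 0/1 step dummy that
    switches on at the (1-based) break observation."""
    idx = np.arange(1, n_obs + 1)            # 1-based observation numbers
    step = (idx >= break_obs).astype(int)    # assumed step-type intervention
    train_idx, test_idx = idx[:-n_test], idx[-n_test:]
    return train_idx, test_idx, step

train_idx, test_idx, step = split_with_break(n_obs=97, n_test=4, break_obs=60)

# Number of post-break observations available for estimation.
post_break_train = int(step[: len(train_idx)].sum())
print(len(train_idx), len(test_idx), post_break_train)  # 93 4 34
```

Only 34 of the 93 training observations fall after the break, which is the kind of setting where the low frequency series alone carries limited information on the new regime.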
Model performance was then compared across the proposed (VF-MeanS and VF-VarianceS) and baseline models (ARMA-Intervention and GARCH) since the source of structural change is not known (shift in the mean, variance, or AR parameter, or combination).
The proposed models are superior to the benchmark models in terms of in-sample predictions, as shown by the MAD and RMSE statistics. This is consistent with the simulation results, as the high frequency data provide supplementary information on the structural change and better capture non-linear relationships through smoothing functions. As for out-of-sample predictive ability, VF-MeanS resulted in an MdAPE of 77.19%, outperforming all other models (Table 10).
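For reference, the three accuracy measures can be computed as in the sketch below. It assumes the usual definitions (MAD as the mean absolute deviation of the errors, RMSE as the root mean squared error, MdAPE as the median absolute percentage error); the exact definitions used in the study are given in an earlier section that is not reproduced here, and the numbers in the example are made up.

```python
import numpy as np

def mad(actual, predicted):
    """Mean absolute deviation between actual and predicted values."""
    a, p = np.asarray(actual, dtype=float), np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(a - p)))

def rmse(actual, predicted):
    """Root mean squared error."""
    a, p = np.asarray(actual, dtype=float), np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((a - p) ** 2)))

def mdape(actual, predicted):
    """Median absolute percentage error, in percent (actual values must be nonzero)."""
    a, p = np.asarray(actual, dtype=float), np.asarray(predicted, dtype=float)
    return float(np.median(np.abs((a - p) / a)) * 100.0)

# Made-up returns and forecasts, for illustration only.
actual = [0.02, -0.01, 0.04, -0.05]
predicted = [0.01, -0.02, 0.01, -0.02]
print(round(mad(actual, predicted), 4),
      round(rmse(actual, predicted), 4),
      round(mdape(actual, predicted), 2))
```

Percentage errors are scaled by the actual value, so returns close to zero can inflate MdAPE even when the absolute errors are small.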
7.2 Economic Indicators
In this application, the low frequency data are monthly and the high frequency data are weekly. Most official statistics and economic indicators are tracked at monthly (or even quarterly) intervals. The application therefore mimics a situation in which economic managers may make better decisions with supplementary information from high frequency data, say weekly data.
The low frequency dependent variable is the average monthly exchange rate between the US dollar and the Philippine peso. The high frequency independent variable is the common weekly gasoline pump price (RON 91) in Manila, as provided by the Philippine Department of Energy.
Per economic theory, higher fuel prices exert depreciation pressure on the local currency. The Philippines imports practically all of its fuel requirements, and because oil prices are quoted in US dollars, rising oil prices raise the demand for dollars. Ceteris paribus, a rise in oil prices translates to a depreciation of the local currency.
Average monthly foreign exchange data (USD/PhP) were obtained from January 2018 to April 2022, for a total of 52 observations. A total of 208 weekly observations on gasoline pump prices were also collected for the same period. Figure 6 shows the monthly foreign exchange rates, with a breakpoint in March 2021 corresponding to the military build-up of Russia at the Ukrainian border.
Figure 7 shows the weekly fuel pump prices for gasoline (RON 91) in the same time period. Once again, the structural breakpoint is shown in March 2021 with the escalation of Russia-Ukraine border tensions.
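One simple way to pair the two frequencies is sketched below. It assumes exactly four weekly observations per month, which is what the stated counts imply (52 × 4 = 208); the actual alignment rule used in the study is not stated in this section, and the array names and placeholder values are hypothetical.

```python
import numpy as np

n_months, weeks_per_month = 52, 4   # 52 * 4 = 208, matching the counts in the text

# Hypothetical placeholder for the 208 weekly pump prices, in time order.
weekly_prices = np.arange(n_months * weeks_per_month, dtype=float)

# Row m holds the weekly observations assumed to fall in month m.
X_high = weekly_prices.reshape(n_months, weeks_per_month)

print(X_high.shape)   # (52, 4)
print(X_high[1])      # second month: [4. 5. 6. 7.]
```

Each row can then be matched with the monthly exchange rate of the same month, so that the low frequency response has its within-month high frequency covariates alongside it.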
The monthly foreign exchange data from January 2018 to April 2022 were split into training and test sets. The first 47 monthly observations (of the dlog-transformed dataset) comprised the training set while the last 4 monthly observations were set aside for out-of-sample prediction. Note that the structural break occurs at the 38th observation out of 51 total monthly observations, denoting the escalation of border tensions between Russia and Ukraine. One observation was removed from the original 52 monthly observations given that first differencing was applied to achieve stationarity.
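The observation counts in this paragraph can be checked with a short sketch. The dlog transform is taken here to be the first difference of the natural logarithm; the helper names and the placeholder series are assumptions introduced for illustration.

```python
import numpy as np

def dlog(x):
    """First difference of the natural log of a series."""
    return np.diff(np.log(np.asarray(x, dtype=float)))

def obs_number(year, month, first_year, first_month):
    """1-based position of (year, month) in a monthly series that starts
    at (first_year, first_month)."""
    return (year - first_year) * 12 + (month - first_month) + 1

# Hypothetical placeholder for the 52 monthly USD/PhP averages (Jan 2018 to Apr 2022).
rates = np.linspace(50.0, 52.0, 52)

returns = dlog(rates)                      # differencing drops the first month
train, test = returns[:-4], returns[-4:]   # training and test observations
break_obs = obs_number(2021, 3, 2018, 2)   # March 2021 in the differenced series

print(len(returns), len(train), len(test), break_obs)  # 51 47 4 38
```

Because the differenced series starts in February 2018, March 2021 is its 38th observation, in agreement with the break position stated above.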
Model performance was again compared across the proposed (VF-MeanS and VF-VarianceS) and baseline models (ARMA-Intervention and GARCH). The proposed models are again superior to the benchmark models in terms of in-sample fit, as evidenced by the MAD and RMSE statistics. As for out-of-sample predictive ability, VF-MeanS resulted in an MdAPE of 93.10%, outperforming all the other models. The baseline GARCH model came in second with an MdAPE of 99.08% (see Table 11).
8 Conclusion
Under a structural change in the mean, the VF-MeanS model produced superior in-sample predictive ability across all scenarios compared to the benchmark ARMA-Intervention model. In-sample predictive ability also improved as the length of the time series increased. For out-of-sample predictive ability, the VF-MeanS model consistently produced superior forecasts relative to the baseline ARMA-Intervention model when the structural change is recent or the time series is short. In such cases, low frequency data alone provide minimal information on a structural change that is visible at high frequencies.
Structural change reflecting a shift in the variance is best captured in-sample by the conditional variance model VF-VarianceS, which has better in-sample predictive ability than the baseline GARCH model. Out-of-sample, however, the baseline GARCH model retained superior forecast accuracy (Table 7).
When structural change manifests as a shift in the AR parameter, the proposed models outperformed the baseline parametric models in terms of in-sample predictive ability. Meanwhile, the VF-MeanS model performed well in out-of-sample forecasts, comparable to the baseline GARCH model.
It is quite common for structural change to occur between measurements of economic and financial indicators. Meanwhile, lags in policy adjustment can aggravate the impact of structural change on policy outcome indicators before any intervention takes effect. When indicators that reflect structural change are available at high frequency, the model can capture these movements and help decision-makers and policy-makers anticipate their impact on the low frequency outcome indicator. With more observations at high frequencies, MIDAS can easily suffer from overparameterization. This can be resolved with a nonparametric link between the high frequency, structural change-sensitive indicator and the low frequency outcome indicator. Such a link captures the dynamic behavior of the system when indicators are compiled at varying frequencies.
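As a rough illustration of what a nonparametric link looks like in practice, the sketch below fits a smooth function with a Nadaraya-Watson kernel smoother. This is a stand-in chosen for brevity and is not the estimator used in the study; the data, the bandwidth, and all names are assumptions. The point is only that a single smoothing parameter replaces a separate coefficient for every high frequency lag.

```python
import numpy as np

def nw_smooth(x, y, grid, bandwidth):
    """Nadaraya-Watson estimate of E[y | x] at each point in `grid`,
    using a Gaussian kernel with the given bandwidth."""
    x, y, grid = (np.asarray(a, dtype=float) for a in (x, y, grid))
    u = (grid[:, None] - x[None, :]) / bandwidth
    w = np.exp(-0.5 * u ** 2)               # kernel weights, one row per grid point
    return (w @ y) / w.sum(axis=1)          # weighted average of y

# Simulated data with a nonlinear link (illustrative only).
rng = np.random.default_rng(1)
x = rng.uniform(-2.0, 2.0, size=200)                    # hypothetical high frequency indicator
y = np.tanh(2.0 * x) + rng.normal(0.0, 0.1, size=200)   # low frequency outcome

grid = np.array([-1.5, 0.0, 1.5])
fit = nw_smooth(x, y, grid, bandwidth=0.3)
print(np.round(fit, 2))
```

The fitted values follow the S-shaped relationship without any functional form being specified in advance, which a linear lag polynomial would not reproduce.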
Code Availability
Simulated data and codes that support findings of this study are available from: https://github.com/AMG824/Semi-parametric-VF-Models-with-Structural-Change/tree/GBM-Runs.
References
D'Agostino, A., Gambetti, L., & Giannone, D. (2013). Macroeconomic forecasting and structural change. Journal of Applied Econometrics, 28, 82–101.
Altansukh, G., & Osborn, D. (2022). Using structural break inference for forecasting time series. Empirical Economics, 63, 1–41.
Amsler, C., & Lee, J. (1995). An LM test for a unit root in the presence of a structural change. Econometric Theory, 11, 359–368.
Benito, J., Lansangan, J., & Barrios, E. (2024). Semiparametric volatility model with varying frequencies. forthcoming in Communications in Statistics - Simulation and Computation, https://doi.org/10.1080/03610918.2024.2356236
Bollerslev, T. (1986). Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics, 31(3), 307–327.
Bratian, V., Acu, A., Mihaiu, D., & Serban, R. (2022). A geometric Brownian motion (GBM) of stock indexes and financial market uncertainty in the context of non-crisis and financial crisis scenarios. Mathematics, 10, 309. https://doi.org/10.3390/math10030309.
Breymann, W., Dias, A., & Embrechts, P. (2003). Dependence structures for Multivariate high-frequency data in Finance. Quantitative Finance, 3(1), 1–14.
Buja, A., Hastie, T., & Tibshirani, R. (1989). Linear smoothers and Additive models. The Annals of Statistics, 17(2), 453–555.
Campano, W., & Barrios, E. (2011). Robust estimation of a Time Series Model with Structural Change. Journal of Statistical Computation and Simulation, 81(7), 909–927.
Chen, L., Wang, W., & Wu, W. (2022). Inference of breakpoints in high-dimensional time series. Journal of the American Statistical Association, 117(540), 1951–1963.
Coelho, E., & Nunes, L. (2011). Forecasting mortality in the event of a structural change. Journal of the Royal Statistical Society A, 174(3), 713–736.
Dacorogna, M., Gençay, R., Müller, U., Olsen, R., & Pictet, O. (2001). An introduction to high-frequency finance (p. 4). Academic Press.
Davis, R., Lee, T., & Rodriguez-Yam, G. (2006). Structural break estimation for nonstationary time series models. Journal of the American Statistical Association, 101(473), 223–239.
Doerr, B., Fischer, P., Hilbert, A., & Witt, C. (2017). Detecting structural breaks in time series via genetic algorithms. Soft Computing, 21, 4707–4720.
Durbin, J., & Koopman, S. (2012). Time Series Analysis by State Space methods (2nd ed.). Oxford University Press.
Engle, R. F. (2000). Econometrics of high frequency data. Econometrica, 68(1), 1–22.
Esmenda, M., & Barrios, E. (2018). Robust estimation of a Multilevel Model with Structural Change. Communications in Statistics – Simulation and Computation, 47(4), 1014–1027.
Ghysels, E., Santa-Clara, P., & Valkanov, R. (2004). The MIDAS Touch: Mixed Data Sampling Regression Model. CIRANO Working Paper, 2004s-20.
Hafner, C. M. (2009). GARCH modeling. In R. Meyers (Ed.), Complex Systems in Finance and Econometrics (pp. 464–483). Springer, New York, NY. https://doi.org/10.1007/978-1-4419-7701-4_26
Hansen, B. (2001). The New Econometrics of Structural Change: Dating breaks in U.S. Labor Productivity. Journal of Economic Perspectives, 15(4), 117–128.
Hastie, T., & Tibshirani, R. (1986). Generalized additive models. Statistical Science, 1(3), 297–318.
Huth, N., & Abergel, F. (2014). High frequency Lead/Lag relationships-empirical facts. Journal of Empirical Finance, 26, 41–58.
Karavias, Y., Narayan, P., & Westerlund, J. (2023). Structural breaks in interactive effects panels and the stock market reaction to COVID-19. Journal of Business and Economic Statistics, 41(3), 653–666.
Lumsdaine, R., Okui, R., & Wang, W. (2023). Estimation of panel group structure models with structural breaks in group memberships and coefficients. Journal of Econometrics, 233, 45–65.
Mahir, A., & Al-Khazaleh, A. (2008). Estimation of Missing Data by Using the Filtering Process in Time Series Modeling. Pre-print, arXiv: 0811.0659.
Mariano, R., & Ozmucur, S. (2015). High-mixed-frequency dynamic latent factor forecasting models for GDP in the Philippines. Estudios De Economia Aplicada, 33(3), 451–461.
Mariano, R., & Ozmucur, S. (2020). Predictive performance of mixed-frequency nowcasting and forecasting models with application to Philippine inflation and GDP growth. PIER Working Paper No. 20–029, Penn Institute for Economic Research, Department of Economics, University of Pennsylvania.
Opsomer, J. D. (2000). Asymptotic properties of Backfitting estimators. Journal of Multivariate Analysis, 73, 166–179. https://doi.org/10.1006/jmva.1999.1868.
Pesaran, M., Pettenuzzo, D., & Timmermann, A. (2006). Forecasting time series subject to multiple structural breaks. Review of Economic Studies, 73, 1057–1084.
Reddy, K., & Clinton, V. (2016). Simulating stock prices using geometric Brownian motion: Evidence from Australian companies. Australian Accounting Business and Finance Journal, 10(3), 23–47.
Safikhani, A., & Shojaie, A. (2022). Joint structural break detection and parameter estimation in high-dimensional nonstationary VAR models. Journal of the American Statistical Association, 117(537), 251–264.
Villejo, S., Barrios, E., & Lansangan, J. (2017). Robust estimation of a dynamic spatio-temporal Model with Structural Change. Journal of Statistical Computation and Simulation, 87(3), 505–518.
Yoon, G. (1998). Forecasting with structural change: Why is the random walk model so damned difficult to beat? Applied Economics Letters, 5(1), 41–42.
Funding
The authors did not receive support from any organization for the submitted work.
Open Access funding enabled and organized by CAUL and its Member Institutions
Ethics declarations
Ethical Approval
Simulated data was used in the assessment of the proposed methods. Published data are used in the illustration.
Conflict of Interest
The authors have no conflict of interest.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Glova, A.M.G., Barrios, E.B. Modelling Mixed-Frequency Time Series with Structural Change. Comput Econ (2024). https://doi.org/10.1007/s10614-024-10672-8