# Identification of asymmetric conditional heteroscedasticity in the presence of outliers


## Abstract

The identification of asymmetric conditional heteroscedasticity is often based on sample cross-correlations between past and squared observations. In this paper we analyse the effects of outliers on these cross-correlations and, consequently, on the identification of asymmetric volatilities. We show that, as expected, one isolated big outlier biases the sample cross-correlations towards zero and hence could hide a true leverage effect. By contrast, the presence of two or more big consecutive outliers could lead to detecting spurious asymmetries or asymmetries of the wrong sign. We also address the problem of robust estimation of the cross-correlations by extending some popular robust estimators of pairwise correlations and autocorrelations. Their finite sample resistance against outliers is compared through Monte Carlo experiments. Situations with isolated and patchy outliers of different sizes are examined. It is shown that a modified Ramsay-weighted estimator of the cross-correlations outperforms other estimators in identifying asymmetric conditionally heteroscedastic models. Finally, the results are illustrated with an empirical application.

## Keywords

Cross-correlations, Leverage effect, Robust correlations, EGARCH

## JEL Classification

C22

## 1 Introduction

One of the main topics on which Agustín’s research has focused over a long period of time is seasonality. However, this is not his only topic of interest. Agustín’s contributions to the Econometric Time Series literature are much broader and include, among others, the treatment of outliers in time series; see, for example, Maravall and Peña (1986), Peña and Maravall (1991), Gómez et al. (1999) and Kaiser and Maravall (2003). In these papers, Agustín and his coauthors consider the effects and treatment of outliers in macroeconomic data and, consequently, deal primarily with linear time series models. However, outliers are also present in financial time series, mainly when these are observed over long periods of time. It is important to note that, in this framework, the interest shifts from conditional means to conditional variances and, consequently, to non-linear models. Agustín has also contributed to this area; see Fiorentini and Maravall (1996) for an analysis of the dynamic dependence of second order moments.

When dealing with financial data, many series of returns are conditionally heteroscedastic with volatilities responding asymmetrically to negative and positive past returns. In particular, the volatility is higher in response to past negative shocks (‘bad’ news) than to positive shocks (‘good’ news) of the same magnitude. Following Black (1976) this feature is commonly referred to as *leverage effect*. Incorporating the leverage effect into conditionally heteroscedastic models is important to better capture the dynamic behaviour of financial returns and improve the forecasts of future volatility; see Bollerslev et al. (2006) for an extensive list of references and Hibbert et al. (2008) for a behavioral explanation of the negative asymmetric return–volatility relation. The identification of conditional heteroscedasticity is often based on the sample autocorrelations of squared returns. Carnero et al. (2007) show that the presence of outliers biases these autocorrelations with misleading effects on the identification of time-varying volatilities. On the other hand, the identification of leverage effect is often based on the sample cross-correlations between past and squared returns. Negative values of these cross-correlations indicate potential asymmetries in the volatility; see, for example, Bollerslev et al. (2006), Zivot (2009), Rodríguez and Ruiz (2012) and Tauchen et al. (2012). In this paper, we analyse how the identification of asymmetries, when based on the sample cross-correlations, can also be affected by the presence of outliers.

This paper has two main contributions. First, we derive the asymptotic biases caused by large outliers on the sample cross-correlation of order *h* between past and squared observations generated by uncorrelated stationary processes. We show that *k* large consecutive outliers bias such correlations towards zero for \(h\ge k\), making it difficult to detect a genuine leverage effect. In particular, one isolated large outlier biases all the sample cross-correlations towards zero and so it could hide a true leverage effect. Moreover, the presence of two big consecutive outliers biases the first-order sample cross-correlation towards 0.5 (\(-0.5\)) if the first outlier is positive (negative) and so it could lead to identifying either spurious asymmetries or asymmetries of the wrong sign.

The second contribution of this paper is to address the problem of robust estimation of serial cross-correlations by extending several popular robust estimators of pairwise correlations and autocorrelations. In the context of bivariate Gaussian variables, there are several proposals to robustify the pairwise sample correlation; see Shevlyakov and Smirnov (2011) for a review of the most popular ones. However, the literature on robust estimation of correlations for time series is scarce and mainly focused on autocovariances and autocorrelations. For example, Hallin and Puri (1994) propose to estimate the autocovariances using rank-based methods. Ma and Genton (2000) introduce a robust estimator of the autocovariances based on the robust scale estimator of Rousseeuw and Croux (1992, 1993). More recently, Lévy-Leduc et al. (2011) establish its asymptotic and finite sample properties for Gaussian processes. Ma and Genton (2000) also suggest a possible robust estimator of the autocorrelation function, but they neither discuss its properties further nor apply it in their empirical application. Finally, Teräsvirta and Zhao (2011) propose two robust estimators of the autocorrelations of squares based on Huber’s and Ramsay’s weighting schemes. The theoretical and empirical evidence from all these papers strongly suggests using robust estimators to measure the dependence structure of time series.

We analyse and compare the finite sample properties of the proposed robust estimators of the cross-correlations between past and squared observations of stationary uncorrelated series. As expected, these estimators are resistant to outliers: they remain nearly the same regardless of the size and the number of outliers. Moreover, even in the presence of consecutive large outliers, the robust estimators considered estimate the true sign of the cross-correlations, although they underestimate their magnitudes. Among the robust cross-correlations considered, the modified version of the Ramsay-weighted serial autocorrelation suggested by Teräsvirta and Zhao (2011) provides the best resistance against outliers and the lowest bias.

To illustrate the results, we compute the sample cross-correlations and their robust counterparts for a real series of daily financial returns. We show how consecutive extreme observations bias the usual sample cross-correlations and could lead to wrongly identifying a potential leverage effect. These empirical results underline the importance of using robust measures of serial correlation to identify both conditional heteroscedasticity and leverage effect.

The rest of the paper is organized as follows. Section 2 is devoted to the analysis of the effects of additive outliers on the sample cross-correlations between past and squared observations of stationary uncorrelated time series that could be either homoscedastic or heteroscedastic. Section 3 considers four robust measures of cross-correlation and compares their finite sample properties in the presence of outliers. The difficulty of extending the Ma and Genton (2000) proposal to the estimation of serial cross-correlation is discussed in Sect. 4. The empirical analysis of a time series of daily Dow Jones Industrial Average index is carried out in Sect. 5. Section 6 concludes the paper with a summary of the main results and proposals for further research.

## 2 Effects of outliers on the identification of asymmetries

In this section, we derive analytically the effect of large additive outliers on the sample cross-correlations between past and squared observations generated by uncorrelated stationary processes that could be either homoscedastic or heteroscedastic. The main results are illustrated with some Monte Carlo experiments.

### 2.1 Asymptotic effects

*k* consecutive outliers with the same sign and size, \(\omega \). The observed series is then given by

*h*, \(h\ge 1,\) between past and squared observations of \(z_{t}\), which is given by

*T*, is large relative to the cross-correlation order, *h*,

*h* is smaller than the number of consecutive outliers, i.e. \(h<k\), expression (4) can be written in terms of the original uncontaminated series \(y_{t}\), as follows

*k* large consecutive outliers always biases \(r_{12}(h)\) towards zero for lags \(h\ge k\) and, for smaller lags, it generates positive or negative cross-correlations depending on whether the outliers are positive or negative. For example, if *T* is large, two huge positive (negative) consecutive outliers generate a first order cross-correlation tending to 0.5 (\(-0.5\)), with all the others being close to zero; see Maronna et al. (2006) and Carnero et al. (2007) for a similar result in the context of sample autocorrelations of levels and squares, respectively. Therefore, if a heteroscedastic time series without leverage effect or an uncorrelated homoscedastic series is contaminated by several large negative consecutive outliers, the negative cross-correlations generated by the outliers can be confused with asymmetric conditional heteroscedasticity.

^{1}In practice, we will not face such huge outliers as to reach the limiting values of \(r_{12}(h)\) in (7), but the result is still useful because it provides a clue on the direction of the bias of the cross-correlations.

Note also that the results above are still valid if the outliers have different sizes. In this case, we can write \(\omega _{t}=\omega +\delta _{t}\) in (1) instead of \(\omega \) and the results will be the same when \(|\omega |\rightarrow \infty .\)
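
The limiting values quoted above can be checked with a back-of-the-envelope calculation. The sketch below is a heuristic, not the formal derivation: it assumes that the sample cross-correlation has the usual moment form (the symbols \(\bar{z}\), \(\overline{z^{2}}\) and the outlier position \(\tau\) are introduced here only for illustration), that the *k* outliers share the same sign and size \(\omega\), and it keeps only the leading terms in \(\omega\).

```latex
% Assumed (standard) form of the sample cross-correlation of order h
\[
r_{12}(h)=\frac{\sum_{t=h+1}^{T}(z_{t-h}-\bar{z})\,(z_{t}^{2}-\overline{z^{2}})}
{\sqrt{\sum_{t=1}^{T}(z_{t}-\bar{z})^{2}\;\sum_{t=1}^{T}(z_{t}^{2}-\overline{z^{2}})^{2}}}
\]

% Leading terms when z_t = y_t + \omega for t = \tau, ..., \tau+k-1,
% |\omega| is large and T is large relative to k
\[
\sum_{t=1}^{T}(z_{t}-\bar{z})^{2}\approx k\,\omega^{2},\qquad
\sum_{t=1}^{T}(z_{t}^{2}-\overline{z^{2}})^{2}\approx k\,\omega^{4},\qquad
\sum_{t=h+1}^{T}(z_{t-h}-\bar{z})(z_{t}^{2}-\overline{z^{2}})\approx
\begin{cases}
(k-h)\,\omega^{3}, & h<k,\\
O(\omega^{2}), & h\ge k.
\end{cases}
\]

% Ratio of the leading terms
\[
r_{12}(h)\approx
\begin{cases}
\operatorname{sign}(\omega)\,\dfrac{k-h}{k}, & h<k,\\[6pt]
0, & h\ge k.
\end{cases}
\]
```

With \(k=2\) and \(h=1\) the ratio is \(\pm 1/2\), in line with the values 0.5 and \(-0.5\) mentioned in the text, and with \(k=1\) every lag falls in the second case, so all cross-correlations are pushed towards zero.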

### 2.2 Finite sample effects

^{2}

*h*, we compute their average over all replicates. The first row of Fig. 1 plots the average sample cross-correlations from the uncontaminated white noise process (left panel) and for the uncontaminated EGARCH process (right panel). The average sample cross-correlations computed from the corresponding contaminated series with one and two outliers are plotted in the second and third rows, respectively. In all cases, the red solid line represents the true cross-correlations.

As we can see, when a series generated by the EGARCH model is contaminated with one single large negative outlier, we may wrongly conclude that there is no leverage effect, since all the cross-correlations become nearly zero. On the other hand, when the series is contaminated with two consecutive outliers of different sign, the first one being negative, only the first cross-correlation will be different from zero and approximately equal to \(-0.5\), regardless of whether the series is homoscedastic or heteroscedastic. Therefore, in this case, we can identify either a negative leverage effect when there is none (the series is truly a Gaussian white noise) or a much more negative leverage effect than the actual one (as in the case of the EGARCH model). Similar results would be obtained if the two outliers were positive, but in this case the first cross-correlation would be biased towards 0.5. Consequently, we could wrongly identify asymmetries in a series that is actually white noise or we could identify a positive leverage effect when it is truly negative, as in the EGARCH process.

As we can see, the sample cross-correlations start being distorted when the outliers are larger (in absolute value) than 5 standard deviations. Furthermore, when the size of the outliers is over 20, the corresponding sample cross-correlations are already quite close to their limiting values (\(-0.5\) in the first order cross-correlation and 0 in the second order cross-correlation). Moreover, the size of two consecutive outliers does not need to be very large to distort the first order sample cross-correlation. However, a single outlier needs to be of larger magnitude to bias this correlation towards zero. In homoscedastic series, two consecutive outliers have a tremendous effect on the first order sample cross-correlation, even if they are not very big, and could lead to wrongly identifying asymmetries in a series that is actually white noise. On the other hand, the first cross-correlations of a heteroscedastic series contaminated with one single outlier as big as 15 or 20 could be confused with those of a white noise. Similar results would be obtained if the series were contaminated with positive outliers, but they are not reported here to save space.
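
The dependence of the bias on the outlier size can be reproduced on a small scale for the homoscedastic case. The following is a minimal sketch, assuming the usual moment form of the sample cross-correlation and a single simulated Gaussian white noise series; the function names `cross_corr` and `contaminate`, the seed and the position of the outliers are illustrative choices, and the EGARCH case is not covered.

```python
import math
import random


def cross_corr(z, h=1):
    """Sample cross-correlation of order h between z[t-h] and z[t]**2."""
    n = len(z)
    sq = [v * v for v in z]
    mean_z = sum(z) / n
    mean_sq = sum(sq) / n
    num = sum((z[t - h] - mean_z) * (sq[t] - mean_sq) for t in range(h, n))
    den = math.sqrt(
        sum((v - mean_z) ** 2 for v in z) * sum((s - mean_sq) ** 2 for s in sq)
    )
    return num / den


def contaminate(y, start, sizes):
    """Return a copy of y with additive outliers sizes[i] placed at start + i."""
    z = list(y)
    for i, w in enumerate(sizes):
        z[start + i] += w
    return z


rng = random.Random(12345)  # arbitrary seed, for reproducibility only
T = 1000
y = [rng.gauss(0.0, 1.0) for _ in range(T)]  # Gaussian white noise, unit variance

for omega in (0, 5, 10, 20, 50):
    single = cross_corr(contaminate(y, T // 2, [-omega]))
    patch = cross_corr(contaminate(y, T // 2, [-omega, omega]))
    print(f"omega={omega:2d}  one outlier: {single:+.3f}  two outliers: {patch:+.3f}")
```

For a finite sample the first-order cross-correlation under two consecutive outliers stays somewhat above \(-0.5\) in absolute terms, because the uncontaminated observations still contribute to the denominator.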

## 3 Robust cross-correlations

In the previous section we have shown that the sample cross-correlations between past and squared observations of a stationary uncorrelated series are very sensitive to the presence of outliers and could lead to a wrong identification of asymmetries. In this section we consider robust cross-correlations to overcome this problem. In particular, we generalize some of the robust estimators for the pairwise correlations described in Shevlyakov and Smirnov (2011) and one of the robust autocorrelations proposed by Teräsvirta and Zhao (2011). We discuss their finite sample properties and compare them to the properties of the sample cross-correlations.

### 3.1 Extensions of robust correlations

*sample cross-correlation median estimator:*

*med*(*x*) stands for the sample median of *x* and *MAD* denotes the sample median absolute deviation, i.e. \(MAD(x)=med(|x-med(x)|)\). Unless otherwise stated, the median is calculated over the whole sample. When the median is calculated over a subsample, this is specifically stated, as in (9), where \(med_{t\in \{h+1,...,T\}}\) denotes the sample median calculated over the subsample indexed by \(t\in \{h+1,...,T\}\).
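
The building blocks just described can be sketched in a few lines. The code below assumes that the estimator in (9) has the usual comedian form, i.e. the median over \(t\in \{h+1,...,T\}\) of the products of median-centred values of \(y_{t-h}\) and \(y_{t}^{2}\), divided by the product of the two MADs; this form, the function names `mad` and `comedian_cross_corr`, and the simulated data are assumptions made for illustration.

```python
import random
import statistics


def mad(x):
    """Median absolute deviation med(|x - med(x)|), with no scaling constant."""
    m = statistics.median(x)
    return statistics.median([abs(v - m) for v in x])


def comedian_cross_corr(y, h=1):
    """Comedian-type cross-correlation between y[t-h] and y[t]**2 (assumed form)."""
    sq = [v * v for v in y]
    med_y = statistics.median(y)    # median over the whole sample
    med_sq = statistics.median(sq)  # median over the whole sample
    # Median of the centred products over the subsample t = h+1, ..., T
    products = [(y[t - h] - med_y) * (sq[t] - med_sq) for t in range(h, len(y))]
    return statistics.median(products) / (mad(y) * mad(sq))


rng = random.Random(2024)  # arbitrary seed
y = [rng.gauss(0.0, 1.0) for _ in range(1000)]

z = list(y)  # same series with two large consecutive outliers
z[500] -= 50.0
z[501] += 50.0

clean = comedian_cross_corr(y)
dirty = comedian_cross_corr(z)
print(f"uncontaminated: {clean:+.4f}  contaminated: {dirty:+.4f}")
```

Because only medians enter the calculation, replacing two observations out of a thousand can move the estimate only slightly, whatever the size of the outliers.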

*Blomqvist cross-correlation coefficient:*

*X* and *Y*, denoted by \(\rho \), can also be based on a scale approach, by means of the following identity:

*X* and *Y*, respectively. In order to get robust estimators for \(\rho \), Gnanadesikan and Kettenring (1972) propose replacing the variances and standard deviations in (11) and (12), respectively, by robust estimators as follows

*median cross-correlation coefficient*, would be:

*h* proposed is given by

### 3.2 Monte Carlo experiments

Several conclusions emerge from Fig. 3. First, as expected, robust measures of cross-correlations are resistant to the presence of outliers, either isolated or in patches; note that the plots displayed in the first row are nearly the same as those displayed in the other two rows. Second, in EGARCH processes, the robust cross-correlations estimate the sign of the true cross-correlations properly but they underestimate their magnitude. In fact, the first three robust cross-correlations (\(r_{12,COMED}\), \(r_{12,B}\) and \(r_{12,MED}\)) estimate asymmetries which are much weaker than the true ones. However, the weighted cross-correlation, \(r_{12,W},\) performs quite well because its bias is much lower than those of the other robust measures, even in the presence of two big consecutive outliers. Actually, the values of \(r_{12,W}\) are very close to their theoretical counterparts. This could be due to the fact that the first three robust measures considered are direct extensions of the corresponding robust estimators originally designed to estimate the pairwise correlation coefficient for a bivariate Gaussian distribution. In such a framework, some of these measures, like the Blomqvist quadrant correlation and the median correlation coefficient, are asymptotically minimax with respect to bias or variance. However, in time series data and, in particular, in conditionally heteroscedastic time series, none of these assumptions hold and hence the behaviour of these measures is not as good as postulated for the bivariate Gaussian case. By contrast, the Ramsay-weighted autocorrelation estimator proposed by Teräsvirta and Zhao (2011) was already designed to cope with time series data, and this could be the reason for the good performance of \(r_{12,W}\) in estimating cross-correlations.

We also perform a similar analysis to that in Sect. 2.2, by studying the effect of the size of the outliers on the four robust cross-correlations for the two types of contamination, namely contamination with one isolated outlier of size \(\{-\omega \}\) and with two consecutive outliers of sizes \(\{-\omega ,\omega \}\), where \(\omega \in \{1,2,\ldots ,50\}\). The results, which are not displayed here to save space but are available upon request, are as expected. Robust cross-correlations remain the same regardless of the size and the number of outliers. Moreover, they all underestimate the magnitude of the leverage effect, but the bias in the weighted cross-correlation, \(r_{12,W},\) is negligible compared with that of the alternative robust cross-correlations considered.
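
A scaled-down version of this kind of experiment can be sketched as follows for the white noise case only. The sketch compares the sample cross-correlation with a sign-based (Blomqvist-type) cross-correlation, taken here to be the average of \(sign(y_{t-h}-med(y))\,sign(y_{t}^{2}-med(y^{2}))\); this form, the function names, the number of replicates and the sample size are assumptions chosen to keep the run short, not the settings used in the paper.

```python
import math
import random
import statistics


def cross_corr(z, h=1):
    """Sample cross-correlation of order h between z[t-h] and z[t]**2."""
    n = len(z)
    sq = [v * v for v in z]
    mean_z = sum(z) / n
    mean_sq = sum(sq) / n
    num = sum((z[t - h] - mean_z) * (sq[t] - mean_sq) for t in range(h, n))
    den = math.sqrt(
        sum((v - mean_z) ** 2 for v in z) * sum((s - mean_sq) ** 2 for s in sq)
    )
    return num / den


def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def blomqvist_cross_corr(z, h=1):
    """Sign-based (quadrant) cross-correlation between z[t-h] and z[t]**2."""
    n = len(z)
    sq = [v * v for v in z]
    med_z = statistics.median(z)
    med_sq = statistics.median(sq)
    total = sum(sign(z[t - h] - med_z) * sign(sq[t] - med_sq) for t in range(h, n))
    return total / (n - h)


def monte_carlo(sizes, reps=200, T=500, seed=1):
    """Mean and standard deviation of both estimators over `reps` replicates."""
    rng = random.Random(seed)
    classic, robust = [], []
    for _ in range(reps):
        z = [rng.gauss(0.0, 1.0) for _ in range(T)]
        for i, w in enumerate(sizes):  # additive outliers in the middle of the sample
            z[T // 2 + i] += w
        classic.append(cross_corr(z))
        robust.append(blomqvist_cross_corr(z))
    return (
        statistics.mean(classic),
        statistics.stdev(classic),
        statistics.mean(robust),
        statistics.stdev(robust),
    )


scenarios = {"none": [], "one": [-50.0], "two": [-50.0, 50.0]}
results = {label: monte_carlo(sizes) for label, sizes in scenarios.items()}
for label, (m_c, s_c, m_r, s_r) in results.items():
    print(f"{label:>4}: sample {m_c:+.4f} ({s_c:.4f})   sign-based {m_r:+.4f} ({s_r:.4f})")
```

Using the same seed in every scenario means that the three contamination schemes are applied to the same underlying replicates, so differences between rows reflect the outliers only.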

^{3} Table 1 reports the Monte Carlo means and standard deviations (in parentheses) of the first order sample cross-correlation, as defined in (2), and of the four robust cross-correlations introduced in Sect. 3.1, for the two processes, Gaussian white noise and EGARCH, and for the two types of contamination; Fig. 4 displays the corresponding box-plots.

Table 1. Monte Carlo means and standard deviations (in parentheses) of several estimators of the first-order cross-correlation between past and current squared observations from uncorrelated stationary processes

Estimator | No outliers \(w=0\) | One single outlier \(w=-50\) | Two consecutive outliers \(w=\{-50,50\}\) |
---|---|---|---|
White noise (\(\rho _{12}(1)=0\)) | | | |
\(r_{12}(1)\) | 0.0014 (0.0313) | \(-0.0009\) (0.0168) | \(-0.4548\) (0.0112) |
\(r_{12,COMED}(1)\) | 0.0005 (0.0247) | 0.0005 (0.0247) | 0.0002 (0.0246) |
\(r_{12,B}(1)\) | 0.0004 (0.0311) | 0.0005 (0.0312) | 0.0006 (0.0311) |
\(r_{12,MED}(1)\) | 0.0014 (0.0504) | 0.0012 (0.0504) | 0.0012 (0.0502) |
\(r_{12,W}(1)\) | 0.0015 (0.0334) | 0.0015 (0.0322) | 0.0015 (0.0320) |
EGARCH (\(\rho _{12}(1)=-0.0662\)) | | | |
\(r_{12}(1)\) | \(-0.0606\) (0.0555) | \(-0.0007\) (0.0171) | \(-0.4562\) (0.0172) |
\(r_{12,COMED}(1)\) | \(-0.0225\) (0.0250) | \(-0.0227\) (0.0249) | \(-0.0233\) (0.0250) |
\(r_{12,B}(1)\) | \(-0.0309\) (0.0301) | \(-0.0303\) (0.0308) | \(-0.0311\) (0.0307) |
\(r_{12,MED}(1)\) | \(-0.0252\) (0.0497) | \(-0.0255\) (0.0497) | \(-0.0251\) (0.0497) |
\(r_{12,W}(1)\) | \(-0.0551\) (0.0377) | \(-0.0596\) (0.0421) | \(-0.0590\) (0.0442) |
\(r_{12,Q}(1)\) | \(-0.0189\) (0.0340) | \(-0.0190\) (0.0340) | \(-0.0190\) (0.0340) |

As expected, when the series is a homoscedastic Gaussian white noise and there are no outliers or there is one isolated outlier, all estimators behave similarly and the sample cross-correlations perform very well. Note that, in this case, the sample correlation is the maximum likelihood estimator of its theoretical counterpart and therefore it is consistent and asymptotically unbiased and efficient. By contrast, the robust estimators have, in general, slightly larger dispersion since they are not as efficient as maximum likelihood estimators. However, when there are two consecutive outliers, the sample cross-correlation breaks down and becomes unreliable: its distribution is completely pushed downwards and it would be estimating a large negative asymmetry when there is none. All the robust estimators considered, on the other hand, perform very well in terms of bias, and \(r_{12,COMED}(1)\) also performs quite well in terms of variance.

When the simulated process is an EGARCH, a different picture emerges. When the series is not contaminated, both the sample cross-correlation, \(r_{12}(1)\), and the weighted cross-correlation, \(r_{12,W}(1)\), perform better than the other robust measures, which were originally designed to estimate pairwise correlations in bivariate Normal distributions. However, when the EGARCH series is contaminated by one single negative outlier, the sample cross-correlation is pushed upwards towards zero, as postulated from the theoretical results in Sect. 2, and it would be unable to detect the true leverage effect in the data. The situation becomes even worse in the presence of two consecutive outliers, where the sample cross-correlation becomes completely unreliable due to its huge negative bias. As expected, the distribution of all the robust cross-correlations remains nearly the same regardless of the presence of outliers. However, the estimators \(r_{12,COMED}(1)\), \(r_{12,B}(1)\) and \(r_{12,MED}(1)\), in spite of their robustness, are biased upwards towards zero and so they will underestimate the true leverage effect. By contrast, the weighted sample cross-correlation with the modified Ramsay’s weights, \(r_{12,W}(1)\), performs surprisingly well in terms of bias, even in the presence of two big outliers. As with the simulated white noise process, the estimator \(r_{12,MED}(1)\) has the largest standard deviation of all the estimators considered; see Table 1. Therefore, it seems that the robust cross-correlation \(r_{12,W}(1)\) is preferable to any other measure considered in this section for the identification of asymmetries in conditionally heteroscedastic models.

## 4 Discussion

In the previous section, we analysed the finite sample performance of several robust estimators of the cross-correlations, including the estimator in (13) with \(\widehat{S}\) defined as Hampel’s median of absolute deviations. Other possible choices for \(\widehat{S}\) are the robust scale estimators \(S_{n}\) and \(Q_{n}\) proposed by Rousseeuw and Croux (1993). Shevlyakov and Smirnov (2011) show that the robust estimator of the pairwise correlation between bivariate Gaussian variables based on \(Q_{n}\) performs better than other robust correlation estimators. Ma and Genton (2000) suggest bringing this approach to the estimation of the autocorrelation of Gaussian time series. In this section, we show that this extension is not so straightforward when the processes involved are non-Gaussian.

*k*-th order statistic of *X*, \(k\approx \binom{n}{2}/4\) for large *n* and \(c(F_{X})\) is a constant, depending on the shape of the distribution function \(F_{X}\), that is introduced to achieve Fisher consistency. In particular, if \(F_{X}\) belongs to the location-scale family \(F_{\mu ,\sigma }(x)=F((x-\mu )/\sigma )\), the constant is chosen as follows

*X* and \(X^{\prime }\) independent random variables with distribution function *F*; see Rousseeuw and Croux (1993). In particular, in the Gaussian case \((F=\Phi )\), the constant is:

^{4}

*X* and *Y* in (12) represent two variables, \(Y_{t-h}\) and \(Y_{t}\), with the same model distribution and, consequently, \(\sigma _{X}=\sigma _{Y}\). Therefore, using identity (12) with \(\sigma _{X}=\sigma _{Y}\), plugging the scale estimator \(Q_{n}\) in (13) and taking into account that \(Q_{n}\) is affine equivariant, i.e. \(Q_{n}(aX+b)=|a|Q_{n}(X)\), the robust estimator of \(\rho (h)\) would be:

*X* and *Y* in (12) will stand for \(Y_{t-h}\) and \(Y_{t}^{2}\) and so the constraint \(\sigma _{X}=\sigma _{Y}\) no longer holds. Hence, the first step to compute the robust estimator of \(\rho _{12}(h)=Corr(Y_{t-h},Y_{t}^{2}),\) based on identities (11) and (12), will be to ‘standardize’ the two series involved. Let \(\widetilde{Y_{t}}=Y_{t}/Q_{T}(\mathbf {y})\) and \(\widetilde{Y_{t}^{2}}=Y_{t}^{2}/Q_{T}(\mathbf {y}^{2})\) denote the robust ‘standardized’ forms of the series \(Y_{t}\) and \(Y_{t}^{2}\), respectively, where \(Q_{T}(\mathbf {y})=c(F_{Y})Q_{T}^{*}(\mathbf {y})\) and \(Q_{T}(\mathbf {y}^{2})=c(F_{Y^{2}})Q_{T}^{*}(\mathbf {y}^{2})\). The second step will be to form the vector of sums and the vector of differences:

Therefore, it is clear that the estimator (19) will require computing four constants, \(c(F_{Y})\), \(c(F_{Y^{2}})\), \(c(F_{\widetilde{U}})\) and \(c(F_{\widetilde{V}})\), but this task seems to be infeasible. Note that, even if *Y* were Gaussian, \(Y^{2}\) would no longer be Gaussian, and neither would \(Y+Y^{2}\) or \(Y-Y^{2}\). Moreover, even if we could compute the constants in such a case, the assumption of Gaussianity for *Y* would be unsuitable, because the distribution of financial returns is known to be heavy-tailed.
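
The role of the consistency constant can be illustrated with a small sketch of the \(Q_{n}\) scale estimator. It assumes the definition of Rousseeuw and Croux (1993), with \(k=\binom{h}{2}\) and \(h=\lfloor n/2\rfloor +1\), and derives the Gaussian constant from the first quartile of \(|X-X^{\prime }|\), which gives \(1/(\sqrt{2}\,\Phi ^{-1}(5/8))\approx 2.219\). The function name `qn_scale` and the simulated data are illustrative, and the quadratic-time computation is only suitable for small samples.

```python
import math
import random
import statistics

# Gaussian consistency constant: if X, X' are iid N(0, 1), the first quartile
# of |X - X'| equals sqrt(2) * Phi^{-1}(5/8); its reciprocal rescales Q_n to sigma.
C_GAUSS = 1.0 / (math.sqrt(2.0) * statistics.NormalDist().inv_cdf(5.0 / 8.0))


def qn_scale(x, const=C_GAUSS):
    """Q_n scale estimate: const times the k-th smallest pairwise distance."""
    n = len(x)
    h = n // 2 + 1
    k = h * (h - 1) // 2  # roughly a quarter of the n(n-1)/2 pairwise distances
    diffs = sorted(abs(x[i] - x[j]) for i in range(n) for j in range(i + 1, n))
    return const * diffs[k - 1]


rng = random.Random(7)  # arbitrary seed
x = [rng.gauss(0.0, 1.0) for _ in range(400)]

q_x = qn_scale(x)
print(f"Gaussian constant: {C_GAUSS:.4f}")
print(f"Q_n of N(0,1) data (true scale 1): {q_x:.3f}")

# Squares of Gaussian data are not Gaussian, so the Gaussian constant
# no longer calibrates Q_n to the standard deviation.
squares = [v * v for v in x]
ratio = qn_scale(squares) / statistics.pstdev(squares)
print(f"Q_n / standard deviation for squared data: {ratio:.3f}")
```

In this sketch the ratio for the squared series falls well below one, which is a numerical counterpart of the statement that \(c(F_{Y^{2}})\) differs from \(c(\Phi )\).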

To further illustrate what would happen if we ignored the constants and proceeded as in the Gaussian setting, we repeat the same Monte Carlo experiment described in previous sections for the EGARCH model, computing for each replicate the estimator \(r_{12,Q}(1)\) in (19) with all constants equal to \(c(\Phi )\). The Monte Carlo means and standard deviations are reported in the last row of Table 1. As expected, the results are disappointing: the estimator \(r_{12,Q}\) turns out to be the most biased among the robust estimators considered and it also has larger variance than both \(r_{12,COMED}(1)\) and \(r_{12,B}(1)\).

Hence, one should be very cautious before implementing robust estimators originally designed for bivariate Gaussian distributions in a time series setting with potential non-Gaussian variables.

## 5 Empirical application

In order to show how potential outliers can mislead the detection of the leverage effect, as measured by the cross-correlations between past and squared returns, we use a rolling window scheme, where the sample size used to compute the cross-correlations is \(T=1000.\) Therefore, we first estimate the cross-correlations over the period from 2 October 1928 to 28 September 1932. When a new observation is added to the sample, we delete the first observation and re-estimate the cross-correlations. This process is repeated until we reach the last 1000 observations in the sample, from 2 November 2009 to 30 August 2013. This amounts to considering 20,410 different subsamples covering periods of different volatility levels and different types and sizes of outliers. For instance, the first subsample, running from 2 October 1928 to 28 September 1932, includes outliers associated with the 1929 stock market crash, while the 13,900th subsample, corresponding to observations from 21 March 1984 to 4 March 1988, includes outliers due to the 1987 stock market crash. For each subsample considered, we compute the first order sample cross-correlation, \(r_{12}(1),\) using Eq. (2) and the corresponding robust weighted cross-correlation, \(r_{12,W}(1),\) as defined in Eq. (15), for both the original return series and the outlier-adjusted return series of Charles and Darné (2014)^{5}. Figure 6 displays the values of these cross-correlations for the 20,410 subsamples considered. Note that the dates on the x-axis refer to the end-of-window dates. Figure 6 also displays the 95% confidence bands based on the asymptotic distribution of the sample cross-correlations under the null of zero cross-correlations; see Fuller (1996). These bands are only shown for guidance, since not all the conditions for the asymptotic results to hold are fulfilled in our setting. Nevertheless, it is worth noting that the standard deviation predicted by the asymptotic theory for samples of size \(T=1000\) is approximately 0.032, which is just the value of the standard deviation of \(r_{12,W}(1)\) in our Monte Carlo experiments with a white noise process; see Table 1.
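
The rolling window scheme described above can be sketched as follows. Since the DJIA returns are not reproduced here, the example uses simulated Gaussian white noise with two artificial consecutive outliers as a stand-in; the function names, the series length, and the size and position of the outliers are assumptions made for illustration. The guidance band uses \(1.96/\sqrt{T}\), which corresponds to the asymptotic standard deviation of about 0.032 for \(T=1000\).

```python
import math
import random


def cross_corr(z, h=1):
    """Sample cross-correlation of order h between z[t-h] and z[t]**2."""
    n = len(z)
    sq = [v * v for v in z]
    mean_z = sum(z) / n
    mean_sq = sum(sq) / n
    num = sum((z[t - h] - mean_z) * (sq[t] - mean_sq) for t in range(h, n))
    den = math.sqrt(
        sum((v - mean_z) ** 2 for v in z) * sum((s - mean_sq) ** 2 for s in sq)
    )
    return num / den


def n_windows(n_obs, window):
    """Number of rolling subsamples of length `window` in a series of n_obs points."""
    return n_obs - window + 1


def rolling_cross_corr(returns, window=1000, h=1):
    """Cross-correlation of order h recomputed on each rolling subsample."""
    return [
        cross_corr(returns[start:start + window], h)
        for start in range(n_windows(len(returns), window))
    ]


rng = random.Random(99)  # arbitrary seed
returns = [rng.gauss(0.0, 1.0) for _ in range(1300)]  # stand-in for real returns
returns[1100] -= 30.0  # artificial negative outlier ...
returns[1101] += 30.0  # ... followed by a positive one

window = 1000
rolled = rolling_cross_corr(returns, window=window)
band = 1.96 / math.sqrt(window)  # approximate 95% band under zero cross-correlation

print(f"subsamples: {len(rolled)}, band: +/-{band:.3f}")
print(f"first window (no outliers):  {rolled[0]:+.3f}")
print(f"last window (both outliers): {rolled[-1]:+.3f}")
```

With \(T=1000\), the 20,410 subsamples mentioned in the text imply a series of 21,409 observations, since `n_windows(21409, 1000)` equals 20,410. Recomputing every window from scratch is simple but slow for a series of that length; an implementation for real data would update the sums incrementally.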

Several conclusions emerge from Fig. 6. First, this figure clearly reveals how extreme observations can bias the sample cross-correlation and could lead to a wrong identification of asymmetries. As expected, the first-order sample cross-correlation, \(r_{12}(1)\), presents several sharp drops and rises when it is computed for the original returns (top panel) and it is quite different from its robust counterpart. These changes are generally associated with the entrance and/or exit of outlying observations in the corresponding subsample. For instance, the entrance of “Black Monday”, October 19, 1987, when the DJIA sustained its largest 1-day drop (\(y_{14804}=-22.61\)), following another large negative return (\(y_{14803}=-4.60\)), causes a sudden fall in the value of \(r_{12}(1)\) from nearly zero to a negative value around \(-0.17\). By contrast, the next sudden rise in the value of \(r_{12}(1),\) from nearly \(-0.11\) to a positive value around 0.10, is due to the consecutive exit from the corresponding subsamples of “Black Monday” and two adjacent extreme observations, \(y_{14805}=5.88\) (20/10/1987) and \(y_{14806}=10.15\) (21/10/1987). When these three observations, the first one being negative and the other two positive, are in the subsample, the value of \(r_{12}(1)\) is pushed downwards to a negative value, but when the first of these observations leaves the sample and only the positive outliers remain, \(r_{12}(1)\) is pushed upwards to a value even larger than zero, as postulated by our theoretical result in Sect. 2. Moreover, the cluster of the lowest negative values of \(r_{12}(1)\), ranging from \(-0.25\) to \(-0.3\), is related to the entrance/exit in the corresponding subsamples of two consecutive extreme observations, namely \(y_{8422}=-5.71\) (28/5/1962) and \(y_{8423}=4.68\) (29/5/1962), the former being identified as an outlier in Charles and Darné (2014). According to our theoretical result in Sect. 2, the entrance of these two observations, the first one being negative and the second positive, biases the first-order sample cross-correlation downwards, but when the first of these observations leaves the sample and only the positive outlier remains, \(r_{12}(1)\) is again pushed to a value closer to zero. Similarly, the following sharp rise in \(r_{12}(1)\) from around \(-0.17\) to \(-0.05\) is due to an isolated positive outlier, namely \(y_{8799}=4.50\) (26/11/1963).

Another remarkable feature of Fig. 6 is the difference between the values of the sample cross-correlation in the top and bottom panels, which highlights the low resistance of \(r_{12}(1)\) to the presence of outliers. By contrast, the weighted cross-correlation, \(r_{12,W}(1)\), is robust to the presence of potential outliers: its values remain nearly the same in the two panels, indicating that the leverage effect suggested by the sample cross-correlation could be misleading in some cases.

Notably, the weighted robust and the sample first-order cross-correlations are quite similar when computed for the outlier-corrected series (bottom panel), but the latter still exhibits some breaks even in this case. These breaks are associated with extreme observations that were neither identified as outliers nor corrected in Charles and Darné (2014). For instance, the first sharp drop in \(r_{12}(1)\) from around \(-0.10\) to \(-0.24\) and its immediate rise again to \(-0.10\) have to do with the presence/absence of two pairs of outliers: a positive doublet made up of \(y_{1130}=9.03\) (19/4/1933) and \(y_{1131}=5.80\) (20/4/1933) and a negative doublet made up of \(y_{1194}=-7.07\) (20/7/1933) and \(y_{1195}=-7.84\) (21/7/1933). A similar situation arises in one of the last subsamples, where the value of \(r_{12}(1)\) decays towards \(-0.22\); such a big drop is associated with the entrance of three consecutive extreme observations at the end of the subsample, namely \(y_{20162}=-5.07\) (19/11/2008), \(y_{20163}=-5.56\) (20/11/2008) and \(y_{20164}=6.54\) (21/11/2008), which, according to our theoretical result in Sect. 2, will bias the first-order sample cross-correlation downwards.

Finally, Fig. 6 highlights that the value of the robust cross-correlation, \(r_{12,W}(1)\), does not remain constant across all the subsamples considered. This feature suggests time-varying leverage effects, with periods where \(r_{12,W}(1)\) is nearly zero (possibly indicating no leverage) followed by periods where \(r_{12,W}(1)\) clearly takes negative values (leverage effect). In particular, there seem to be three sample periods where the leverage effect, as measured by \(r_{12,W}(1)\), is stronger: a first period at the beginning of the sample, from April 1933 till June 1936, a second long period that spans from July 1940 till April 1971, approximately, and a final period from around September 1989 till the end of the sample. Notice that only during these periods are the robust sample cross-correlations outside the approximate 95% asymptotic confidence bands. Obviously, this feature requires further investigation; see, for instance, the recent papers of Bandi and Renò (2012), Yu (2012) and Jensen and Maheu (2014) dealing with time-varying leverage effects.

## 6 Conclusions

This paper shows that outliers can severely affect the identification of the asymmetric response of volatility to shocks of different signs when this identification is based on the sample cross-correlations between past and squared returns. In particular, the presence of one isolated outlier biases such cross-correlations towards zero and hence could hide a true leverage effect, while the presence of two big consecutive outliers could lead to detecting either spurious asymmetries or asymmetries of the wrong sign. As a way to protect against the pernicious effects of outliers, we suggest using robust cross-correlations. Our Monte Carlo experiments show that, among the robust measures considered in this paper, the weighted cross-correlation based on a slight modification of the serial correlation with Ramsay’s weights proposed by Teräsvirta and Zhao (2011) seems to be the most appropriate when dealing with conditionally heteroscedastic models. These results are further illustrated in the empirical application. It is shown that the first-order sample cross-correlation between past and squared daily DJIA returns is severely distorted by the presence of outliers, while its robust counterpart is not. In fact, depending on which measure of cross-correlation is used, the detection of asymmetries could be misleading. It is also shown that some observations that are not identified as outliers may still have a distorting effect on the identification of asymmetries in the volatility, which highlights the advantages of using robust methods as a protection against outliers rather than detecting and correcting them. The empirical application also points to the existence of possible time-varying leverage effects. We leave this topic for further research, along with the problem of robust estimation of asymmetric GARCH models.

## Footnotes

1. Note that the limits in (7) are valid regardless of whether \(y_{t}\) is homoscedastic or heteroscedastic.
2. We have also performed simulations with other EGARCH models with different parameter values and with a TGARCH model (Zakoian 1994) with the same variance and kurtosis as the EGARCH model in Eq. (8). In all these cases, we have obtained similar results that are not displayed here but are available upon request.
3. The results for other lags are available from the authors upon request.
4. Note that the algorithm in Croux and Rousseeuw (1992) also includes a correction factor to improve the finite sample unbiasedness of \(Q_{n}\). We ignore this factor because it does not make any difference to our discussion.
5. The results for cross-correlations of orders larger than one are available from the authors upon request.

## Notes

### Acknowledgments

We thank Amélie Charles and Olivier Darné for kindly sending us the data for the empirical application. We also thank the Guest Editor, Gabriele Fiorentini, and two anonymous referees for their precise comments, which have improved the paper. Financial support from the Spanish Government under project ECO2014-58434-P by the first author and under project ECO2012-32401 by the second and third authors is gratefully acknowledged. The first and second authors also acknowledge financial support from Generalitat Valenciana under project PROMETEO/2013/037 and from Comunidad de Castilla y León under project VA066U13, respectively. As usual, we are responsible for any remaining errors.

## References

- Bandi F, Renò R (2012) Time-varying leverage effects. J Econom 169:94–113
- Black F (1976) Studies in stock price volatility changes. In: Proceedings of the 1976 business meeting of the business and economics statistics sections, American Statistical Association, pp 177–181
- Bollerslev T, Litvinova J, Tauchen G (2006) Leverage and volatility feedback effects in high-frequency data. J Financ Econom 4:353–384
- Bollerslev T, Mikkelsen HO (1999) Long-term equity anticipation securities and stock market volatility dynamics. J Econom 92(1):75–99
- Carnero MA, Peña D, Ruiz E (2007) Effects of outliers on the identification and estimation of GARCH models. J Time Ser Anal 28(4):471–497
- Charles A, Darné O (2014) Large shocks in the volatility of the Dow Jones Industrial Average index: 1928–2013. J Bank Financ 43:188–199
- Croux C, Rousseeuw PJ (1992) Time-efficient algorithms for two highly robust estimators of scale. In: Dodge Y, Whittaker J (eds) Comput Stat, vol 1. Physica-Verlag, Heidelberg, pp 411–428
- Falk M (1998) A note on the comedian for elliptical distributions. J Multivar Anal 67(2):306–317
- Fiorentini G, Maravall A (1996) Unobserved components in ARCH models: an application of seasonal adjustment. J Forecast 15(3):175–201
- Fried R, Gather U (2005) Robust trend estimators for AR(1) disturbances. Austrian J Stat 34(2):139–151
- Fuller WA (1996) Introduction to statistical time series, 2nd edn. Wiley, Hoboken
- Gnanadesikan R, Kettenring JR (1972) Robust estimates, residuals and outlier selection with multiresponse data. Biometrics 28:81–124
- Gómez V, Maravall A, Peña D (1999) Missing observations in ARIMA models: skipping approach versus additive approach. J Econom 88:341–363
- Hallin M, Puri ML (1994) Aligned rank tests for linear models with autocorrelated error term. J Multivar Anal 50:175–237
- Hentschel L (1995) All in the family: nesting symmetric and asymmetric GARCH models. J Financ Econ 39:71–104
- Hibbert AM, Daigler RT, Dupoyet B (2008) A behavioral explanation for the negative asymmetric return-volatility relation. J Bank Financ 32:2254–2266
- Jensen MJ, Maheu JM (2014) Estimating a semiparametric stochastic volatility model with a Dirichlet process mixture. J Econom 178:523–538
- Kaiser R, Maravall A (2003) Seasonal outliers in time series. Estadística (J Interam Stat Inst) 15:101–142
- Laurent S, Lecourt C, Palm FC (2014) Testing for jumps in Gaussian ARMA-GARCH models, a robust approach. Comput Stat Data Anal. doi: 10.1016/j.csda.2014.05.015
- Lévy-Leduc C, Boistard H, Moulines E, Taqqu MS, Reisen VA (2011) Robust estimation of the scale and of the autocovariance function of Gaussian short- and long-range dependent processes. J Time Ser Anal 32:135–156
- Ma Y, Genton MG (2000) Highly robust estimation of the autocovariance function. J Time Ser Anal 21:663–684
- Maravall A, Peña D (1986) Missing observations and additive outliers in time series models. In: Mariano RS (ed) Advances in statistical analysis and statistical computing. JAI Press, Stanford
- Maronna R, Martin D, Yohai V (2006) Robust Statistics: theory and methods. Wiley, Hoboken
- Nelson DB (1991) Conditional heteroskedasticity in asset returns: a new approach. Econometrica 59(2):347–370
- Peña D, Maravall A (1991) Interpolation, outliers and inverse autocorrelations. Commun Stat Theory Methods 20(10):3175–3186
- Rodríguez MJ, Ruiz E (2012) GARCH models with leverage effect: differences and similarities. J Financ Econom 10:637–668
- Rousseeuw PJ, Croux C (1992) Explicit scale estimators with high breakdown point. In: Dodge Y (ed) L\(_{1}\)-statistical analysis and related methods. Elsevier, Amsterdam, pp 77–92
- Rousseeuw PJ, Croux C (1993) Alternatives to the median absolute deviation. J Am Stat Assoc 88:1273–1283
- Shevlyakov G (1997) On robust estimation of a correlation coefficient. J Math Sci 83(3):434–438
- Shevlyakov G, Smirnov P (2011) Robust estimation of the correlation coefficient: an attempt of survey. Austrian J Stat 40(1):147–156
- Tauchen G, Bollerslev T, Sizova N (2012) Volatility in equilibrium: asymmetries and dynamic dependencies. Rev Financ 16:31–80
- Teräsvirta T, Zhao Z (2011) Stylized facts of return series, robust estimates and three popular models of volatility. Appl Financ Econ 21:67–94
- Yu J (2012) A semiparametric stochastic volatility model. J Econom 167:473–482
- Zakoian JM (1994) Threshold heteroskedastic models. J Econ Dyn Control 18:931–955
- Zivot E (2009) Practical Issues in the analysis of univariate GARCH models. In: Mikosch T, Kreiß J-P, Davis RA, Andersen TG (eds) Handbook of financial time series. Springer-Verlag, Berlin, pp 113–155

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.