1 Introduction

This paper re-examines the statistical properties of a number of US financial series (such as stock market prices, dividends, earnings, consumer prices and long-term interest rates) contained in the well-known dataset that can be downloaded from Robert Shiller’s homepage and that is also described in chapter 26 of Shiller’s (1989) book on ‘Market Volatility’.

In the existing literature, the efficient markets hypothesis (EMH) has recently been tested using the present value (PV) model of stock prices: if stock market returns are not predictable, as implied by the EMH, stock prices should equal the present value of expected future dividends, and therefore stock prices and dividends should be cointegrated, as pointed out by Campbell and Shiller (1987). In their seminal paper, they tested the PV model of stock prices adopting the Engle and Granger (1987) cointegration procedure, an approach which is valid provided stock prices and dividends are stationary in first differences rather than in levels.Footnote 1 They used the Standard and Poor’s (S&P) dividends and value-weighted and equally weighted New York Stock Exchange (NYSE) 1926–1986 datasets. In the case of the S&P series they rejected the unit root hypothesis for dividends, but not for stock prices, whilst they could not reject it for either when using the NYSE data. As for cointegration, their results were also mixed, some test statistics rejecting the null hypothesis of no cointegration, others failing to reject it. These inconclusive results may be a consequence of restricting the orders of integration to integer values, as standard integration and cointegration models do, without allowing for non-integer values. Other empirical papers analysing cointegration in stock markets are Hakkio and Rush (1989), Baillie and Bollerslev (1989), Richards (1995), Crowder (1996) and Rangvid (2001).

However, as already mentioned, the discrete options I(1) and I(0) of classical cointegration analysis are rather restrictive: the equilibrium errors might in fact be a fractionally integrated I(d)-type process, with stock and dividends being fractionally cointegrated. This is stressed by Caporale and Gil-Alana (2004), who propose a simple two-step residuals-based strategy for fractional cointegration based on the approach of Robinson (1994): first the order of integration of the individual series is tested, and then the degree of integration of the estimated residuals from the cointegrating regression. They find that the cointegrating relationship between stock prices and dividends possesses long memory, implying that the adjustment to equilibrium takes a long time and that PV models of stock prices are valid only over a long horizon.

The present study makes the following twofold contribution. Firstly, it applies univariate tests based on long memory in order to establish the order of integration of the individual series, extending the analysis from the I(1)/I(0) cases to the more general case of fractional integration. Noting that the results can vary substantially depending on the methodology used, we employ a battery of non-parametric, semiparametric and parametric techniques. Secondly, it examines bivariate relationships among the variables using the most recent fractional cointegration techniques, which also allow for slow adjustment to equilibrium. To our knowledge, although numerous studies exist analysing such relationships, ours is the first to do so within such a framework. The implications of the findings are also discussed. In particular, we argue that it is the presence of long memory in the cointegrating relationships (already documented in Caporale and Gil-Alana (2004)) that can explain the inconclusiveness of the results of other studies only allowing for integer degrees of differentiation. The layout of the paper is the following. Section 2 reviews the concepts of fractional integration and cointegration and the methods applied in this study. Section 3 describes the data and reports the empirical results. Section 4 offers some concluding remarks.

2 Methodology

The methodology employed in this study is based on the concept of long memory or long-range dependence. Given a zero-mean covariance stationary process \(\{x_t , t=0,\pm 1,...\}\) with autocovariance function \({\upgamma }_{u} = E({x}_{t}\ {x}_{{t+u}})\), in the time domain, long memory is defined such that:

$$\begin{aligned} \sum _{{u \, =\, -\infty }}^{\infty } {\left| {{\upgamma }_{u}} \right| } = \infty . \end{aligned}$$

Now, assuming that \(x_{t}\) has an absolutely continuous spectral distribution function, with a spectral density function given by:

$$\begin{aligned} f({\lambda } ) = \frac{1}{2 \uppi }\left( {{\upgamma }_{0} + 2\sum _{{u \, =\, 1}}^{\infty } {{\upgamma }_{u} \cos ({\lambda } {{u}})} } \right) , \end{aligned}$$

according to the frequency domain definition of long memory the spectral density function is unbounded at some frequency \({\lambda }\) in the interval [\(0, \pi \)], i.e.

$$\begin{aligned} f(\lambda ) \rightarrow \infty ,\quad \hbox {as}\,\, {\lambda } \rightarrow {\lambda }^{*},\, {\lambda }^{*} \in [0 , \pi ]. \end{aligned}$$

Most of the empirical literature in the last twenty years has focused on the case where the singularity or pole in the spectrum occurs at the 0 frequency, i.e.

$$\begin{aligned} f({\lambda } ) \rightarrow \infty , \quad \hbox {as}\,\, {\lambda } \rightarrow 0^{+}. \end{aligned}$$

This is the standard case of I(d) models of the form:

$$\begin{aligned} ({1 - L})^{d} {x}_{t} = {u}_{t} ,\quad {t} = 1, 2,\ldots \end{aligned}$$
(1)

with \({x}_{t} = {u}_{t} = 0\) for \({t}\le 0\), where L is the lag-operator (\({ Lx}_{t} = {x}_{{t-1}})\) and \({u}_{t}\) is I(0), which is defined as a covariance stationary process with a spectral density function that is positive and bounded at all frequencies. Note that we adopt the Type II definition of fractional integration (see Marinucci and Robinson 1999). Finally, one should also note that fractional integration may also occur at some other frequencies away from 0, as in the case of seasonal/cyclical models (see Arteche 2002; Arteche and Robinson 2000; Hassler et al. 2009 among others).
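As an illustration of the Type II convention in (1), the filter \((1 - L)^{d}\) can be expanded as \(\sum _{k} \pi _{k} L^{k}\) with \(\pi _{0} = 1\) and \(\pi _{k} = \pi _{k-1}(k - 1 - d)/k\), and applied to a sample with all pre-sample values set to zero. The following Python fragment is only a minimal sketch of that idea, not the code used in this paper; the function names `frac_diff_weights` and `frac_diff` are hypothetical.

```python
import numpy as np


def frac_diff_weights(d, n):
    """First n coefficients pi_k of (1 - L)^d = sum_k pi_k L^k."""
    w = np.empty(n)
    w[0] = 1.0
    for k in range(1, n):
        # Recursion implied by the binomial expansion of (1 - L)^d.
        w[k] = w[k - 1] * (k - 1 - d) / k
    return w


def frac_diff(x, d):
    """Apply (1 - L)^d to x_1, ..., x_n, treating x_t = 0 for t <= 0 (Type II)."""
    x = np.asarray(x, dtype=float)
    w = frac_diff_weights(d, x.size)
    # Truncated convolution: only observed values enter the filter.
    return np.convolve(x, w)[: x.size]
```

With \(d = 1\) the function returns the usual first differences (with \(x_{0} = 0\)), and applying it with \(-d\) undoes a previous application with \(d\), which gives a simple way of checking the recursion.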

In the multivariate case, the natural extension of fractional integration is the concept of fractional cointegration. Though the original idea of cointegration, as in Engle and Granger (1987), allows for fractional orders of integration, all the empirical work carried out during the 1990s was restricted to the case of integer degrees of differencing. Only in recent years have fractional values also been considered. In what follows, we briefly describe the methodology used in this paper for testing fractional integration and cointegration in the case of Shiller’s financial time series data.

2.1 Fractional integration

There exist several methods for estimating and testing the fractional differencing parameter d. Some of them are parametric, while others are semiparametric or even non-parametric, and can be specified in the time or in the frequency domain. In this paper, we use first a parametric approach developed by Robinson (1994). This is a testing procedure based on the Lagrange multiplier (LM) principle that uses the Whittle function in the frequency domain. It tests the null hypothesis:

$$\begin{aligned} H_{0} : d \,=\, d_{0} , \end{aligned}$$
(2)

for any real value \({d}_{0}\), in a model given by equation (1), where \({x}_{t}\) can be the errors in a regression model of the form:

$$\begin{aligned} y_{t} = {\beta }^{T} z_{t} + x_{t},\, t = 1 , 2 , ... , \end{aligned}$$
(3)

where \({y}_{t}\) is the observed time series, \(\beta \) is a (k \(\times \) 1) vector of unknown coefficients and \({z}_{t}\) is a set of deterministic terms that might include an intercept (i.e. \({z}_{t} = 1\)), an intercept with a linear time trend (\({z}_{t} = (1, {t})^{T})\), or any other type of deterministic processes. Robinson (1994) showed that, under certain very mild regularity conditions, the LM-based statistic \(\hat{{r}}\) satisfies:

$$\begin{aligned} {\hat{r}} {\rightarrow _{d}}\,\, {N}(0 , 1)\quad \hbox {as}\quad {n} \rightarrow \infty , \end{aligned}$$

where \(\rightarrow _{d}\) stands for convergence in distribution; this limit behaviour holds independently of the regressors \({z}_{t}\) used in (3) and the specific model for the I(0) disturbances \({u}_{t}\) in (1). Other parametric approaches (Sowell 1992; Beran 1995) were also employed in the empirical analysis and produced very similar results to those obtained using the method of Robinson (1994).

In addition, we employ a semiparametric method (Robinson 1995) which is essentially a local ‘Whittle estimator’ in the frequency domain, using a band of frequencies that degenerates to zero. The estimator is implicitly defined by:

$$\begin{aligned} \hat{{d}}&= \arg {\min \nolimits _{d}} \left( {\log \overline{C(d)} - 2 d \frac{1}{m}\sum _{s = 1}^{m} {\log {\lambda }_{s}}} \right) , \\ \overline{C(d)}&= \frac{1}{m} \sum _{s = 1}^{m} {I({\lambda }_{ s} ) {\lambda }_{ s}^{ 2 d} } , {\lambda }_{s } = \frac{2 \pi s}{n} , \frac{1}{m} + \frac{m}{n} \rightarrow 0, \nonumber \end{aligned}$$
(4)

where \({I}({\lambda }_{s})\) is the periodogram of the raw series, \({x}_{t}\), and \({d} \in (-0.5,0.5)\). Under finiteness of the fourth moment and other mild conditions, Robinson (1995) proved that:

$$\begin{aligned} \sqrt{{m} } (\hat{{{d}}} - {d}^{*}) {\rightarrow _{d}} N(0, 1/4) \quad \hbox {as}\,\, {n} \rightarrow \infty , \end{aligned}$$

where \({d}^{*}\) is the true value of d. This estimator is robust to a certain degree of conditional heteroscedasticity and is more efficient than other semiparametric competitors.Footnote 2
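The objective function in (4) is simple enough to be evaluated directly. The fragment below is a minimal sketch, assuming a plain grid search over \(d \in (-0.5, 0.5)\) rather than a numerical optimiser; the function name `local_whittle` and the grid are illustrative choices, not part of the original procedure.

```python
import numpy as np


def local_whittle(x, m, grid=None):
    """Grid-search version of the local Whittle estimate of d defined in (4)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if grid is None:
        # Assumed search grid inside (-0.5, 0.5), step 0.001.
        grid = np.linspace(-0.49, 0.49, 981)
    # Fourier frequencies lambda_s = 2*pi*s/n, s = 1, ..., m.
    lam = 2.0 * np.pi * np.arange(1, m + 1) / n
    # Periodogram of the raw series at those frequencies.
    periodogram = np.abs(np.fft.fft(x)[1 : m + 1]) ** 2 / (2.0 * np.pi * n)
    mean_log_lam = np.mean(np.log(lam))
    objective = [
        np.log(np.mean(periodogram * lam ** (2.0 * d))) - 2.0 * d * mean_log_lam
        for d in grid
    ]
    return float(grid[int(np.argmin(objective))])
```

For a white noise input the returned value should be close to zero, with an approximate standard error of \(1/(2\sqrt{m})\) implied by the limit distribution above.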

2.2 Fractional cointegration

Engle and Granger (1987) suggested that, if two processes \({x}_{t}\) and \({y}_{t}\) are both I(d), then it is generally true that for a certain scalar a \(\ne 0\), a linear combination \({w}_{t} = {y}_{t} - {ax}_{t}\), will also be I(d), although it is possible that \({w}_{t}\) be \(I(d - b)\) with \({b} > 0\). Given two real numbers d, b, the components of the vector \({c}_{t}\) are said to be cointegrated of order d, b, denoted \({c}_{t}\sim \) CI(d, b) if:

(i) all the components of \({c}_{t}\) are I(d),

(ii) there exists a vector \(\upalpha \ne 0\) such that \({s}_{t}={\upalpha }'c_{t} \sim {I}({\upgamma }) = I(d - b), b > 0\).

Here, \(\upalpha \) and \(\hbox {s}_{t}\) are called the cointegrating vector and error, respectively. This prompts consideration of an extension of Phillips’ (1991) triangular system, which for a very simple bivariate case is:

$$\begin{aligned}&y_{t} = {\nu } x_t + u_{1t} (-{\gamma }), \end{aligned}$$
(5)
$$\begin{aligned}&x_{t} = u_{2t} (-d), \end{aligned}$$
(6)

for \({t} > 0\), where for any vector or scalar sequence \({w}_{t}\), and any \(\zeta \), we introduce the notation \({w}_{t}(\zeta ) = (1 - L)^{\zeta }{w}_{t}\). Note that \({u}_{t} = ({u}_{{1t}}, {u}_{{2t}})^{T}\) is now a bivariate zero mean covariance stationary I(0) unobservable process and \(\upnu \ne 0, {\upgamma } < d\). Under (5) and (6), \({x}_{ t}\) is I(d), as is \({y}_{t}\) by construction, while the cointegrating error \({y}_{t} - \upnu {x}_{t}\) is \(\hbox {I}({\upgamma })\). The model in (5) and (6) reduces to the bivariate version of Phillips’ (1991) triangular form when \({\gamma } = 0\) and \(d = 1\), which is one of the most popular models displaying CI(1, 1) cointegration considered in both the empirical and theoretical literature.
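To make the structure of (5) and (6) concrete, the system can be simulated by fractionally integrating two I(0) inputs. The sketch below assumes independent Gaussian white noise for \(u_{1t}\) and \(u_{2t}\), which is a simplification of the general I(0) process allowed in the text; the names `frac_integrate` and `simulate_triangular` are hypothetical.

```python
import numpy as np


def frac_integrate(u, d):
    """Compute (1 - L)^(-d) u_t with u_t = 0 for t <= 0 (Type II)."""
    u = np.asarray(u, dtype=float)
    n = u.size
    psi = np.empty(n)
    psi[0] = 1.0
    for k in range(1, n):
        # Coefficients of the expansion of (1 - L)^(-d).
        psi[k] = psi[k - 1] * (k - 1 + d) / k
    return np.convolve(u, psi)[:n]


def simulate_triangular(n, nu, d, gamma, seed=0):
    """Simulate y_t = nu * x_t + u_1t(-gamma), x_t = u_2t(-d), as in (5)-(6)."""
    rng = np.random.default_rng(seed)
    u1 = rng.standard_normal(n)  # assumption: white noise I(0) inputs
    u2 = rng.standard_normal(n)
    x = frac_integrate(u2, d)
    y = nu * x + frac_integrate(u1, gamma)
    return y, x
```

Setting \(d = 1\) and \({\gamma } = 0\) reproduces the CI(1, 1) case: \(x_{t}\) is a random walk and \(y_{t} - \upnu x_{t}\) is white noise.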

Next, we focus on the estimation of the cointegrating relationship, specifically on the estimation of \(\upnu \) in (5). The simplest approach is to estimate it using the well-known ordinary least squares (OLS) estimator

$$\begin{aligned} {{\hat{\upnu }}_\mathrm{{ols}}}^{t}=\frac{{\sum \nolimits _{t=1}^{n}} {x_{t} y_{t} } }{{\sum \nolimits _{t=1}^{n}} {x_{t}^{2}}} , \end{aligned}$$
(7)

where the superscript t indicates time domain estimation. Here, in the standard cointegrating setting, with \({\upgamma } = 0\) and \(d = 1\), it has been shown that in general \(\hat{{\upnu }}_\mathrm{ols} ^{t}\) is n-consistent with non-standard asymptotic distribution. In fractional settings, the properties of OLS could be different from those within this framework. When the observables are purely non-stationary (so that \(d \ge 0.5\)), consistency of \(\hat{{\upnu }}_\mathrm{ols} ^{t}\) is retained, but its rate of convergence and asymptotic distribution depend crucially on \({\upgamma }\) and d.

An alternative method of estimating \(\upnu \) is in the frequency domain. Consider the estimator

$$\begin{aligned} \hat{{\upnu }}_\mathrm{{ols}} ^{f}=\frac{\sum \nolimits _{j=0}^{n-1} {I_{xy} \left( {{\lambda }_j } \right) } }{\sum \nolimits _{j=0}^{n-1} {I_x \left( {{\lambda }_j } \right) } }, \end{aligned}$$
(8)

where \(\lambda _{j} = 2\uppi j/n\), \(j = 1, ..., n\), are the Fourier frequencies, and for arbitrary sequences \({\xi }_{t}, {\zeta }_{t}\), we define the discrete Fourier transform and (cross-)periodogram

$$\begin{aligned} w_\xi \left( {\lambda } \right) =\frac{1}{\sqrt{2\pi n}}{\sum \limits _{t=1}^{n}} {{\xi }_{t} {e}^{it\lambda }} , I_{\xi \zeta } \left( \lambda \right) =w_\xi \left( \lambda \right) {w}'_\zeta \left( {-\lambda } \right) , I_\xi \left( \lambda \right) =I_{\xi \xi } \left( \lambda \right) . \end{aligned}$$

Here, the discrete Fourier transform at a given frequency captures the components of the series related to this particular frequency. Robinson (1994) proposed the narrow band least squares (NBLS) estimator

$$\begin{aligned} \hat{{\upnu }}_\mathrm{NBLS} =\frac{{\sum \limits _{j=0}^{m}} {s_{j} \hbox {Re}\, I_{xy} \left( {{\lambda }_{j} } \right) } }{{\sum \limits _{j=0}^{m}} {s_j I_x \left( {{\lambda }_j } \right) }} , \end{aligned}$$
(9)

where \(1 \le {m} \le {n/2}\); \({s}_{j} = 1\) for \(j = 0, n/2\) and \({s}_{j} = 2\) otherwise; and \((1/m) + (m/n) \rightarrow 0\) as \(n \rightarrow \infty \). He showed the consistency of this estimator even under stationary cointegration. As in the case of OLS, in general NBLS has a non-standard limiting distribution.
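The relation between the estimators in (7), (8) and (9) can be checked numerically. The following sketch computes the time domain OLS slope and the NBLS slope from the fast Fourier transform; the normalisation \(1/(2\uppi n)\) cancels in the ratio and is therefore omitted. The function names are hypothetical and the code is only an illustration under these assumptions.

```python
import numpy as np


def ols_slope(x, y):
    """Time domain OLS slope without intercept, as in (7)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.dot(x, y) / np.dot(x, x))


def nbls_slope(x, y, m):
    """Narrow band least squares slope over frequencies j = 0, ..., m, as in (9)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    if not 1 <= m <= n // 2:
        raise ValueError("m must satisfy 1 <= m <= n/2")
    wx = np.fft.fft(x)[: m + 1]
    wy = np.fft.fft(y)[: m + 1]
    # Weights s_j: 1 at j = 0 and j = n/2, 2 otherwise.
    s = np.full(m + 1, 2.0)
    s[0] = 1.0
    if n % 2 == 0 and m == n // 2:
        s[m] = 1.0
    num = np.sum(s * np.real(wx * np.conj(wy)))  # proportional to Re I_xy
    den = np.sum(s * np.abs(wx) ** 2)            # proportional to I_x
    return float(num / den)
```

When n is even and \(m = n/2\), the weighted sum covers the whole band and the NBLS slope coincides with the OLS slope, in line with the equivalence between (7) and (8); smaller values of m use only the low frequencies.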

Assuming that the process \({u}_{t}\) in (5) and (6) has a parametric spectral density \(f\left( \lambda \right) =f\left( {\lambda ;\theta } \right) ,\) where \(\theta \) is an unknown vector of short-memory parameters, Robinson and Hualde (2003), based on generalized least squares (GLS)-type corrections, propose methods to estimate optimally (under Gaussianity) \(\upnu \) when \({d} - {\upgamma } > 0.5\) (named strong cointegration). Denoting

$$\begin{aligned} z_t (c,d)&= (y_t (c),x_t (d))', \quad \zeta =(1,0)', \quad p(\lambda ;h)={\zeta }'f(\lambda ;h)^{-1}, \\ a(c,d,h)&= \sum _{j=1}^n {p(\lambda _{j};h)\, w_{x(c)} (-\lambda _{j})\, w_{z(c,d)} (\lambda _{j})} , \quad q(\lambda ;h)={\zeta }'f(\lambda ;h)^{-1}\zeta , \\ b(c,h)&= \sum _{j=1}^{n} {q(\lambda _{j};h)\, I_{x(c)} (\lambda _{j})}, \quad \hbox {and defining}\quad \hat{{\upnu }}\left( {c,d,h} \right) =\frac{a\left( {c,d,h} \right) }{b\left( {c,h} \right) }, \end{aligned}$$

they considered five different estimators given by:

$$\begin{aligned} \hat{{\upnu }}({\gamma }, d,{\theta } ), \hat{{\nu }}({\gamma }, d,\hat{{\theta }}), \hat{{\nu }}({\gamma } ,\hat{d},\hat{{\theta }}), \hat{{\nu }}(\hat{{\gamma }},d, \hat{{\theta }}), \hat{{\nu }}(\hat{{\gamma }},\hat{d},\hat{{\theta }}), \end{aligned}$$
(10)

where \(\hat{{\gamma }}, \hat{d}, \hat{{\theta }},\) are corresponding estimators of the nuisance parameters \({\gamma }\), d and \(\theta \). The estimators in (10) reflect different knowledge about the structure of the model, the first being in general unfeasible, the second only assuming knowledge of the integration orders (as was done previously in the standard cointegrating literature), whereas the last estimator represents the most realistic case. Under regularity conditions, Robinson and Hualde (2003) showed that any of the estimators in (10) is \({n}^{d-{\gamma }}\)-consistent with identical mixed-Gaussian asymptotic distributions, leading to Wald tests on the parameter \(\nu \),

$$\begin{aligned} W({\gamma } ,d,\theta ), W({\gamma } ,d,\hat{{\theta }}), W({\gamma } ,\hat{d},\hat{{\theta }}), W(\hat{{\gamma }},d,\hat{{\theta }}), W(\hat{{\gamma }},\hat{d},\hat{{\theta }}), \end{aligned}$$
(11)

where \(W(c,d,h)=b(c,h)\{\hat{{\nu }}(c,d,h)-1\}^{2},\) with a chi-squared limit distribution for the values of d and \({\gamma }\). Hualde and Robinson (2007) propose an estimator of \(\nu \) in (5) and (6) in the case when \(d - {\gamma } < 0.5\) (named weak cointegration). As in Robinson and Hualde (2003), this method is based on a GLS-type correction.

3 Data and empirical results

The monthly series analysed have been collected by Robert Shiller and his associates, and are available at http://www.econ.yale.edu/~shiller/. The sample period goes from 1871m1 to 2010m6. They are described in chapter 26 of Shiller’s (1989) book on ‘Market Volatility’, where further details can be found, and are constantly updated and revised. Specifically, they are the following series: stock market prices (monthly averages of daily closing S&P prices, computed from the S&P four-quarter totals for the quarter, since 1926, with linear interpolation to monthly figures), dividends (an index), earnings (also an index), a consumer price index (Consumer Price Index for All Urban Consumers) used for computing real values of the previous variables, a long-term interest rate (GS10, which is the yield on the 10-year Treasury bonds) and also a cyclically adjusted price-earnings ratio.

3.1 Univariate analysis: fractional integration

We first employ the parametric approach of Robinson (1994) described in Sect. 2, assuming that the disturbances are white noise. Thus, time dependence is exclusively modelled through the fractional differencing parameter d. In particular, we consider the set-up in (3) and (1), with \({z}_{t} = (1,{t})^{T}\), testing \(\hbox {H}_{0}\) given by equation (2), i.e. \({d} = {d}_{0}\), for \({d}_{0} = 0, 0.001, 0.002, \ldots , 2\). In other words, the model under the null becomes:

$$\begin{aligned} y_{t} = \beta _{0} + \beta _{1} t + x_{t} ;\quad (1 - L)^{d_{0}} x_{t} = u_{t},\quad t = 1 , 2 , ... , \end{aligned}$$
(12)

and white noise \({u}_{t}\).

Table 1 displays the estimates of d (obtained as the values of \(d_{0}\) that produce the lowest \(\hat{{r}}-\) statistics in absolute value) along with the 95 % confidence band of the non-rejection values of \(d_{0}\) using Robinson’s (1994) parametric approach. For each series, we display the three cases commonly examined in the literature, i.e. the cases of no deterministic terms (i.e. imposing \(\beta _{0}=\beta _{1} = 0\) a priori), an intercept (\(\beta _{0}\) unknown and \(\beta _{1} = 0\) a priori) and an intercept with a linear time trend (\(\beta _{0}\) and \(\beta _{1}\) unknown). The inclusion of a time trend may appear unrealistic in the context of financial variables. Note, however, that in the case of fractional (or integer) differentiation the time trend disappears in the long run.Footnote 3

Table 1 Estimates of d in a model with white noise disturbances

The first noticeable feature in this table is that all the estimated values of d are above 1 and the unit root null hypothesis (i.e. \(d = 1\)) is rejected in all cases at the 5  % level. In general, the values are very similar for the three cases with deterministic terms, although the results change substantially from one series to another. Specifically, values of d above 1.5 are found in the case of dividends, earnings and real earnings. For the remaining series, the values are slightly above 1, but still significantly different from 1. However, these results are based on a model characterised by the lack of (weak)-autocorrelation for the error term. Therefore, in what follows we assume that the disturbances are weakly autocorrelated and model them first using the exponential spectral model of Bloomfield (1973). This is an approach to modelling the I(0) error term that produces autocorrelations decaying exponentially as in the AR(MA) case. Therefore, it approximates ARMA structures with a small number of parameters, and performs extremely well in the context of Robinson’s (1994) tests (see Gil-Alana 2004). The results using this approach are displayed in Table 2.

Table 2 Estimates of d in a model with Bloomfield-type disturbances
Table 3 Estimates of d in a model with seasonal AR(1) disturbances

It can be seen that the values are much smaller than in the previous case of white noise disturbances. One series (long-term interest rates) has values which are strictly below 1, implying mean-reverting behaviour; for dividends and real stock prices the unit root null cannot be rejected. It is marginally rejected (at the 5  % level, but not at the 1  % level) for stock prices, the consumer price index and the price/earnings ratio, and it is decisively rejected in favour of higher orders of integration for the remaining two series (earnings and real earnings). As a final specification, given the monthly frequency of the data, we assume that the error term follows a seasonal AR(1) process. The inclusion of seasonal dummies produced insignificant coefficients in all cases. The results (displayed in Table 3) are very similar to those based on white noise disturbances, with estimates of d which are all strictly above 1. Deeper inspection indicates that time trends are not required in any case, the intercept being sufficient for the deterministic component. Moreover, LR tests and other residuals-based tests suggest that the d-differenced series may all be weakly (non-seasonally) autocorrelated, implying that the model with Bloomfield disturbances may approximate accurately the order of integration of the series. Nevertheless, in view of the sensitivity of the results to the specification of the error term, we also apply a semiparametric method that does not specify a functional form for the I(0) disturbance term.

Fig. 1 Estimates of d based on the semiparametric estimate of Robinson (1995). The horizontal axis refers to the bandwidth parameter, while the vertical one corresponds to the estimated values of d. We report the estimates of d along with the 95  % confidence band of the I(1) hypothesis

Figure 1 displays for each series the estimates of d based on the semiparametric method of Robinson (1995), i.e. \(\hat{{d}}\) as given by (4). The estimates of d are shown for a whole range of values of the bandwidth parameter \(m = 1, 2, {\ldots }, n/2\) (on the horizontal axis). Alternatively, we could have chosen an optimal bandwidth parameter as suggested in Henry (2001). Note that the choice of the bandwidth is crucial in view of the trade-off between bias and variance: the asymptotic variance is decreasing with m, while the bias is growing with m; the 95  % confidence bands corresponding to the I(1) hypothesis are also displayed. It can be seen that, for small values of m, the unit root null is rejected in favour of mean reversion (\(d < 1\)) in the case of earnings, real dividends, real earnings and the price-earnings ratio. For the remaining series (still with a small m), the estimated values of d are within the I(1) interval, except for the CPI series for which d is found to be strictly above 1. However, when the bandwidth parameter is large, the estimates are clearly above 1 in all cases, the only exception being long-term interest rates, with many values in the I(1) interval. Table 4 reports the numerical values for different bandwidth parameters, \(m = 25\), 41 (= \({n}^{0.5})\), 100, 200, 300 and 500: at the 5  % level, there are several cases where the unit root null cannot be rejected. Specifically, in the case of \(m = (n)^{0.5}\), which has been widely considered in the empirical literature, the unit root null hypothesis cannot be rejected for stock prices, dividends, long-term interest rates, real stock prices and real dividends, while it is rejected in favour of mean reversion (i.e. \(d < 1\)) for earnings and real earnings, and in favour of \(d > 1\) for the consumer price index.
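The bandwidth \(m = n^{0.5} = 41\) and the width of the I(1) interval can be reproduced with a few lines of arithmetic. The sketch below assumes that the 95  % band is built from the N(0, 1/4) limit distribution in Sect. 2.1, i.e. as \(1 \pm 1.96/(2\sqrt{m})\); this construction is our reading of the band and is not stated explicitly in the text.

```python
import math

# Monthly observations from 1871m1 to 2010m6 inclusive.
n = (2010 - 1871) * 12 + 6
# Bandwidth m = n^0.5, rounded to the nearest integer.
m = round(math.sqrt(n))
# Assumed 95 % half-width under sqrt(m) * (d_hat - d) -> N(0, 1/4).
half_width = 1.96 * math.sqrt(0.25 / m)
lower, upper = 1.0 - half_width, 1.0 + half_width

print(n, m)                              # 1674 41
print(round(lower, 3), round(upper, 3))  # roughly 0.847 and 1.153
```

Under this assumption, the local Whittle estimates of 0.953 (stock prices) and 1.105 (dividends) reported in Sect. 3.2.1 for \(m = 41\) both fall inside the interval, which is consistent with the non-rejection of the unit root null for these two series.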

Table 4 Estimates of d using Robinson’s (1995) semiparametric method for different bandwidth numbers

Overall, the univariate results provide no evidence of mean reversion: all series appear to be I(1) or I(d) with \({d} > 1\), implying permanent effects of shocks, with evidence of long memory in many cases for the first-differenced series as well.

3.2 Multivariate analysis: fractional cointegration

A number of cointegrating (bivariate) relationships might exist between the individual variables examined in the previous subsection, in particular between:

(a) Stock prices and dividends

(b) Real stock prices and real dividends

(c) Price/earning ratio and long-term interest rates and

(d) Real stock prices and real earnings.

Some of these relationships have been extensively analysed in the literature. Campbell and Shiller (1987) and DeJong (1992) tested a present value model of the stock market using time series data for real US annual stock prices and dividends from 1871 to 1986. In the first of these studies, they carried out ADF tests, with and without a time trend, on both individual series, and their results suggested that both series were integrated of order 1. When using the DF and ADF tests on the residuals from the cointegrating regressions, their results were mixed: the former test rejected the null hypothesis of no cointegration at the 5  % level, while the latter narrowly failed to reject it at the 10  % level. DeJong (1992) used a Bayesian approach to model these two variables and found evidence in favour of trend-stationary representations. Similarly, Koop (1991), using a different dataset, came to the same conclusion that both variables are stationary around a linear trend, and, even when assuming unit roots, he found little evidence of cointegration with I(0) errors.

Pereira-Garmendia (2010) finds that real stock prices and real earnings are related through inflation. The relationship among stock prices, earnings and bond yield is analysed by Durre and Giot (2007). Papers examining long-run linkages between the price/earnings ratio and interest rates include Philips (1999), Campbell and Shiller (1998), Campbell and Shiller (2001) and Asness (2003) inter alia.

In all cases, we follow the same strategy. We first estimate individually the orders of integration of the series using the log-periodogram-type estimator devised by Robinson (1995). This is defined as:

$$\begin{aligned} \hat{{d}}(l) = \sum _{j\, =\, l+1}^m {\left( {a_j - \bar{{a}}} \right) \log I({\lambda _{j}})/S_{l}}, \end{aligned}$$
(13)

where

$$\begin{aligned} a_j&= - \log \left( {4 \sin ^{2}\left( {\frac{\lambda _{j}}{2}} \right) } \right) ,\quad \bar{{a}} =\frac{1}{m - l}\sum _{j=l+1}^{m} {a_j},\\ S_l&= \sum _{j=l+1}^m {\left( {a_j - \bar{{a}}} \right) ^{2}},\quad {\lambda _{j}} = \frac{2 \pi j}{n}, \end{aligned}$$

and \(0 \le {l} < {m} < {n}\). The results for the individual series possibly involved in cointegration relationships are displayed in Table 5 (for \({m} = {n}^{0.5}\) and l = 0, 1, ..., 5).Footnote 4
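The estimator in (13) is the least squares slope in a regression of \(\log I(\lambda _{j})\) on \(a_{j}\) over the frequencies \(j = l + 1, \ldots , m\). A minimal Python sketch follows; the function name `log_periodogram_d` is hypothetical and the code is meant as an illustration rather than as the routine used to produce Table 5.

```python
import numpy as np


def log_periodogram_d(x, m, l=0):
    """Log-periodogram estimate of d as in (13), trimming the first l frequencies."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if not 0 <= l < m < n:
        raise ValueError("need 0 <= l < m < n")
    j = np.arange(l + 1, m + 1)
    lam = 2.0 * np.pi * j / n
    # Periodogram at the Fourier frequencies lambda_j, j = l + 1, ..., m.
    periodogram = np.abs(np.fft.fft(x)[l + 1 : m + 1]) ** 2 / (2.0 * np.pi * n)
    a = -np.log(4.0 * np.sin(lam / 2.0) ** 2)
    a_centred = a - a.mean()
    # Slope of log I(lambda_j) on a_j: sum((a_j - a_bar) * log I) / S_l.
    return float(np.sum(a_centred * np.log(periodogram)) / np.sum(a_centred ** 2))
```

For a white noise series the estimate should be close to zero; larger values of l discard more of the lowest frequencies.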

Table 5 Estimates of d using Robinson’s (1995) log-periodogram semiparametric method for different values of l and fixed m = \((n)^{0.5}\)
Table 6 Testing the homogeneity in the order of integration (Robinson and Yajima 2002)

Next, we test the homogeneity of the orders of integration in the bivariate systems (i.e. \({H}_{0}\!: {d}_{x} = {d}_{y})\), where \({d}_{x}\) and \({d}_{y}\) are now the orders of integration of the two individual series, by using an adaptation of the Robinson and Yajima (2002) statistic \(\hat{{T}}_{xy} \) to log-periodogram estimation. The statistic is:

$$\begin{aligned} \hat{{T}}_{xy} = \frac{m^{1/2}\left( {\hat{{d}}_x - \hat{{d}}_y } \right) }{\left( {\frac{1}{2} \left( {1 - \hat{{G}}_{xy}^{2} / (\hat{{G}}_{xx} \hat{{G}}_{yy}) } \right) } \right) ^{1/2}+ h(n)}, \end{aligned}$$
(14)

where h(n) \(>\) 0 and \(\hat{{G}}_{{xy}} \) is the \({(xy)}{{\mathrm{th}}}\) element of

$$\begin{aligned} \hat{{G}} = \frac{1}{{m}} \sum _{{j=1}}^{m} {\hbox {Re} \left[ {\hat{{\Lambda }}({\lambda _{j}})^{-1} I({\lambda _{j}})\hat{{\Lambda }}({\lambda _{j}})^{-1*}} \right] } , \quad \hat{{\Lambda }}({\lambda _{j}}) = \mathrm{diag}\left\{ {e^{i\pi \hat{{d}}_{x}/2}{\lambda }_{j}^{-\hat{{d}}_{x} }, e^{i\pi \hat{{d}}_{y}/2}{\lambda }_{j}^{-\hat{{d}}_{y}}} \right\} , \end{aligned}$$

(see Gil-Alana and Hualde 2009 for evidence on the finite sample performance of this procedure). The results using this approach are displayed in Table 6. In general, we cannot reject the null hypothesis of equal orders of integration.Footnote 5 In the following step, we perform the Hausman test for no cointegration of Marinucci and Robinson (2001) comparing the estimate \(\hat{{d}}_x \) of \({d}_{x}\) with the more efficient bivariate one of Robinson (1995), which uses the information that \({d}_{x} = {d}_{y} = {d}_{*}\). Marinucci and Robinson (2001) show that

$$\begin{aligned} H_{im} = 8m \left( {\hat{{d}}_{*} - \hat{{d}}_{i}} \right) ^{2} {\rightarrow _{d}}\, {\chi }_{1}^{2}\quad \hbox {as}\quad \frac{1}{{m}} + \frac{{m}}{{n}} \rightarrow 0, \end{aligned}$$
(15)

with \(i = x, y\), and where \(m < [n/2]\) is again a bandwidth parameter, analogous to that introduced earlier; \(\hat{{d}}_i \) are univariate estimates of the parent series and \(\hat{{d}}_*\) is a restricted estimate obtained in the bivariate context under the assumption that \({d}_{x} = {d}_{y}\). In particular,

$$\begin{aligned} {\hat{d}}_{*} = -\frac{{\sum \limits _{j=1}^{s}} {1_{2}^{\prime } \hat{{\Omega }}^{-1}Y_{j} v_{j}} }{2 1_2^{\prime } \hat{{\Omega }}^{-1}1_{2} {\sum \limits _{j=1}^{s}} {v_j^2 } }, \end{aligned}$$
(16)

where \(1_{2}\) denotes a (\(2 \times 1\)) vector of 1s, \(\hat{{\Omega }}\) refers to the variance covariance matrix of \({Y}_\mathrm{j} = [\hbox {log} I_{{xx}}({\lambda _{j}}), \hbox {log} I_{{yy}}({\lambda _{j}})]^{T}\), and \(v_{j} = \log j - \frac{1}{s} {\sum \limits _{j=1}^{s}} {\log j} .\) The limiting distribution above is presented heuristically, but the authors argue that it seems sufficiently convincing for the test to warrant serious consideration. The results using this approach are displayed in Table 7. When using Hausman-type tests like those employed here, a rejection of the null of no cointegration does not necessarily imply that the alternative (in our case, fractional cointegration) holds; nevertheless, we provide in the following pages supportive evidence of cointegration in many of the cases examined.
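Once the univariate and restricted bivariate estimates are available, the statistic in (15) is a one-line computation. The sketch below is purely illustrative: the function name `hausman_no_cointegration` is hypothetical, the input values in the usage lines are made up, and the default critical value 3.841 is the 5  % point of the \(\chi ^{2}_{1}\) distribution.

```python
def hausman_no_cointegration(d_star, d_i, m, critical_value=3.841):
    """Return the statistic 8 * m * (d_star - d_i)^2 from (15) and the test decision."""
    stat = 8.0 * m * (d_star - d_i) ** 2
    # Reject the null of no cointegration when the statistic exceeds
    # the chi-squared(1) critical value.
    return stat, stat > critical_value


# Hypothetical inputs, not taken from Table 7.
stat, reject = hausman_no_cointegration(d_star=1.0, d_i=0.9, m=25)
```

With these hypothetical inputs the statistic is about 2.0, below 3.841, so the null of no cointegration would not be rejected at the 5  % level; a gap of 0.2 between the two estimates with the same m would give 8.0 and lead to rejection.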

Table 7 Hausman test for no cointegration (Marinucci and Robinson 2001)

In the final part of the analysis, we apply the methods of Robinson and Hualde (2003) and Hualde and Robinson (2007). We identify parametric models for f(\(\lambda )\) with \({u}_{t}\) in (5) and (6) having the form,

$$\begin{aligned} u_t = A(L) \varepsilon _t , \end{aligned}$$
(17)

where \(\upvarepsilon _{t}\) is supposed to be an i.i.d. process, and A(L) is initially assumed to be diagonal, thus treating \({u}_{\mathrm{1t}}\) and \({u}_{{2t}}\) separately. We approximate the two series as

$$\begin{aligned} \tilde{{u}}_{{1t}} = ({1 - L})^{\tilde{\upgamma }}\left[ {{y}_{t} - \hat{{v}}_{\mathrm{ols}} {x}_{t}} \right] , \end{aligned}$$
(18)

and

$$\begin{aligned} \tilde{{u}}_{{2t}} = ({1 - L})^{\tilde{{d}}} {x}_{t}, \end{aligned}$$
(19)

where \(\tilde{\upgamma }\) and \(\tilde{{d}}\) are the estimates of \({\upgamma }\) and d previously obtained using other methods, and follow Box–Jenkins-type procedures to identify the models within the ARMA class. The results based on this method are displayed in Table 10.

Next we examine each of the bivariate relationships.

3.2.1 Stock market prices and dividends

Fig. 2

a Stock market prices and dividends. The thick line refers to stock market prices and the thin one to dividends. b Real stock market prices and real dividends. The thick line refers to real stock market prices and the thin one to real dividends. c Price-earning ratio and long interest rate. The thick line refers to the long-term interest rate and the thin one to the price-earning ratio. d Real stock market prices and real earnings. The thick line refers to real stock market prices and the thin one to real earnings

Figure 2a displays the plots of the two series. Both of them are relatively stable until the end of World War II, when they start increasing and also exhibit a higher degree of volatility.

Focusing first on the univariate results using the Whittle semiparametric estimator (Robinson 1995), it can be seen that for small values of m the unit root null cannot be rejected (see Table 4). Specifically, for \(m = (n)^{0.5} = 41\), the estimates are 0.953 and 1.105, respectively, for stock prices and dividends. Similar evidence of unit roots, though with slightly higher values, is obtained with the log-periodogram estimator of Robinson (1995) (see Table 5). For example, for l = 0, 1, 2, ..., 5, and \(m = (n)^{0.5}\), the estimates of d for stock prices range between 1.041 and 1.080 and those for dividends between 1.026 and 1.222. Testing now the homogeneity condition with the procedure of Robinson and Yajima (2002) (see Table 6), it is found that the two orders of integration are equal. Here h(n) is set equal to \(b^{-5-2i}\), with i = 1, 2, 3, 4 and 5 and \(b = (n)^{0.5}\), which is the bandwidth used in the estimation. The Hausman test of no cointegration (Marinucci and Robinson 2001) (see Table 7) indicates that the estimates of d for the individual series using the bivariate representation (\({\hat{d}_{*}}\) in (15)) are very close to 1 and not significantly different from 1 (using three different values for s in (15)), but evidence of cointegration is only obtained in one case out of the six considered (\({H}_{\mathrm{as}}\) with s = 25; see Table 7).
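The unit root check on the Whittle estimates can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the local Whittle objective is minimised by a simple grid search, that the estimator is applied to first differences with 1 added back to the estimate, and that the standard N(0, 1/4) limit for \(\sqrt{m}(\hat{d} - d)\) is used to form a band around d = 1. All function names are hypothetical.

```python
import cmath
import math


def periodogram(x, m):
    """Return (lambda_j, I(lambda_j)) pairs for j = 1..m."""
    n = len(x)
    out = []
    for j in range(1, m + 1):
        lam = 2.0 * math.pi * j / n
        w = sum(xt * cmath.exp(-1j * lam * t) for t, xt in enumerate(x))
        out.append((lam, abs(w) ** 2 / (2.0 * math.pi * n)))
    return out


def local_whittle(x, m, lo=-0.49, hi=0.99, step=0.001):
    """Grid-search minimiser of R(d) = log(mean(lam^(2d) I)) - 2d mean(log lam)."""
    pg = periodogram(x, m)
    mean_log_lam = sum(math.log(lam) for lam, _ in pg) / m

    def objective(d):
        g = sum(lam ** (2.0 * d) * i_val for lam, i_val in pg) / m
        return math.log(g) - 2.0 * d * mean_log_lam

    n_steps = int(round((hi - lo) / step))
    grid = [lo + k * step for k in range(n_steps + 1)]
    return min(grid, key=objective)


def local_whittle_levels(x, m):
    """Assumption: estimate d on first differences, then add 1 back."""
    dx = [b - a for a, b in zip(x[:-1], x[1:])]
    return 1.0 + local_whittle(dx, m)


def unit_root_band(m, z=1.96):
    """Approximate 95% band around d = 1, using sqrt(m)(d_hat - d) ~ N(0, 1/4)."""
    half = z / (2.0 * math.sqrt(m))
    return 1.0 - half, 1.0 + half
```

With n = 1674 monthly observations (1871m1 to 2010m6), \(n^{0.5} \approx 40.9\), which rounds to m = 41. Under the stated assumptions, `unit_root_band(41)` gives roughly (0.847, 1.153), so estimates such as 0.953 and 1.105 fall inside the band, in line with the non-rejection of the unit root reported in the text.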

3.2.2 Real stock market prices and real dividends

The same relationship as above, but in real terms, is examined in this subsection. A time series plot of the two series is displayed in Fig. 2b. They exhibit a similar pattern to the previous case, although with more volatility in the early part of the sample, and may have a common stochastic trend. Starting again with the univariate tests (see Table 4), it is found that, when applying the Whittle semiparametric method of Robinson (1995), for \(m = (n)^{0.5} = 41\), the estimates of d are 0.888 and 0.896, respectively, for real stock prices and real dividends, and the unit root null cannot be rejected for either series. Similar evidence is obtained with the log-periodogram estimator (see Table 5), with values of d ranging from 0.972 to 1.085 for real stock prices and from 0.822 to 0.997 for real dividends. The test of homogeneity of the orders of integration (Table 6) implies equality in the values of d, while testing the null of no cointegration with the Hausman test of Marinucci and Robinson (2001) (in Table 7) suggests that the two series might be cointegrated.

3.2.3 Price/earning ratio and long-term interest rates

These two series are plotted in Fig. 2c. Interest rates appear to be more stable than the price/earning ratio during the first half of the sample; however, during the second half, there is a sharp increase in interest rates, but not in the price/earning ratio. As for the Whittle estimates of d (see Table 4), it is found that for the price/earning ratio the values of d are very sensitive to the bandwidth parameter: for small values (e.g. 25, 41 or 100) the unit root is rejected in favour of values of d below 1; on the contrary, the unit root null cannot be rejected for m = 200, and it is rejected in favour of \(d > 1\) for m = 300 and 500. For the long-term interest rates, the results are more stable and the unit root null cannot be rejected for any bandwidth parameter. These results are corroborated by the log-periodogram estimates, displayed in Table 5. Thus, for the price/earning ratio, different results are obtained depending on whether or not the series is first-differenced, while for long-term interest rates the evidence strongly supports the I(1) case. Interestingly, when performing the homogeneity tests of Robinson and Yajima (2002) we cannot reject the null of equal orders of integration, and the Hausman test rejects in all cases the null hypothesis of no cointegration in favour of fractional cointegration (Table 7).

3.2.4 Real stock market prices and real earnings

Plots of the two series are displayed in Fig. 2d. They both have a very similar upward trend, which suggests that they may be cointegrated around a common stochastic trend. The estimated values of d using the Whittle method and for \(m = (n)^{0.5}\) (see Table 4) are 1.071 for real stock prices and 0.933 for real earnings, and in both cases we cannot reject the null of I(1) series. The same evidence in favour of unit roots is obtained with the log-periodogram estimates in Table 5, and the homogeneity restriction cannot be rejected in any single case (see Table 6). The Hausman tests of Marinucci and Robinson (2001) also indicate that the two series might be cointegrated, since the null hypothesis of no cointegration is rejected in all cases in Table 7 in favour of long-memory cointegrating errors.

We estimated the cointegrating coefficients for each of the four relations using various methods such as \(\hbox {OLS}^{t}\) (Eq. (7)), \(\hbox {OLS}^{f}\) (Eq. (8)) and NBLS with different bandwidths (Eq. (9)); the results were similar for all procedures. On the basis of these coefficients, we estimated the orders of integration in the residuals of the cointegrating regression. First, we used the parametric approach of Robinson (1994). However, the results varied considerably depending on the specification of the error term. Owing to this disparity, we then estimated d with semiparametric methods.
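As an illustration of the narrow-band idea behind NBLS, the following sketch computes a slope from the first m Fourier frequencies only. Equations (7) to (9) are not reproduced in this section, so the exact form used in the paper may differ; the code assumes the standard ratio of the real part of the summed cross-periodogram to the summed periodogram of the regressor, with no intercept (frequency zero is excluded, which removes the sample means). The names `dft` and `nbls_slope` are hypothetical.

```python
import cmath
import math


def dft(x, j):
    """Discrete Fourier transform of x at the Fourier frequency 2*pi*j/n."""
    n = len(x)
    lam = 2.0 * math.pi * j / n
    return sum(xt * cmath.exp(-1j * lam * t) for t, xt in enumerate(x))


def nbls_slope(y, x, m):
    """Narrow-band least squares slope of y on x using frequencies j = 1..m."""
    num = 0.0
    den = 0.0
    for j in range(1, m + 1):
        wx = dft(x, j)
        wy = dft(y, j)
        num += (wx * wy.conjugate()).real   # real part of the cross-periodogram
        den += abs(wx) ** 2                 # periodogram of the regressor
    # The common factor 1 / (2*pi*n) cancels in the ratio.
    return num / den
```

Because only low frequencies enter the sums, short-run noise in the regression error has little influence on the estimate when the regressor is strongly persistent, which is the motivation for using a small bandwidth m.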

Table 8 Estimates of d on the residuals using the estimate of Robinson (1995)
Table 9 Estimates of d on the residuals using the estimate of Robinson (1995)

Table 8 displays the estimates of d based on the log-periodogram regression estimator of Robinson (1995) for \(m = n^{0.5}\) and l = 0 and l = 2. In many cases, the estimates are significantly smaller than 1, especially for the price-earning ratio/long-term interest rates and real stock prices/real earnings relationships. Table 9 reports the results from the semiparametric Whittle method of Robinson (1995), again applied to the estimated residuals from the cointegrating relationships. Two different bandwidth parameters, m = 25 and \(m = n^{0.5} = 41\), are considered. Virtually all estimated values are strictly below 1. For the first two relationships (stock prices and dividends and their real terms), the values for the order of integration in the residuals range between 0.6 and 0.8. Smaller values are obtained for the price-earning ratio/long-term interest rate relationship: if m = 41, the estimated value of d is about 0.55; however, using m = 25, the values are in all cases 0.50, suggesting that the residual series may be stationary. There is a wider range of values in the case of the real stock prices/real earnings relationship, although most of them are also in the interval (0.5, 1).

Table 10 Estimates of the orders of integration of the individual series and of the cointegrating regression

Finally, we identify parametric models for \(f(\lambda )\) with \(u_t\) in (5) and (6) on the basis of Eqs. (17)–(19), using wide-ranging values for the orders of integration from the previous tables. Here, we employ both the Robinson and Hualde (2003) and Hualde and Robinson (2007) approaches based on the approximate difference between the order of integration of the parent series and the estimated residuals. Using a Box–Jenkins-type methodology we identified at most AR(1) structures in all cases. Therefore, we simply consider combinations of white noise and AR(1) processes in each bivariate relation. For each model, we apply the univariate Whittle procedure of Velasco and Robinson (2000), using untapered versions, and, as usual, the first-differenced data, then adding 1 to the estimated value. The results for the four bivariate relationships are summarised in Table 10 and are fairly similar for the different types of I(0) errors.

Although we do not report it, we also estimated a multivariate version of the Bloomfield (1973) model for I(0) autocorrelation, with fairly similar results to those presented in Table 10. In general, there is a statistically significant reduction in the order of integration, of about 0.3 to 0.4 in all cases, from the original series to the cointegrating relationship. The orders of integration in the latter are about 0.7 for three of these relations: stock prices/dividends; real prices/real dividends and real prices/real earnings. For the price-earning ratio/interest rates relationship, the reduction is slightly bigger, and the order of integration of the cointegrating relationship seems to be slightly above 0.5 (see Footnote 6).

Overall, the four relationships examined in this paper appear to be fractionally cointegrated, with orders of integration equal to or slightly above 1 for the individual series and in the interval [0.5, 1) for the cointegrating regression, which implies slow mean-reverting behaviour in the long run.
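The meaning of slow mean reversion for d in [0.5, 1) can be illustrated by the impulse responses of an I(d) process. The sketch below assumes white noise innovations and uses the coefficients of the expansion of \((1 - L)^{-d}\); the function name `impulse_responses` is hypothetical.

```python
def impulse_responses(d, horizon):
    """Coefficients psi_k of (1 - L)^(-d): psi_0 = 1, psi_k = psi_{k-1} * (k - 1 + d) / k."""
    psi = [1.0]
    for k in range(1, horizon + 1):
        psi.append(psi[-1] * (k - 1 + d) / k)
    return psi


# Effect of a unit shock after k periods for a unit root and for d = 0.7.
unit_root = impulse_responses(1.0, 1000)   # stays at 1: the shock is permanent
long_memory = impulse_responses(0.7, 1000) # decays towards 0, but only hyperbolically
```

For d = 1 every coefficient equals 1, so shocks never die out. For d = 0.7, a value close to those estimated for three of the cointegrating relationships, the coefficients decline monotonically at the hyperbolic rate \(k^{d-1}\), so the effect of a shock vanishes eventually but remains noticeable after many periods.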

4 Conclusions

In this paper we have examined bivariate relationships among various financial variables using fractional integration and cointegration methods. In particular, we focus on the following bivariate relationships: stock prices and dividends; real stock prices and real dividends; the price/earning ratio and long-term interest rates; and real stock prices and real earnings, using monthly data for the period 1871m1–2010m6.

The univariate results strongly support the hypothesis that all individual series are non-stationary, with orders of integration equal to or higher than 1 in practically all cases. The multivariate results provide evidence of fractional cointegration for the four bivariate relationships, with the orders of integration of the cointegrating regressions being in the interval [0.5, 1), which implies mean-reverting behaviour. The implication is that there exist long-run equilibrium relationships consistent with economic theory and that the effects of shocks are temporary, although the fact that fractional cointegration (rather than standard cointegration) holds means that the adjustment process is much slower, and that, therefore, the overall costs of deviations from equilibrium are bigger than standard cointegration approaches would estimate. This is an important result that should be taken into account when formulating policies and deciding on policy actions. It also provides an explanation for the mixed evidence reported in other papers that only allow for integer degrees of differentiation and therefore do not model long-memory properties. However, it is important to mention that, owing to the variety of methods employed, some of them parametric and others semiparametric, along with the sensitivity of the semiparametric results to the bandwidth parameters, the evidence presented in this study is not entirely conclusive. This is something one has to face when working with fractional models in finite samples, due to the fact that the differencing parameter takes real values.

Other recently developed bivariate or multivariate fractional cointegration testing methods based on co-fractional VAR models (e.g. Johansen 2011; Nielsen 2010; Nielsen and Frederiksen 2011) could also be applied. Moreover, our analysis does not take into account other possible features of the data, such as structural breaks and non-linearities. These are important issues whose relevance for fractional integration tests has already been investigated (see, e.g. Diebold and Inoue 2001; Granger and Hyung 2004; Caporale and Gil-Alana 2008). Our future research will consider them in the context of fractional cointegration.