
Information Geometry, Volume 1, Issue 2, pp 215–235

Superharmonic priors for autoregressive models

  • Fuyuhiko Tanaka
Research Paper

Abstract

Tanaka and Komaki (Sankhya Ser A Indian Stat Inst 73-A:162–184, 2011) proposed superharmonic priors in Bayesian time series analysis as an alternative to the famous Jeffreys prior. By definition, the existence of superharmonic priors on a specific time series model with a finite-dimensional parameter is equivalent to that of positive nonconstant superharmonic functions on the corresponding Riemannian manifold endowed with the Fisher metric. For the autoregressive models, whose Fisher metric and its inverse have quite involved forms, we obtain superharmonic priors in an explicit manner. To derive this result, we develop a systematic way of dealing with symmetric polynomials, which are related to Schur functions.

Keywords

Jeffreys prior · Superharmonic priors · Autoregressive models · Noninformative priors · Kullback–Leibler divergence · Fisher metric

1 Introduction

Let us consider a parametric model of the unknown spectral density in time series analysis. In the present paper, we focus on a parametric family of spectral densities
$$\begin{aligned} \mathcal {M}:= \{ S(\omega | \theta ) : \theta \in \Theta \subseteq \mathbf {R}^k \} . \end{aligned}$$
(1)
According to Amari [1], the above parametric model is considered to be a Riemannian manifold when we adopt the parameter \(\theta \) as a coordinate system of \(\mathcal {M} \); we call this manifold a model manifold or, shortly, a model. The Fisher information matrix of a parametric model of spectral densities \(\mathcal {M} \), which is defined by
$$\begin{aligned} g_{ij} := g \left( \frac{\partial }{\partial \theta ^i }, \frac{\partial }{\partial \theta ^j } \right) = \int ^{\pi }_{-\pi }\frac{\mathrm {d} \omega }{4\pi } \frac{\partial _i S (\omega | \theta )}{S(\omega | \theta ) } \frac{\partial _j S (\omega | \theta )}{S(\omega | \theta ) }, \end{aligned}$$
(2)
is adopted as the Riemannian metric. In information geometry, the above model manifold has been investigated [1, 2, 15].

In Bayesian time series analysis, we need a probability distribution over the parameter space \(\Theta \), which is called a prior distribution or, shortly, a prior. As a default prior, the Jeffreys prior [10] is suggested if we have no strong reason to use others [7]. However, Bayesian analysis using the Jeffreys prior does not necessarily bring desirable results when it is not normalized (such a prior is often called an improper prior); see, e.g., Bernardo [5] and references therein.

Although it is impossible to choose a specific prior that is always preferable to the Jeffreys prior, we are able to establish the existence of a class of priors better than the Jeffreys prior in a specific model. Along this line, Tanaka and Komaki [18] proposed superharmonic priors in time series analysis. They considered the estimation of spectral densities in the autoregressive moving average (ARMA) model and its submodels, following previous work on Bayesian predictive densities by Komaki [12]. When there exists a superharmonic prior on a stationary ARMA model, the Bayesian spectral density estimator based on a superharmonic prior has better performance than that based on the Jeffreys prior in an asymptotic setting. Indeed, Tanaka and Komaki [17] found a superharmonic prior in the second-order autoregressive (AR) model and validated their result by numerical simulation.

In the present paper, we explicitly give superharmonic priors in the p-th order AR models (AR(p) models) with a full derivation. Here we emphasize two points. First, a model manifold does not always admit a superharmonic prior, and finding superharmonic priors in a model manifold is not a routine task like the calculation of the Jeffreys prior (which, after all, reduces to the calculation of the determinant of the Fisher metric). Second, our result is significant not only for Bayesian time series analysis but also for information geometry (differential geometry). By definition, a superharmonic prior is a geometrical entity because it is invariant under any coordinate transformation. In addition, the existence of a superharmonic prior on a model manifold with the Fisher metric is equivalent to that of a positive nonconstant superharmonic function on the corresponding Riemannian manifold. The latter is deeply related to global properties of the Riemannian manifold (volume growth rate) [3].

In the next section, we briefly review our notation for AR model manifolds. For statistical model manifolds and differential geometric concepts in statistics, see, e.g., Amari and Nagaoka [2]. In Sect. 3, we present our main result, the explicit form of a superharmonic prior in the AR model. Some discussions follow in Sect. 4. A detailed proof is given in the Appendix. It is straightforward but still needs a systematic way of dealing with many irreducible rational functions.

2 Basic definition and notation

2.1 Fisher metric on the AR model manifold

Autoregressive (AR) models are widely known in the field of time series analysis (see, e.g., Brockwell and Davis [8]) and are defined as follows. A p-th order AR model with AR parameters \(a_1, \dots , a_p\) is defined by
$$\begin{aligned} x_{t} = - \sum _{i=1}^{p} a_i x_{t-i} + \epsilon _t, \end{aligned}$$
where \(\{ \epsilon _t \}\) is Gaussian white noise with mean 0 and variance \(\sigma ^2\). Now, we define the shift operator z by \(z x_{t} = x_{t+1}\). Then, \(z^{-i} x_t = x_{t-i}\) and
$$\begin{aligned} x_t = H_{a}(z)^{-1} \epsilon _t, \quad H_{a}(z):= \sum _{i=0}^{p}a_i z^{-i} \hbox { with}\ a_0 = 1. \end{aligned}$$
In the present paper only stationary AR models are considered.
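As a concrete illustration of the defining recursion (not part of the original derivation), the following sketch simulates a stationary AR(2) process; the coefficients \(a_1 = -0.5,\ a_2 = 0.06\) are an arbitrary choice whose characteristic roots (0.2 and 0.3) lie inside the unit circle.

```python
import numpy as np

def simulate_ar(a, n, sigma=1.0, burn_in=500, seed=0):
    """Simulate x_t = -sum_{i=1}^p a_i x_{t-i} + eps_t with Gaussian white noise."""
    rng = np.random.default_rng(seed)
    p = len(a)
    x = np.zeros(n + burn_in)
    eps = rng.normal(0.0, sigma, size=n + burn_in)
    for t in range(p, n + burn_in):
        # x[t - p:t][::-1] = [x_{t-1}, ..., x_{t-p}]
        x[t] = -np.dot(a, x[t - p:t][::-1]) + eps[t]
    return x[burn_in:]  # discard the burn-in so the sample is (nearly) stationary

x = simulate_ar([-0.5, 0.06], n=2000)
```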
According to Komaki [11], we calculate the Fisher metric on the AR model manifolds. The explicit form of the spectral density of the AR model is given by
$$\begin{aligned} S( \omega | a_1, \dots , a_p, \sigma ^2 ) = \frac{\sigma ^2}{2 \pi } \frac{1}{|H_a(z)|^2 } ,\quad z = e^{ i \omega }. \end{aligned}$$
Here, we adopt another coordinate system, which brings us a more convenient form to work with. The equation \(z^p H_a(z) = z^p + a_1 z^{p-1} + \cdots + a_{p-1}z + a_p \) is a polynomial of degree p and has p complex roots, \(z_1, z_2, \dots , z_p\) (note that \(|z_i| < 1\) by the stationarity condition). Since \(a_1, a_2, \dots , a_p \) are all real, the roots come in conjugate pairs. Thus, we can order them as \(z_1, \dots , z_q, z_{q+1}, \dots , z_{2q} \in \mathbf {C} , z_{2q+1}, \dots , z_{2q+r} \in \mathbf {R} \) with \(z_{q+j} = \bar{z_j}\ (1 \le j \le q)\) (for simplicity, we assume that there are no multiple roots). The roots \(z_1, z_2, \dots , z_p\) correspond to the original parameters \(a_1, a_2, \dots , a_p \) in a one-to-one manner. Now we introduce a coordinate system \((\theta ^0, \theta ^1, \dots , \theta ^{p})\) using these roots:
$$\begin{aligned} \theta ^0 := \sigma ^2, \quad \theta ^1 := z_1, \quad \theta ^2 := z_2, \, \dots \, ,\theta ^p := z_p . \end{aligned}$$
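The one-to-one correspondence between the coefficients and the roots can be checked numerically; a small sketch (the coefficient values are arbitrary):

```python
import numpy as np

# monic characteristic polynomial z^p + a_1 z^{p-1} + ... + a_p, here p = 2
a = [1.0, -0.5, 0.06]              # [a_0, a_1, a_2] with a_0 = 1
roots = np.roots(a)                # the roots z_1, ..., z_p
assert np.all(np.abs(roots) < 1)   # stationarity: |z_i| < 1
a_back = np.poly(roots)            # map the roots back to the coefficients
assert np.allclose(a_back, a)
```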
In the remainder of the paper, indices \(I, J, K, \dots \) run over \(0,1, \dots , p\) (from zero) and indices \(i, j, k, \dots \) run over \(1, 2, \dots , p\) (from one). The formal complex derivatives are defined by
$$\begin{aligned} \frac{\partial }{\partial z}:= & {} \frac{1}{2} \left( \frac{\partial }{\partial x } + i \frac{\partial }{\partial y} \right) , \\ \frac{\partial }{\partial \bar{z}}:= & {} \frac{1}{2} \left( \frac{\partial }{\partial x } - i \frac{\partial }{\partial y } \right) , \end{aligned}$$
where x and y are the real and imaginary parts of z, respectively. See, for example, Gunning and Rossi [9]. Since the conjugate complex coordinates \(z_i\) and \(\bar{z_i} \) correspond to \(x_{i}\) and \(y_{i}\) in a one-to-one manner, each quantity can be evaluated in the original real coordinates if necessary. The index i and the imaginary unit \(i:=\sqrt{-1}\) often appear simultaneously, but they are clearly distinguished from the context.
In the coordinate system given above, from the formula (2), the Fisher metric \(g_{IJ} \) is
$$\begin{aligned} g_{00} = \frac{1}{2\sigma ^{4}}, \quad g_{0i} = 0, \quad g_{ij} = \frac{1}{1 - z_i z_j}, \end{aligned}$$
(3)
see [11].
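With real roots and \(\sigma ^2 = 1\), the AR block of the Fisher metric in the root coordinates is \(g_{ij} = 1/(1 - z_i z_j)\) (Komaki [11]). This can be spot-checked numerically from the defining integral (2): with \(H_a(z) = \prod _k (1 - z_k z^{-1})\), one has \(\partial _i \log S = e^{-i\omega }/(1 - z_i e^{-i\omega }) + e^{i\omega }/(1 - z_i e^{i\omega })\). A sketch (real roots only, arbitrary values):

```python
import numpy as np

def fisher_ar_block(roots, n_grid=4096):
    """Numerically evaluate formula (2) for the AR spectral density
    in real root coordinates."""
    w = np.linspace(-np.pi, np.pi, n_grid, endpoint=False)
    e = np.exp(-1j * w)
    # d log S / d z_i, real-valued for real roots
    d = [2 * (e / (1 - z * e)).real for z in roots]
    p = len(roots)
    g = np.empty((p, p))
    for i in range(p):
        for j in range(p):
            # int_{-pi}^{pi} dw/(4 pi) (...) = (grid mean) / 2
            g[i, j] = np.mean(d[i] * d[j]) / 2
    return g

roots = np.array([0.3, -0.4, 0.5])
g_num = fisher_ar_block(roots)
assert np.allclose(g_num, 1 / (1 - np.outer(roots, roots)))
```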

2.2 Jeffreys prior

In Bayesian analysis, it is suggested to use the Jeffreys prior as a noninformative prior [4, 7, 10]. In information geometry, the Jeffreys prior is essentially a volume element on the model manifold and has some geometric meanings [2, 14]. The explicit form is given by
$$\begin{aligned} \pi _{\text {J}} (\theta ) \propto \sqrt{ \det ( g_{IJ}) }, \end{aligned}$$
where \(g_{IJ}\) is a Riemannian metric (Fisher metric).
In the AR model, after straightforward calculation, we obtain the explicit form in the above coordinate.
$$\begin{aligned} \pi _{\text {J}} (\theta ) \propto \frac{1}{\sigma ^{2} } \left| \frac{ \prod _{i<j} (z_i - z_j)^2 }{ \prod _{i=1}^{p} \prod _{j=1}^{p} (1-z_i z_j) } \right| ^{\frac{1}{2} }, \qquad \sigma >0,\ |z_{1}|<1, \dots , |z_{p}| <1 \end{aligned}$$
(4)
We are able to evaluate the order at which the integral of this function diverges near the boundary of the parameter space. In particular, the integral \( \int _{ 1-\epsilon< |z| < 1}\frac{ 1}{ 1 - |z| } \mathrm {d}z \mathrm {d}\bar{z}\) diverges for any positive constant \(\epsilon \). By using this, the Jeffreys prior is shown to be improper (i.e., it is not a probability density), even when we set \(\sigma =1\), if \(p \ge 2\). As pointed out by many authors [4, 5, 10], the Jeffreys prior often becomes improper, and Bayesian analysis based on an improper prior distribution could be improved by that based on another prior distribution.
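The boundary behavior can be seen numerically: evaluating (4) (up to the normalizing constant, with \(\sigma = 1\)) while one root approaches the unit circle, the density grows without bound. An illustrative sketch with arbitrary root values:

```python
import numpy as np

def jeffreys_ar(roots):
    """Evaluate the Jeffreys prior (4), up to a constant, with sigma = 1."""
    z = np.asarray(roots, dtype=complex)
    p = len(z)
    num = np.prod([(z[i] - z[j]) ** 2
                   for i in range(p) for j in range(i + 1, p)])
    den = np.prod(1 - np.outer(z, z))      # contains the factors (1 - z_i^2)
    return np.sqrt(np.abs(num / den))

# the density blows up as the root r approaches the boundary |z| = 1
vals = [jeffreys_ar([r, -0.5]) for r in (0.9, 0.99, 0.999)]
```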

2.3 Superharmonic prior

In Bayesian time series analysis, superharmonic priors were proposed as an alternative to the Jeffreys prior in Tanaka and Komaki [18]. In the present article, we give a complete derivation of superharmonic priors in the AR(p) models. Before stating our main result, we first describe the general definition of a superharmonic prior.

Let \(\mathcal {M}\) denote a Riemannian manifold endowed with a Riemannian metric \(g_{IJ}\). For simplicity, we assume a global coordinate system \(\Theta \subset \mathbf {R}^{k}\). A scalar function \(\phi (\theta )\) on \(\mathcal {M}\) is called a superharmonic function if it satisfies
$$\begin{aligned} \Delta \phi (\theta ) \le 0 \quad \forall \theta \in \Theta , \end{aligned}$$
where \(\Delta \) is the Laplace–Beltrami operator. The Laplace–Beltrami operator is defined by
$$\begin{aligned} \Delta \phi := \frac{1}{\sqrt{g} } \frac{\partial }{\partial \theta ^{I} } \left( \sqrt{g} g^{IJ} \frac{ \partial }{\partial \theta ^{J}} \phi \right) , \end{aligned}$$
where \(g^{IJ}\) is the inverse of \(g_{IJ}\) and \(g := \det ( g_{IJ})\). If a superharmonic function is positive, i.e., \(\phi (\theta ) > 0,\ \forall \theta \in \Theta \), then it is called a positive superharmonic function.
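As a standard illustration of this operator (on the hyperbolic upper half-plane, not on the AR manifold), the metric \(g_{IJ} = \mathrm{diag}(1/y^2, 1/y^2)\) gives \(\Delta = y^2(\partial _x^2 + \partial _y^2)\), and \(\phi = y^{s}\) satisfies \(\Delta \phi = s(s-1)\phi \), hence is a positive nonconstant superharmonic function for \(0 < s < 1\). A symbolic sketch:

```python
import sympy as sp

x, y, s = sp.symbols('x y s', positive=True)
coords = [x, y]
g = sp.Matrix([[1/y**2, 0], [0, 1/y**2]])   # hyperbolic (Poincare) metric
ginv, detg = g.inv(), g.det()
phi = y**s

# Laplace-Beltrami: (1/sqrt(g)) d_I ( sqrt(g) g^{IJ} d_J phi )
lap = sum(sp.diff(sp.sqrt(detg) * ginv[i, j] * sp.diff(phi, coords[j]), coords[i])
          for i in range(2) for j in range(2)) / sp.sqrt(detg)
assert sp.simplify(lap - s * (s - 1) * phi) == 0
```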

Note that a positive constant \(\phi (\theta ) =c\ (> 0)\) is a trivial superharmonic function because \(\Delta \phi =0\). As we shall see later, we are interested only in positive nonconstant superharmonic functions. The following lemmas show that once we find one specific positive nonconstant superharmonic function, we obtain a whole class of them.

Lemma 2.1

Let \(\phi \) be a positive nonconstant superharmonic function. Then, for every \(s\) with \(0 < s \le 1\), \(\phi ^{s}\) is also a positive nonconstant superharmonic function.

Proof

It is easily seen that
$$\begin{aligned} \Delta \left( \phi ^{s}\right)&= s \phi ^{s-1} \Delta \phi - s(1-s) \phi ^{s-2} g^{IJ} \frac{ \partial \phi }{\partial \theta ^{I}}\frac{ \partial \phi }{\partial \theta ^{J}} \end{aligned}$$
(5)
When \( 0 < s \le 1\), the right-hand side of (5) is nonpositive.\(\square \)

By definition, the following lemma also holds.

Lemma 2.2

Let \(\phi _{1}, \dots , \phi _{k}\) be positive superharmonic functions and let \(c_{1}, \dots , c_{k}\) be positive constants. At least one of them, say \(\phi _{1}\), is assumed to be nonconstant. Then, \( \psi = \sum _{j=1}^{k} c_{j} \phi _{j}\) is a positive nonconstant superharmonic function.

Now we go back to our formulation. Let us define a superharmonic prior on a model manifold. When a model manifold endowed with the Fisher metric has a positive nonconstant superharmonic function \(\phi (\theta )\), we call \(\pi _{\text {H}}(\theta ):= \pi _{\text {J}}(\theta )\phi (\theta )\) a superharmonic prior. In the following subsection, we briefly see why superharmonic priors are preferable to the Jeffreys prior.

Unfortunately, not all model manifolds with the Fisher metric admit a superharmonic prior. For example, there is no superharmonic prior in a one-dimensional model. Whether a superharmonic prior exists or not depends on the global properties of the model manifold [12].

Since the AR(p) model manifold has a very nontrivial structure, it seemed overwhelmingly difficult to find superharmonic priors. However, as we see in the next section, we succeeded in giving superharmonic priors in the AR(p) model manifold according to the above definition, which we hope is also significant to the community of information geometry (and differential geometry).

2.4 Statistical meaning of superharmonic priors

Some readers may wonder why superharmonic priors are preferable to the Jeffreys prior. In this subsection, we briefly explain this point, following Tanaka and Komaki [18]. Let us consider the estimation of a spectral density within the parametric model (1). We evaluate closeness with the Kullback–Leibler divergence for spectral densities.

The Bayes spectral density with respect to a proper prior \(\pi \) is defined by
$$\begin{aligned} S_{\pi }(\omega ) := \int S(\omega | \theta ) \pi (\theta | x^{n} ) \mathrm {d}\theta , \end{aligned}$$
(6)
where \(x^{n}\) is the time series data of length n. It minimizes the average error
$$\begin{aligned} E_{\theta }\left[ E_{X^{n}} [ D( S(\omega | \theta ) || \hat{S}(\omega ; X^{n}) ) ] \right] , \\ \end{aligned}$$
where
$$\begin{aligned} D( S(\omega | \theta ) || \hat{S}(\omega ; x^{n} ) )&:= \int _{-\pi }^{\pi } \frac{\mathrm {d} \omega }{4\pi } \left\{ \frac{ S(\omega | \theta ) }{ \hat{S}(\omega ; x^{n} ) } \right. \\&\quad \left. -\,1 - \log \left( \frac{S(\omega | \theta ) }{ \hat{S}(\omega ; x^{n} ) } \right) \right\} \end{aligned}$$
is the Kullback–Leibler divergence from the true spectral density \(S(\omega | \theta )\) to the estimated spectral density \(\hat{S}(\omega ; x^{n} ) \). If we have no information on the parameter \(\theta \), then the first natural choice is the Jeffreys prior. However, the optimality of the Bayes estimate does not necessarily hold when the prior is improper. As we mentioned in Sect. 2.3, the Jeffreys prior is improper in the AR model.
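This divergence is easy to evaluate numerically. The sketch below compares two AR(2) spectral densities on a frequency grid; the coefficient values are arbitrary, and `np.polyval` is applied to the reversed coefficient vector so that \(H_a(e^{i\omega }) = \sum _l a_l e^{-il\omega }\).

```python
import numpy as np

def ar_spectrum(a, sigma2, w):
    """S(w | a, sigma^2) = sigma^2 / (2 pi |H_a(e^{iw})|^2), a = [1, a_1, ..., a_p]."""
    u = np.exp(-1j * w)          # u = z^{-1} on the unit circle
    H = np.polyval(a[::-1], u)   # sum_l a_l u^l
    return sigma2 / (2 * np.pi * np.abs(H) ** 2)

def kl_spec(S1, S2, w):
    """int dw/(4 pi) { S1/S2 - 1 - log(S1/S2) }, approximated on the grid."""
    r = S1 / S2
    return np.mean(r - 1 - np.log(r)) / 2   # (grid mean) * 2 pi / (4 pi)

w = np.linspace(-np.pi, np.pi, 4096, endpoint=False)
S_true = ar_spectrum([1.0, -0.5, 0.06], 1.0, w)
S_hat = ar_spectrum([1.0, -0.4, 0.05], 1.0, w)
d = kl_spec(S_true, S_hat, w)
```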
Tanaka and Komaki [18] showed that every Bayes spectral density based on a superharmonic prior \(\pi _{\text {H}}\) satisfies the following.
$$\begin{aligned}&n^{2} \left\{ E_{X^{n}} [ D( S(\omega | \theta ) || S_{\pi _{\text {J}}}(\omega ) )] - E_{X^{n}} [ D( S(\omega | \theta ) || S_{\pi _{\text {H}}} (\omega ) ) ] \right\} \\&\quad = \frac{1}{2} g^{IJ}(\theta ) \left( \partial _{I}\log \frac{ \pi _{\text {H}}( \theta )}{ \pi _{\text {J}} (\theta ) } \right) \left( \partial _{J}\log \frac{ \pi _{\text {H}}( \theta )}{ \pi _{\text {J}} (\theta ) } \right) \\&\qquad -\, \frac{ \pi _{\text {J}} (\theta ) }{\pi _{\text {H}}( \theta ) } \Delta \left( \frac{ \pi _{\text {H}} (\theta ) }{\pi _{\text {J}}( \theta ) } \right) + O(n^{-1/2} ) \end{aligned}$$
for every \(\theta \in \Theta \). The first term is nonnegative and not identically equal to zero. The second term is nonnegative by the superharmonicity of \(\pi _{\text {H}}/\pi _{\text {J}}\). Thus, the above difference is nonnegative for every \(\theta \in \Theta \) and not identically equal to zero when the higher order terms are negligible.

From the above evaluation, every Bayes spectral density based on a superharmonic prior \(\pi _{\text {H}}\) has better performance than that based on the Jeffreys prior \(\pi _{\text {J}}\). We also emphasize that there is no uniformly best Bayes spectral density in this formulation; rather, there is a good class of Bayes spectral densities. In this sense, superharmonic priors are preferable to the Jeffreys prior. Thus, at least from the theoretical point of view in Bayesian statistics, it is very important to obtain a class of superharmonic priors explicitly.

We also note that this idea goes along essentially the same line as Bayesian predictive densities based on superharmonic priors in Komaki [12]. The approach is completely different from the famous reference prior approach by Bernardo [5, 6], which considers the maximization of an information quantity.

3 Superharmonic priors for the AR(p) models

In this section, we present superharmonic priors in the AR(p) model manifold with a full derivation. Before stating our main result, we slightly simplify our problem. We decompose \(\Delta \) into two parts: one part involves \(\theta ^0 = \sigma ^2 \) and the other involves \(\theta ^1, \ldots , \theta ^p\). Thus, without loss of generality, we can set \(\sigma ^2=1\) (see Tanaka and Komaki [17] for details). Then, the parameter region of the AR(p) model becomes
$$\begin{aligned} \Omega := \{ \theta = (\theta ^{1},\dots , \theta ^{p}) = (z_1, \dots , z_p): \ |z_1|<1, |z_2|< 1,\dots , |z_p|<1 \}. \end{aligned}$$
due to the stationarity condition.
Now we give superharmonic priors for the AR(p) models. We begin with a positive nonconstant superharmonic function on the AR model manifold. The general formula is given by
$$\begin{aligned} \phi (\theta ) = \prod _{i<j}(1-z_i z_j). \end{aligned}$$
(7)
For example, when \(p=3\),
$$\begin{aligned} \phi = (1 -z_1 z_2) (1 -z_1 z_3) (1 - z_2 z_3). \end{aligned}$$
We will see that the function (7) is not only a positive superharmonic function but also an eigenfunction of the Laplace–Beltrami operator \(\Delta \).
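The positivity claimed here (and proved below) can be spot-checked numerically for a conjugate-closed set of roots inside the unit disc; an illustrative sketch with arbitrary root values:

```python
import numpy as np

def phi_prior_factor(roots):
    """phi = prod_{i<j} (1 - z_i z_j), formula (7)."""
    z = np.asarray(roots, dtype=complex)
    p = len(z)
    return np.prod([1 - z[i] * z[j]
                    for i in range(p) for j in range(i + 1, p)])

# AR(3): one real root and one complex conjugate pair, all inside |z| < 1
v = phi_prior_factor([0.5, 0.4 + 0.3j, 0.4 - 0.3j])
assert abs(v.imag) < 1e-12 and v.real > 0
```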

Theorem 3.1

When \(p \ge 2\), for the above \(\phi \) (7),
$$\begin{aligned} \Delta \phi = -\frac{p(p-1)}{2} \phi \end{aligned}$$
(8)
holds. Thus, \(\phi \) is a positive nonconstant superharmonic function for the AR(p) model manifold.

Proof

First we check the positivity of \(\phi \), because we have introduced formal complex variables. Recall that we assume the stationarity condition, which says
$$\begin{aligned} |z_{i}| < 1 . \end{aligned}$$
For all real roots (\(z_{i} \in \mathbf {R}\)), the corresponding factors of \(\phi \) are clearly positive. If there is a complex conjugate pair of roots \(z_{i}, z_{i+q}=\bar{z_{i}}\), then such terms are rewritten as
$$\begin{aligned} \prod _{k} (1 - z_{i} z_{k}) (1- z_{i+q} z_{k})= & {} \prod _{k:\ z_{k} \in \mathbf {R}} (1- z_{i} z_{k}) (1- z_{i+q} z_{k}) \\&{} \times \prod _{k:\ z_{k} \in \mathbf {C}} (1 - z_{i} z_{k}) (1 - z_{i+q} z_{k}) \end{aligned}$$
If \(z_{k} \in \mathbf {R}\), we obtain
$$\begin{aligned} (1- z_{i} z_{k}) (1- z_{i+q} z_{k}) = (1 - z_{i} z_{k}) (1 - \overline{ z_{i} z_{k} } ) = | 1- z_{i} z_{k}|^2 > 0, \end{aligned}$$
where the strict inequality follows from \(|z_{i} z_{k}| < 1\). If \(z_{k} \in \mathbf {C}\), gathering the terms including the complex conjugate pair \(z_{k+q} = \bar{z_{k}}\), we obtain
$$\begin{aligned}&(1 - z_{i} z_{k}) (1 - z_{i+q} z_{k}) (1 - z_{i} z_{k+q}) (1 - z_{i+q} z_{k+q}) \\&\quad = (1- z_{i} z_{k}) (1 - \overline{ z_{i} z_{k}}) \times (1 - \bar{z_{i}} z_{k}) (1 - \overline{ \bar{z_{i}} z_{k}} ) \\&\quad = | 1- z_{i} z_{k} |^2 | 1 - \bar{z_{i}} z_{k} |^2 > 0. \end{aligned}$$
Thus,
$$\begin{aligned} \prod _{k} (1 - z_{i} z_{k}) (1- z_{i+q} z_{k}) > 0, \end{aligned}$$
and hence \(\phi > 0\).
Next, we show Eq. (8). We set \(g := \det g_{ij}\) (here, recall that the indices \(i,j, \dots \) run over \(1,2, \dots , p\)). Then, \(\frac{\Delta \phi }{ \phi }\) is rewritten in the following form:
$$\begin{aligned} \frac{\Delta \phi }{ \phi }= & {} \frac{1}{\sqrt{g}} \frac{ \partial _{i} \left( \sqrt{g} \partial ^{i} \phi \right) }{ \phi } \\= & {} \frac{1}{2} (\partial _{i} \log g )\partial ^{i} \log \phi + \frac{ \partial _{i} (\phi \partial ^{i} \log \phi ) }{ \phi } \\= & {} \frac{1}{2} (\partial _{i} \log g) \partial ^{i} \log \phi + ( \partial _{i} \log \phi ) (\partial ^{i} \log \phi ) + \partial _{i} \partial ^{i}\log \phi \\= & {} f_{i} \partial ^{i} \log \phi + \partial _{i}\partial ^{i}\log \phi , \end{aligned}$$
where we set \(f_{i} := \frac{1}{2} \partial _{i} \log g + \partial _{i} \log \phi \). Now we calculate terms \(f_{i}, \partial _{i}\log \phi \), and \(\partial ^{i}\log \phi \).
First we calculate \(\log (\sqrt{g} \phi ) \).
$$\begin{aligned} \log \left( \sqrt{g} \phi \right)= & {} \log \left[ \left| \frac{ \prod _{i<j} (z_i - z_j)^2 }{ \prod _{i=1}^{p} \prod _{j=1}^{p} (1-z_i z_j) } \right| ^{\frac{1}{2} } \times \prod _{j>i} (1-z_i z_j) \right] \\= & {} \frac{1}{2} \log \prod _{i<j} |z_i -z_j|^2 - \frac{1}{2} \log \left\{ \prod _{i=1}^{p}(1 - z_{i}^2)\right\} \\= & {} \log |{\varDelta } | - \frac{1}{2} \log \left\{ \prod _{i=1}^{p} (1 - z_{i}^2) \right\} , \end{aligned}$$
where \({\varDelta } \) is the Vandermonde determinant (see Appendix). Thus,
$$\begin{aligned} f_{i}= & {} \frac{1}{2} ( \partial _{i} \log g) + \partial _{i} \log \phi \\= & {} \partial _{i}\log ( \sqrt{g} \phi ) \\= & {} \partial _{i} \log |{\varDelta } | + \frac{z_{i}}{ 1- z_{i}^2 }. \end{aligned}$$
From now on, since the summation convention would be irregular here, we indicate summations explicitly by \(\sum \). We evaluate \(\partial _{i} \log \phi \):
$$\begin{aligned} \partial _{i} \log \phi= & {} \frac{\partial }{\partial z_{i} } \left( \sum _{j>k} \log (1 - z_k z_j) \right) \\= & {} \sum _{k \ne i} \frac{-z_{k}}{ 1- z_{k} z_{i}} \\= & {} \sum _{k =1}^{p} \frac{-z_{k}}{ 1- z_{k} z_{i}} + \frac{ z_{i} }{ 1- z_i {}^2}. \end{aligned}$$
Finally, we rewrite \(\partial ^{j}\log \phi \).
$$\begin{aligned} g^{ji} \partial _{i} \log \phi= & {} \sum _{i=1}^{p} \sum _{k=1}^{p}g^{ji} \left( \frac{ -z_k }{ 1 -z_k z_i } \right) + \sum _{i=1}^{p} g^{ji} \left( \frac{ z_i }{ 1- z_i{}^2} \right) \\= & {} -z_{j} + \sum _{i=1}^{p} g^{ji} \left( \frac{ z_i }{ 1- z_i{}^2} \right) . \end{aligned}$$
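The second equality above follows because \(-z_k/(1-z_kz_i) = -z_k g_{ki}\) with the AR-block metric \(g_{ki} = 1/(1-z_kz_i)\), so contracting with \(g^{ji}\) leaves \(-z_j\). A numerical spot check with real roots (arbitrary values):

```python
import numpy as np

z = np.array([0.3, -0.4, 0.5])
g = 1.0 / (1.0 - np.outer(z, z))     # g_{ij} = 1/(1 - z_i z_j)
ginv = np.linalg.inv(g)

# sum_i g^{ji} sum_k ( -z_k / (1 - z_k z_i) )  should equal  -z_j
inner = np.array([np.sum(-z / (1 - z * zi)) for zi in z])
lhs = ginv @ inner
assert np.allclose(lhs, -z)
```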
Thus, putting these terms together, we obtain
$$\begin{aligned} \frac{\Delta \phi }{ \phi }= & {} \sum _{i=1}^{p} \left[ \left( \partial _{i } \log |{\varDelta } | + \frac{z_{i}}{ 1- z_{i}^2} \right) \times \left\{ -z_{i} + \sum _{j=1}^{p} g^{ij} \left( \frac{ z_{j} }{ 1- z_{j}^2 } \right) \right\} \right] \\&+\, \sum _{i=1}^{p} \partial _{i} \left\{ -z_{i} + \sum _{j=1}^{p} g^{ij} \left( \frac{ z_{j} }{ 1- z_{j}^2 } \right) \right\} \\= & {} - \sum _{i=1}^{p} z_{i} \left( \frac{ \partial _{i} {\varDelta } }{ {\varDelta } } \right) - \sum _{i=1}^{p} \partial _{i} z_{i} \\&+ \,\left[ \sum _{i=1}^{p} \sum _{j=1}^{p} \left( \frac{ \partial _{i} {\varDelta } }{ {\varDelta } } \right) g^{i j} \left( \frac{ z_{j} }{ 1- z_{j}^2 } \right) + \sum _{i=1}^{p} \sum _{j=1}^{p}\partial _{i} \left\{ g^{i j} \left( \frac{ z_{j} }{ 1- z_{j}^2 } \right) \right\} \right] \\&+ \, \left\{ - \sum _{i=1}^{p} z_{i} \left( \frac{ z_{i} }{ 1- z_{i}^2} \right) + \sum _{i=1}^{p} \sum _{j=1}^{p} g^{i j} \left( \frac{ z_{i} }{ 1- z_{i}^2} \right) \left( \frac{ z_{j} }{ 1- z_{j}^2} \right) \right\} \end{aligned}$$
The first term is shown to be equal to \(-\frac{p(p-1)}{2}\). The second term is clearly equal to \(-p\). The other terms are calculated in the Appendix. The final result is as follows.

Lemma 3.1

$$\begin{aligned} (A):= & {} - \sum _{i=1}^{p} z_{i} \left( \frac{ z_{i} }{ 1-{ z_{i} }^2 } \right) + \sum _{i=1}^{p}\sum _{j=1}^{p}g^{ij} \left( \frac{ z_{i} }{ 1-{z_{i}}^2 } \right) \left( \frac{ z_{j} }{ 1-{z_{j}}^2 } \right) \\= & {} {\left\{ \begin{array}{ll} \frac{1}{2} p &{}\quad \hbox {even}\ p \\ \frac{1}{2} (p-1) &{}\quad \hbox {odd}\ p \end{array}\right. } \end{aligned}$$

Lemma 3.2

$$\begin{aligned} (B):= & {} \sum _{i=1}^{p} \sum _{j=1}^{p} \left[ \left( \frac{ \partial _{i} {\varDelta } }{ {\varDelta } } \right) g^{i j} \left( \frac{ z_{j} }{ 1- z_{j}^2 } \right) + \frac{\partial }{\partial z_{i}} \left\{ g^{ij} \left( \frac{ z_{j} }{ 1- z_{j}^2 } \right) \right\} \right] \\= & {} {\left\{ \begin{array}{ll} \frac{1}{2} p &{}\quad \hbox {even}\ p \\ \frac{1}{2} (p+1) &{}\quad \hbox {odd}\ p \end{array}\right. } \end{aligned}$$

Thus, when p is even, \(\frac{\Delta \phi }{ \phi } = -\frac{p(p-1)}{2} + (-p)+ \frac{1}{2} p + \frac{1}{2} p = -\frac{p(p-1)}{2}\). When p is odd, we also obtain the same result. \(\square \)
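For p = 2, Theorem 3.1 can be verified symbolically: with the AR-block metric \(g_{ij} = 1/(1 - z_i z_j)\) (real roots, \(\sigma ^2 = 1\)) and \(\phi = 1 - z_1 z_2\), a direct computation gives \(\Delta \phi = -\phi \). A sketch of this check:

```python
import sympy as sp

z1, z2 = sp.symbols('z1 z2', real=True)
zs = [z1, z2]
g = sp.Matrix([[1 / (1 - zi * zj) for zj in zs] for zi in zs])
ginv = g.inv()
detg = sp.cancel(g.det())
phi = 1 - z1 * z2                    # formula (7) for p = 2

# Delta phi = g^{ij} d_i d_j phi + ( d_i g^{ij} + g^{ij} d_i log(det g)/2 ) d_j phi
lap = sum(ginv[i, j] * sp.diff(phi, zs[i], zs[j])
          for i in range(2) for j in range(2))
lap += sum((sp.diff(ginv[i, j], zs[i])
            + ginv[i, j] * sp.diff(sp.log(detg), zs[i]) / 2) * sp.diff(phi, zs[j])
           for i in range(2) for j in range(2))

# Theorem 3.1 predicts Delta phi = -p(p-1)/2 * phi = -phi for p = 2
residual = sp.cancel(sp.together(lap + phi))
assert residual == 0
```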

Now we obtain the final result.

Theorem 3.2

When \(p \ge 2\), a superharmonic prior for the AR(p) process is given by
$$\begin{aligned} \pi _\mathrm{{H}} = \phi (z_{1}, \dots , z_{p}) \pi _\mathrm{{J}} \propto \left| \frac{ \prod _{i < j} (z_{i} - z_{j})^2 }{ \prod _{i=1}^{p} (1 - z_{i}^2) } \right| ^{\frac{1}{2} }, \end{aligned}$$
(9)
where the parameters \(\theta ^{i} = z_{i},\ i=1,\dots , p\), are the roots of the characteristic equation \(\sum _{l=0}^{p}a_{l} z^{p-l}=0\) (see Sect. 2).

Proof

Using Eq. (4) and the result of Theorem 3.1, we easily obtain Eq. (9). \(\square \)

From Lemmas 2.1 and 2.2, once we find one superharmonic prior, we obtain many kinds of superharmonic priors.

4 Discussions

The statistical significance of our result is discussed in previous works. In Tanaka and Komaki [17], a numerical simulation of spectral density estimation for the AR(2) process is also presented. When \(p \ge 3\), numerical simulation is not so trivial because our expression of a superharmonic prior (9) is given in the formal complex coordinates (\(z_{1}, \dots , z_{p}\)). A detailed analysis will be presented on another occasion.

From the differential geometric viewpoint, our result is deeply related to the theorem of Aomoto [3] connecting a global property of a Riemannian manifold with a local one. He showed that a sufficient condition for the existence of a positive nonconstant superharmonic function is that the sectional curvature is negative for every plane and at every point.

Tanaka and Komaki [16] showed that the sectional curvature of the AR model manifold (\(p \ge 3\)) is strictly positive for some plane at some point. Thus, the AR model manifold (\(p \ge 3\)) does not satisfy Aomoto's sufficient condition. In spite of this, the AR model manifold admits a positive nonconstant superharmonic function (Theorem 3.1). As far as the author knows, this is the first nontrivial (and statistically meaningful) example exhibiting the gap between local properties of the sectional curvature and the existence of positive nonconstant superharmonic functions (a global property). In this sense, our result seems very interesting for pure mathematics as well.

After finishing the original manuscript, Prof. Oda pointed out the relation of our derivation in the Appendix to some topics on orthogonal polynomials, in particular Schur functions [13]. It would be possible to rewrite our derivation using Schur functions, but this does not seem to make our proof much shorter. To be consistent with previous work [16], we keep the derivation in the original form.


Acknowledgements

This research was supported by JST PRESTO. The author thanks Prof. Komaki for his sincere encouragement and is also grateful to several mathematicians beyond the statistical community for many valuable comments at several seminars and workshops.

References

  1. Amari, S.: Differential geometry of a parametric family of invertible linear systems—Riemannian metric, dual affine connections, and divergence. Math. Syst. Theory 20, 53–82 (1987)
  2. Amari, S., Nagaoka, H.: Methods of Information Geometry. AMS, Oxford (2000)
  3. Aomoto, K.: L'analyse harmonique sur les espaces riemanniens, à courbure riemannienne négative I. J. Fac. Sci. Univ. Tokyo 13, 85–105 (1966)
  4. Berger, J.: Statistical Decision Theory and Bayesian Analysis, 2nd edn. Springer, New York (1985)
  5. Bernardo, J.M.: Reference posterior distributions for Bayesian inference. J. R. Stat. Soc. B 41, 113–147 (1979)
  6. Bernardo, J.M.: Reference analysis. In: Dey, K.K., Rao, C.R. (eds.) Handbook of Statistics, vol. 25, pp. 17–90. Elsevier, Amsterdam (2005)
  7. Box, G., Jenkins, G.: Time Series Analysis: Forecasting and Control, 3rd edn. Prentice-Hall, Englewood Cliffs (1994)
  8. Brockwell, P., Davis, R.: Time Series: Theory and Methods. Springer, New York (1991)
  9. Gunning, C., Rossi, H.: Analytic Functions of Several Complex Variables. Prentice Hall, Englewood Cliffs (1965)
  10. Jeffreys, H.: Theory of Probability. Oxford University Press, Oxford (1961)
  11. Komaki, F.: Estimating method for parametric spectral densities. J. Time Ser. Anal. 20, 31–50 (1999)
  12. Komaki, F.: Shrinkage priors for Bayesian prediction. Ann. Stat. 34, 808–819 (2006)
  13. Macdonald, I.: Symmetric Functions and Hall Polynomials, 2nd edn. Oxford Science Publications, Oxford (1995)
  14. Takeuchi, J., Amari, S.: \(\alpha \)-parallel prior and its properties. IEEE Trans. Inf. Theory 51(3), 1011–1023 (2005)
  15. Tanaka, F.: Curvature form on statistical model manifolds and its application to Bayesian analysis. J. Stat. Appl. Probab. 1, 35–43 (2012)
  16. Tanaka, F., Komaki, F.: The sectional curvature of AR model manifolds. Tensor 64, 131–143 (2003)
  17. Tanaka, F., Komaki, F.: A superharmonic prior for the autoregressive process of the second order. J. Time Ser. Anal. 29, 444–452 (2008)
  18. Tanaka, F., Komaki, F.: Asymptotic expansion of the risk difference of the Bayesian spectral density in the autoregressive moving average model. Sankhya Ser. A Indian Stat. Inst. 73-A, 162–184 (2011)

Copyright information

© Springer Nature Singapore Pte Ltd. 2017

Authors and Affiliations

  1. Department of Systems Innovation, Graduate School of Engineering Science, Osaka University, Toyonaka, Japan
