Abstract
We treat the problem of searching for hidden multidimensional independent autoregressive processes (autoregressive independent process analysis, ARIPA). Independent subspace analysis (ISA) can be used to solve the ARIPA task. The so-called separation theorem simplifies the ISA task considerably: it allows one to reduce the task to a one-dimensional blind source separation task followed by the grouping of the coordinates. However, the grouping of the coordinates still involves two combinatorial problems: (a) the number of independent subspaces and their dimensions, and (b) the permutation of the estimated coordinates must be determined. Here, we generalize the separation theorem. We also present a non-combinatorial procedure that, under certain conditions, can treat both combinatorial problems. Numerical simulations have been conducted. We investigate problems that fulfill the sufficient conditions of the theory as well as others that do not. The success of the numerical simulations indicates that further generalizations of the separation theorem may be feasible.
Notes
 1.
The possibility of such a decomposition principle was suspected by Cardoso [3], who based his conjecture on numerical experiments.
 2.
 3.
W^{m}_{ICA} denotes the component of the separation matrix W_{ICA} that corresponds to the mth subprocess.
References
 1.
Hyvärinen A, Karhunen J, Oja E (2001) Independent component analysis. Wiley, New York
 2.
Cichocki A, Amari S (2002) Adaptive blind signal and image processing. Wiley, New York
 3.
Cardoso J (1998) Multidimensional independent component analysis. In: International conference on acoustics, speech, and signal processing (ICASSP ’98), vol 4. pp 1941–1944
 4.
Akaho S, Kiuchi Y, Umeyama S (1999) MICA: multimodal independent component analysis. In: International joint conference on neural networks (IJCNN ’99), vol 2. pp 927–932
 5.
Vollgraf R, Obermayer K (2001) Multidimensional ICA to separate correlated sources. In: Neural information processing systems (NIPS 2001), vol 14. MIT Press, Cambridge, pp 993–1000
 6.
Bach FR, Jordan MI (2003) Beyond independent components: trees and clusters. J Mach Learn Res 4:1205–1233
 7.
Póczos B, Lőrincz A (2005) Independent subspace analysis using k-nearest neighborhood distances. Artif Neural Netw Formal Models Appl 3697:163–168
 8.
Póczos B, Lőrincz A (2005) Independent subspace analysis using geodesic spanning trees. In: International conference on machine learning (ICML 2005), vol 119. ACM Press, New York, pp 673–680
 9.
Theis FJ (2005) Blind signal separation into groups of dependent signals using joint block diagonalization. In: IEEE international symposium on circuits and systems (ISCAS 2005), vol 6. pp 5878–5881
 10.
Van Hulle MM (2005) Edgeworth approximation of multivariate differential entropy. Neural Comp 17:1903–1910
 11.
Póczos B, Takács B, Lőrincz A (2005) Independent subspace analysis on innovations. In: European conference on machine learning (ECML 2005), vol 3720 LNAI. Springer, Berlin, pp 698–706
 12.
Hyvärinen A (1998) Independent component analysis for time-dependent stochastic processes. In: International conference on artificial neural networks (ICANN ’98). Springer, Berlin, pp 541–546
 13.
Szabó Z, Póczos B, Lőrincz A (2006) Cross-entropy optimization for independent process analysis. In: Independent component analysis and blind signal separation (ICA 2006), vol 3889, LNCS. Springer, Berlin, pp 909–916
 14.
Cheung Y, Xu L (2003) Dual multivariate autoregressive modeling in state space for temporal signal separation. IEEE Trans Syst Man Cybern B 33:386–398
 15.
Theis FJ (2004) Uniqueness of complex and multidimensional independent component analysis. Signal Process 84:951–956
 16.
Rubinstein RY, Kroese DP (2004) The cross-entropy method. Springer, Berlin
 17.
Hardy GH, Ramanujan S (1918) Asymptotic formulae in combinatory analysis. Proc Lond Math Soc 17:75–115
 18.
Uspensky JV (1920) Asymptotic formulae for numerical functions which occur in the theory of partitions. Bull Russian Acad Sci 14:199–218
 19.
Costa JA, Hero AO (2004) Manifold learning using k-nearest neighbor graphs. In: International conference on acoustics, speech and signal processing (ICASSP 2004), vol 4. pp 988–991
 20.
Gray AG, Moore AW (2000) ‘N-Body’ problems in statistical learning. In: Proceedings of NIPS, pp 521–527
 21.
Cover TM, Thomas JA (1991) Elements of information theory. Wiley, New York
 22.
Takano S (1995) The inequalities of Fisher information and entropy power for dependent variables. In: Proceedings of the 7th Japan–Russia symposium on probability theory and mathematical statistics
 23.
Fang KT, Kotz S, Ng KW (1990) Symmetric multivariate and related distributions. Chapman and Hall, London
 24.
Frahm G (2004) Generalized elliptical distributions: theory and applications. PhD thesis, University of Köln
 25.
Gupta AK, Song D (1997) L ^{p}norm spherical distribution. J Stat Plan Inference 60
 26.
Lorenz EN (1963) Deterministic nonperiodic flow. J Atmos Sci 20:130–141
 27.
Amari S, Cichocki A, Yang HH (1996) A new learning algorithm for blind signal separation. Adv Neural Inf Process Syst 8:757–763
 28.
Bach FR, Jordan MI (2002) Kernel independent component analysis. J Mach Learn Res 3:1–48
 29.
Theis FJ (2005) Multidimensional independent component analysis using characteristic functions. In: European signal processing conference (EUSIPCO 2005)
 30.
Neumaier A, Schneider T (2001) Estimation of parameters and eigenmodes of multivariate autoregressive models. ACM Trans Math Softw 27:27–57
 31.
Schneider T, Neumaier A (2001) Algorithm 808: ARfit: a Matlab package for the estimation of parameters and eigenmodes of multivariate autoregressive models. ACM Trans Math Softw 27:58–65
 32.
Hyvärinen A, Oja E (1997) A fast fixedpoint algorithm for independent component analysis. Neural Comput 9:1483–1492
 33.
Hyvärinen A, Hoyer PO (2000) Emergence of phase and shift invariant features by decomposition of natural images into independent feature subspaces. Neural Comput 12:1705–1720
Acknowledgments
This research has been supported by the EC NEST ‘Perceptual Consciousness: Explication and Testing’ grant under contract 043261. Opinions and errors in this manuscript are the author’s responsibility; they do not necessarily reflect those of the EC or other project members.
Appendix
The theorems presented here concern the ISA task obtained after reducing the ARIPA task; thus, here s^{m} = e^{m} (m = 1, …, M). The ISA task also concerns sources, but these sources have the i.i.d. property; hence we shall use the notation s^{m}. In the present work, the differential entropy H is defined with the logarithm of base e.
Appendix 1: The ISA separation theorem (Proof)
The main idea of our ISA separation theorem is that, under certain conditions, the ISA task may be accomplished in two steps. In the first step, ICA is executed. The second step is a search for the optimal permutation of the ICA components.
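The second, grouping step can be illustrated with a small sketch. The dependence measure below (absolute correlation of squared coordinates) and the greedy clustering are illustrative stand-ins chosen for brevity, not the estimators used in this paper; all names and parameters are ours.

```python
import numpy as np

def group_ica_coordinates(y, threshold=0.1):
    """Group ICA output coordinates into subspaces by a dependence proxy.

    y: (D, T) array of one-dimensional ICA outputs.
    Dependence proxy: |correlation| of the squared coordinates. ICA removes
    linear correlations, but coordinates of the same subspace typically
    remain dependent in higher moments, which this proxy picks up.
    Returns the estimated index groups (sorted for reproducibility).
    """
    D = y.shape[0]
    dep = np.abs(np.corrcoef(y ** 2))
    np.fill_diagonal(dep, 0.0)
    unassigned = set(range(D))
    groups = []
    while unassigned:
        group = [unassigned.pop()]
        grew = True
        while grew:  # absorb every coordinate dependent on the current group
            grew = False
            for j in sorted(unassigned):
                if dep[j, group].max() > threshold:
                    group.append(j)
                    unassigned.remove(j)
                    grew = True
        groups.append(sorted(group))
    return sorted(groups)

# demo: two 2-D sources whose coordinates share a common random radius,
# hence are uncorrelated but dependent
rng = np.random.default_rng(0)
n = 20000
r1, r2 = rng.exponential(1.0, n), rng.exponential(1.0, n)
p1, p2 = rng.uniform(0, 2 * np.pi, n), rng.uniform(0, 2 * np.pi, n)
s = np.vstack([r1 * np.cos(p1), r1 * np.sin(p1),
               r2 * np.cos(p2), r2 * np.sin(p2)])
print(group_ica_coordinates(s))  # recovers the two subspaces: [[0, 1], [2, 3]]
```

Real estimators replace the correlation proxy with mutual-information or entropy estimates; the greedy absorption loop plays the role of the permutation search.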
If EPI (see Eq. 17) is satisfied (on S^{L}), then a further inequality holds:
Lemma 1
Suppose that the continuous stochastic variables \(u_1,\ldots,u_L \in{\mathbb{R}}\) satisfy the wEPI condition (see Eq. 19). Then, they also satisfy
Note 4
wEPI holds, for example, for independent variables u_{i}, because independence is not affected by multiplication with a constant.
Proof
Assume that w ∈ S^{L}. Applying ln to condition 19 and using the monotonicity of the ln function, we see that the first inequality holds in the following inequality chain:
Then,

1.
we used the relation [21]:
$$ H(w_iu_i)=H(u_i)+\ln\left(\left|w_i\right|\right) $$(31)for the entropy of the transformed variable. Hence
$$ {\rm e}^{2H(w_iu_i)}={\rm e}^{2H(u_i)+2\ln\left(\left|w_i\right|\right)}= {\rm e}^{2H(u_i)}\cdot {\rm e}^{2\ln\left(\left|w_i\right|\right)}={\rm e}^{2H(u_i)}\cdot w_i^2. $$(32)
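As a quick numerical sanity check of Eqs. 31 and 32 (not part of the proof), one can use the closed-form differential entropy of a Gaussian, H(N(0, σ²)) = ½ ln(2πeσ²); the variable names below are ours:

```python
import math

def gaussian_entropy(sigma):
    """Differential entropy (base e) of N(0, sigma^2): 0.5*ln(2*pi*e*sigma^2)."""
    return 0.5 * math.log(2 * math.pi * math.e * sigma ** 2)

sigma, w = 1.7, -0.6                     # arbitrary scale and (negative) weight
H_u = gaussian_entropy(sigma)
H_wu = gaussian_entropy(abs(w) * sigma)  # w*u ~ N(0, (w*sigma)^2)

# Eq. 31: H(w u) = H(u) + ln|w|
assert math.isclose(H_wu, H_u + math.log(abs(w)))
# Eq. 32: e^{2H(w u)} = e^{2H(u)} * w^2
assert math.isclose(math.exp(2 * H_wu), math.exp(2 * H_u) * w ** 2)
print("Eqs. 31 and 32 verified for the Gaussian case")
```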
2.
In the second inequality, we utilized the concavity of ln.
Now we shall use Lemma 1 to proceed. The separation theorem will be a corollary of the following claim:
Proposition 3
Let y = [y^{1},…,y^{M}] = y(W) = Ws, where \({{\mathbf{W}}}\in {{\mathcal{O}}}^D\) and y^{m} is the estimate of the mth component of the ISA task. Let y^{m}_{i} be the ith coordinate of the mth component. Similarly, let s^{m}_{i} stand for the ith coordinate of the mth source. Let us assume that the sources s^{m} satisfy condition 11. Then
Proof
Let us denote the (i, j)th element of matrix W by W_{i,j}. Coordinates of y and s will be denoted by y_{i} and s_{i}, respectively. Let \({\mathcal{G}}^1, \ldots, {\mathcal{G}}^M\) denote the indices belonging to the 1,…,M subspaces, respectively, that is, \({\mathcal{G}}^1:=\{1,\ldots,d\},\ldots, {\mathcal{G}}^M:=\{D-d+1,\ldots,D\}\). Now, writing out the ith row of the matrix multiplication y = Ws, we have
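The grouped rewriting of the row sum can be checked numerically; the dimensions and the random orthogonal W below are arbitrary choices of ours:

```python
import numpy as np

rng = np.random.default_rng(1)
d, M = 2, 3                      # subspace dimension and number of subspaces
D = d * M
# 0-based index groups G^1, ..., G^M: {0,...,d-1}, ..., {D-d,...,D-1}
groups = [list(range(m * d, (m + 1) * d)) for m in range(M)]

W, _ = np.linalg.qr(rng.normal(size=(D, D)))   # random orthogonal W
s = rng.normal(size=D)

y = W @ s
# each coordinate rewritten as a sum over subspaces:
#   y_i = sum_m sum_{j in G^m} W_ij s_j
y_grouped = np.array([sum(W[i, g] @ s[g] for g in groups) for i in range(D)])
assert np.allclose(y, y_grouped)
print("grouped row sums reproduce y = W s")
```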
and thus,
The above steps can be justified as follows:
 1.

2.
Equation 36: new terms were added in order to apply Lemma 1.

3.
Equation 37: the sources s^{m} are independent of each other; this independence is preserved under mixing within the subspaces, and we could also apply Lemma 1, because W is an orthogonal matrix.

4.
Equation 38: numerators were transferred into the ∑_{j} terms.

5.
Equation 39: Variables s ^{m} satisfy condition 11 according to our assumptions.

6.
Equation 40: We simplified the expression after squaring.
Summing this inequality over i, exchanging the order of the sums, and making use of the orthogonality of matrix W, we have
Note 5
The proof holds even if the dimensions of the subspaces are not equal. The same is true for the ISA separation theorem.
Having this proposition, now we prove our main theorem (Theorem 1).
Proof
ICA minimizes the LHS of Eq. 33, that is, it minimizes \(\sum_{m=1}^M\sum_{i=1}^dH\left(y^m_i\right)\). The set of minima is invariant under permutations and sign changes, and, according to Proposition 3, the coordinates {s^{m}_{i}} of the components s^{m} of the ISA task belong to the set of minima.
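The permutation- and sign-invariance used in this proof is easy to verify empirically. The histogram plug-in estimator below is a crude illustrative stand-in for a proper differential-entropy estimator; the sample and the signed permutation are our choices:

```python
import numpy as np

def marginal_entropy_sum(x, bins=50):
    """Crude plug-in estimate of sum_i H(x_i) from per-coordinate histograms."""
    total = 0.0
    for row in x:
        counts, edges = np.histogram(row, bins=bins)
        p = counts / counts.sum()
        width = edges[1] - edges[0]
        nz = p[p > 0]
        # discrete entropy plus the bin-width correction for differential entropy
        total += -(nz * np.log(nz)).sum() + np.log(width)
    return total

rng = np.random.default_rng(2)
x = rng.laplace(size=(4, 10000))

# apply a signed permutation: reorder the coordinates and flip some signs
perm = [2, 0, 3, 1]
signs = np.array([1.0, -1.0, -1.0, 1.0])
x_sp = signs[:, None] * x[perm]

h1, h2 = marginal_entropy_sum(x), marginal_entropy_sum(x_sp)
assert np.isclose(h1, h2)
print("sum of marginal entropies is unchanged by signed permutations")
```

Reordering rows permutes the summands, and flipping a sign mirrors the histogram without changing its bin widths or counts, so the estimate is exactly invariant.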
Appendix 2: Sufficient conditions of the separation theorem
In the separation theorem, we assumed that relation 11 is fulfilled for the sources s^{m}. Below, we present sufficient conditions, together with proofs, under which this inequality is fulfilled.
wEPI
According to Lemma 1, if the wEPI property (i.e., Eq. 19) holds for sources s ^{m}, then inequality Eq. 11 holds, too.
Spherically symmetric sources
We shall make use of the following well-known property of spherically symmetric variables [23, 24]:
Lemma 2
Let v denote a d-dimensional variable which is spherically symmetric. Then the projections of v onto lines through the origin have identical univariate distributions.
Lemma 3
The expectation value and the variance of a d-dimensional spherically symmetric variable v are
Proof of Proposition 2
Here, we show that the wEPI property is fulfilled with equality for spherical sources. According to Eqs. 44 and 45, spherically symmetric sources s^{m} have zero expectation values and, up to a constant multiplier, they also have identity covariance matrices:
Note that our constraint on the ISA task, namely that the covariance matrices of the sources s^{m} should be equal to I_{d}, is fulfilled up to constant multipliers.
Let P_{w} denote the projection onto the straight line with direction w ∈ S^{d} that crosses the origin, i.e.,
In particular, if w is chosen as the canonical basis vector e _{ i } (all components are 0, except the ith component, which is equal to 1), then
In this interpretation, wEPI (Eq. 19) is concerned with the entropies of the projections of the different sources onto straight lines crossing the origin. The LHS projects onto w, whereas the RHS projects onto the canonical basis vectors. Let u denote an arbitrary source, i.e., u := s^{m}. According to Lemma 2, the distribution of the spherical u is the same for all such projections, and thus their entropies are identical. That is,
Thus:

The LHS of wEPI is equal to \({\rm e}^{2H(u_1)}\).

The RHS of wEPI can be written as follows:
$$ \sum_{i=1}^d {\rm e}^{2H(w_iu_i)}=\sum_{i=1}^d{\rm e}^{2H(u_i)}\cdot w_i^2={\rm e}^{2H(u_1)}\sum_{i=1}^dw_i^2={\rm e}^{2H(u_1)}\cdot 1={\rm e}^{2H(u_1)} $$(52)In the first step, we used identity Eq. 32 for each of the terms. In the second step, Eq. 51 was utilized. Then the term \({\rm e}^{2H(u_1)}\) was pulled out, and we took into account that w ∈ S^{d}.
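Lemma 2 can be illustrated by sampling. We use the standard Gaussian, the canonical spherically symmetric distribution; this is only a moment-based sampling check of ours, not a proof:

```python
import numpy as np

rng = np.random.default_rng(3)
d, n = 4, 200000
u = rng.normal(size=(n, d))      # standard Gaussian: spherically symmetric

def random_unit_vector(d, rng):
    v = rng.normal(size=d)
    return v / np.linalg.norm(v)

proj_e1 = u[:, 0]                # projection onto the canonical basis vector e_1
for _ in range(3):
    w = random_unit_vector(d, rng)
    proj_w = u @ w               # projection onto a random direction w in S^d
    # Lemma 2: both projections follow one univariate distribution;
    # compare the first two empirical moments as a sanity check
    assert abs(proj_w.mean() - proj_e1.mean()) < 0.05
    assert abs(proj_w.var() - proj_e1.var()) < 0.05
print("projections onto all directions agree in their low-order moments")
```

Identical projection distributions imply identical projection entropies, which is exactly the equality chain used above.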
Note 6
We note that sources with spherically symmetric distributions have already been used in the context of ISA in [33]. In that work, a generative model was assumed in which the distributions of the norms of the sample projections onto the subspaces were independent. This restricts the task to spherically symmetric source distributions, which is a special case of the general ISA task.
Sources Invariant to 90° Rotation
By definition, spherical variables are invariant to orthogonal transformations (see Eq. 20). For mixtures of two-dimensional components (d = 2), a much milder condition, invariance to 90° rotation, suffices. First, we observe that:
Note 7
In the ISA separation theorem, it is enough if some orthogonal transformation C^{m}s^{m} \(({{\mathbf{C}}}^m\in{{\mathcal{O}}}^d)\) of the sources s^{m} satisfies condition 11. In this case, the variables C^{m}s^{m} are extracted by the permutation search after the ICA transformation. This is suitable, because the ISA identification has ambiguities up to orthogonal transformations within the respective subspaces. In other words, for the ISA identification it is sufficient that, for each component \({{\mathbf{u}}}:={{\mathbf{s}}}^m\in{\mathbb{R}}^d\), there exists an orthonormal basis (ONB) on which the function \(h({{\mathbf{w}}})=H[\left\langle{{\mathbf{w}}},{{\mathbf{u}}}\right\rangle]\) takes its minimum. (Here, the stochastic variable \(\left<{{\mathbf{w}}}, {{\mathbf{u}}}\right>:=\sum_{i=1}^dw_iu_i\) is the projection of u onto the direction w.) In this case, the entropy inequality Eq. 11 is met with equality on the elements of the ONB.
Now we present our theorem concerning the d = 2 case.
Theorem 2
Suppose that the density function f of the stochastic variable \({{\mathbf{u}}}=(u_1,u_2)(={{\mathbf{s}}}^m)\in{\mathbb{R}}^2\) exhibits the invariance
that is, it is invariant to 90° rotation. If the function \(h({{\mathbf{w}}})=H[\left\langle{{\mathbf{w}}},{{\mathbf{u}}}\right\rangle]\) has a minimum on the set \(\{{{\mathbf{w}}}\ge{{\mathbf{0}}}\}\cap S^2\), it also has a minimum on an ONB. (Relation w ≥ 0 concerns each coordinate.) Consequently, the ISA task can be identified by means of the separation theorem.
Proof
Let
denote the matrix of 90° counter-clockwise rotation. Let w ∈ S^{2}. \(\left\langle{{\mathbf{w}}},{{\mathbf{u}}}\right\rangle\in{\mathbb{R}}\) is the projection of the variable u onto w. The value of the density function of the stochastic variable 〈w, u〉 at \(t\in{\mathbb{R}}\) (we move t in the direction w) can be calculated by integration starting from the point wt, in the direction perpendicular to w:
Using the supposed invariance of f and relation 56, we have
where ‘=’ denotes the equality of functions. Consequently, it is enough to optimize h on the set {w ≥ 0}. Let w_{min} be the minimizer of the function h on the set \(S^{2}\cap\{{{\mathbf{w}}}\ge {{\mathbf{0}}}\}\). According to Eq. 57, h takes constant and minimal values at the
points. {v_{min}, Rv_{min}} is a suitable ONB for Note 7.
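The role of the 90° invariance can be made concrete with a small experiment of ours: if any 2-D sample is symmetrized by the four rotations R^k, then its projections onto w and onto Rw coincide as multisets (since ⟨Rw, R^k x⟩ = ⟨w, R^{k−1}x⟩), so every entropy estimate computed from them satisfies h(w) = h(Rw) exactly:

```python
import numpy as np

R = np.array([[0.0, -1.0], [1.0, 0.0]])   # 90-degree counter-clockwise rotation

rng = np.random.default_rng(4)
x = rng.normal(size=(2, 1000)) * np.array([[2.0], [0.5]])  # any 2-D sample
# symmetrize: the union of the four rotated copies is exactly 90-degree invariant
xs = np.hstack([np.linalg.matrix_power(R, k) @ x for k in range(4)])

w = np.array([0.8, 0.6])                  # any direction on S^2
p_w = np.sort(w @ xs)                     # projections onto w
p_Rw = np.sort((R @ w) @ xs)              # projections onto R w
# <R w, R^k x> = <w, R^{k-1} x>, so the two multisets of projections coincide
assert np.allclose(p_w, p_Rw)
print("h(w) and h(R w) are computed from identical samples, hence equal")
```

This is why it suffices to search for the minimizer of h on one quadrant: the remaining directions give nothing new.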
Szabó, Z., Póczos, B. & Lőrincz, A. Autoregressive independent process analysis without combinatorial efforts. Pattern Anal Applic 13, 1–13 (2010). https://doi.org/10.1007/s10044-009-0174-x
Keywords
 Independent component analysis
 Independent process analysis
 Autoregressive processes