Relative gradient based algorithms for general joint diagonalization of complex matrices

Trainini, Tual; Moreau, Eric

doi:10.1007/s11045-015-0328-5

Published: 22 April 2015

Volume 27, pages 275–293, (2016)

Multidimensional Systems and Signal Processing

Tual Trainini^1,2 &
Eric Moreau^1,2

Abstract

This article deals with the problem of joint diagonalization of hermitian and/or complex symmetric matrices. Within the framework of gradient algorithms, we develop various algorithms which are based on different levels of approximation of the classical diagonalization criterion. The algorithms are based on a multiplicative update and on the derivation of an optimal step-size. One of the algorithms is a generalization of DOMUNG to the complex case. Finally, in the blind source separation context, computer simulations illustrate the relative performances of some proposed algorithms in comparison to the true gradient one.

Author information

Authors and Affiliations

CNRS, ENSAM, LSIS, UMR 7296, Aix Marseille Université, 13397, Marseille, France
Tual Trainini & Eric Moreau
CNRS, LSIS, UMR 7296, Université de Toulon, 83957, La Garde, France
Tual Trainini & Eric Moreau

Corresponding author

Correspondence to Tual Trainini.

Appendices

Appendix 1: Gradient computation

The calculation of the gradient of (13) in the hermitian case, formulated in (15), is detailed below. First, using the definition of the Frobenius norm

$$\begin{aligned} \Vert {\mathbf {M}}\Vert = \left( \mathsf {trace}\left\{ {\mathbf {M}}^H{\mathbf {M}}\right\} \right) ^{\frac{1}{2}} = \left( \sum _{i,j} |M_{i,j}|^2 \right) ^{\frac{1}{2}} \end{aligned}$$

(45)

we can rewrite the criterion

$$\begin{aligned} {\mathcal {J}}_{h} ( {\mathbf {Z}}) = \sum _{i=1}^{N} \mathsf {trace}\left\{ \left( {\mathbf {U}}_{i}^{H} \right) ^H\mathsf {ODiag} \left\{ {\mathbf {U}}_{i}^{H} \right\} \right\} \end{aligned}$$

(46)

One can develop the following matrix product

$$\begin{aligned} {\mathbf {U}}_{i}^{H} = {\mathbf {T}}_i + {\mathbf {Z}}{\mathbf {T}}_i + {\mathbf {T}}_i{\mathbf {Z}}^H + {\mathbf {Z}}{\mathbf {T}}_i{\mathbf {Z}}^H \end{aligned}$$

(47)

We now compute the partial derivatives of all terms, and we retain only those that involve ${\mathbf {Z}}^*$ or ${\mathbf {Z}}^H$, since the other partial derivatives vanish

$$\begin{aligned} \begin{array}{ll} \partial \mathsf {trace}\left\{ \left( {\mathbf {T}}_i + {\mathbf {T}}_i{\mathbf {Z}}^H \right) ^H \mathsf {ODiag}\left\{ {\mathbf {T}}_i{\mathbf {Z}}^H \right\} \right\} &{} \partial \mathsf {trace}\left\{ \left( {\mathbf {T}}_i + {\mathbf {T}}_i{\mathbf {Z}}^H \right) ^H \mathsf {ODiag}\left\{ {\mathbf {Z}}{\mathbf {T}}_i{\mathbf {Z}}^H \right\} \right\} \\ \partial \mathsf {trace}\left\{ \left( {\mathbf {Z}}{\mathbf {T}}_i \right) ^H \mathsf {ODiag}\left\{ {\mathbf {T}}_i + {\mathbf {Z}}{\mathbf {T}}_i \right\} \right\} &{} \partial \mathsf {trace}\left\{ \left( {\mathbf {Z}}{\mathbf {T}}_i{\mathbf {Z}}^H \right) ^H \mathsf {ODiag}\left\{ {\mathbf {T}}_i + {\mathbf {Z}}{\mathbf {T}}_i \right\} \right\} \\ \partial \mathsf {trace}\left\{ \left( {\mathbf {Z}}{\mathbf {T}}_i \right) ^H \mathsf {ODiag}\left\{ {\mathbf {T}}_i{\mathbf {Z}}^H \right\} \right\} &{} \partial \mathsf {trace}\left\{ \left( {\mathbf {Z}}{\mathbf {T}}_i \right) ^H \mathsf {ODiag}\left\{ {\mathbf {Z}}{\mathbf {T}}_i{\mathbf {Z}}^H \right\} \right\} \\ \partial \mathsf {trace}\left\{ \left( {\mathbf {Z}}{\mathbf {T}}_i{\mathbf {Z}}^H \right) ^H \mathsf {ODiag}\left\{ {\mathbf {T}}_i{\mathbf {Z}}^H \right\} \right\} &{} \partial \mathsf {trace}\left\{ \left( {\mathbf {Z}}{\mathbf {T}}_i{\mathbf {Z}}^H \right) ^H \mathsf {ODiag}\left\{ {\mathbf {Z}}{\mathbf {T}}_i{\mathbf {Z}}^H \right\} \right\} \end{array} \end{aligned}$$

Applying the following property

$$\begin{aligned} \mathsf {trace}\left\{ {\mathbf {M}}\mathsf {ODiag}\left\{ {\mathbf {Q}}\right\} \right\} = \mathsf {trace}\left\{ \mathsf {ODiag}\left\{ {\mathbf {M}}\right\} {\mathbf {Q}}\right\} \end{aligned}$$

(48)

to the first term above, and also applying

$$\begin{aligned} \mathsf {tr}\left\{ {\mathbf {M}}{\mathbf {N}}{\mathbf {Q}}\right\} = \mathsf {tr}\left\{ {\mathbf {Q}}{\mathbf {M}}{\mathbf {N}}\right\} = \mathsf {tr}\left\{ {\mathbf {N}}{\mathbf {Q}}{\mathbf {M}}\right\} \end{aligned}$$

(49)

(where ${\mathbf {M}}$, ${\mathbf {N}}$ and ${\mathbf {Q}}$ are square matrices) we get

$$\begin{aligned} \mathsf {trace}\left\{ \left( {\mathbf {T}}_i + {\mathbf {T}}_i{\mathbf {Z}}^H \right) ^H \partial \mathsf {ODiag}\left\{ {\mathbf {T}}_i{\mathbf {Z}}^H\right\} \right\}&= \mathsf {trace}\left\{ \mathsf {ODiag}\left\{ {\mathbf {T}}_i + {\mathbf {T}}_i{\mathbf {Z}}^H \right\} ^H {\mathbf {T}}_i \partial {\mathbf {Z}}^H \right\} \nonumber \\&= \mathsf {trace}\left\{ \partial {\mathbf {Z}}^H \mathsf {ODiag}\left\{ {\mathbf {T}}_i + {\mathbf {T}}_i{\mathbf {Z}}^H \right\} ^H {\mathbf {T}}_i \right\} \end{aligned}$$

(50)
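The identities (45), (48) and (49) used in this step can be checked numerically. The snippet below is a minimal sketch, assuming that $\mathsf {ODiag}\{\cdot \}$ simply sets the diagonal of its argument to zero; the helper name `odiag` and the random test matrices are illustrative choices, not taken from the article.

```python
import numpy as np

def odiag(M):
    """ODiag{.}: keep the off-diagonal part of M (assumed definition)."""
    return M - np.diag(np.diag(M))

rng = np.random.default_rng(0)
n = 4
# Three arbitrary complex square matrices used only as test data.
M, N, Q = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
           for _ in range(3))

# (45): Frobenius norm written as a trace.
fro_from_trace = np.sqrt(np.trace(M.conj().T @ M).real)

# (48): ODiag can be moved from one factor to the other inside a trace.
lhs_48 = np.trace(M @ odiag(Q))
rhs_48 = np.trace(odiag(M) @ Q)

# (49): the trace is invariant under cyclic permutations.
t_mnq = np.trace(M @ N @ Q)
t_qmn = np.trace(Q @ M @ N)
t_nqm = np.trace(N @ Q @ M)

print(abs(lhs_48 - rhs_48), abs(t_mnq - t_qmn), abs(t_mnq - t_nqm))
```

Identity (48) holds because both sides equal $\sum _{i\ne j} M_{i,j}Q_{j,i}$, which is why the printed differences are at rounding level.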

Finally, the definition

$$\begin{aligned} \frac{\partial \mathsf {trace}\left\{ {\mathbf {Z}}^H {\mathbf {M}}\right\} }{\partial {\mathbf {Z}}^*} = {\mathbf {M}} \end{aligned}$$

(51)

applied here leads to

$$\begin{aligned} \frac{\partial \mathsf {trace}\left\{ {\mathbf {Z}}^H \mathsf {ODiag}\left\{ {\mathbf {T}}_i + {\mathbf {T}}_i{\mathbf {Z}}^H \right\} ^H {\mathbf {T}}_i \right\} }{\partial {\mathbf {Z}}^*} = \mathsf {ODiag}\left\{ {\mathbf {T}}_i + {\mathbf {T}}_i{\mathbf {Z}}^H \right\} ^H {\mathbf {T}}_i \end{aligned}$$

(52)
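Definition (51) can be illustrated with a finite-difference estimate of the derivative with respect to ${\mathbf {Z}}^*$. This is a sketch only: it assumes the usual Wirtinger convention $\partial f/\partial z^* = \tfrac{1}{2}(\partial f/\partial x + \mathrm {i}\,\partial f/\partial y)$ for $z = x + \mathrm {i}y$, and the helper `wirtinger_conj_grad` is a hypothetical name introduced here.

```python
import numpy as np

def wirtinger_conj_grad(f, Z, h=1e-6):
    """Central-difference estimate of df/dZ*, entry by entry.

    Uses df/dz* = 0.5 * (df/dx + 1j * df/dy) with z = x + 1j*y.
    """
    G = np.zeros(Z.shape, dtype=complex)
    for idx in np.ndindex(*Z.shape):
        E = np.zeros(Z.shape, dtype=complex)
        E[idx] = 1.0
        dfdx = (f(Z + h * E) - f(Z - h * E)) / (2 * h)
        dfdy = (f(Z + 1j * h * E) - f(Z - 1j * h * E)) / (2 * h)
        G[idx] = 0.5 * (dfdx + 1j * dfdy)
    return G

rng = np.random.default_rng(0)
n = 3
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
Z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

# f(Z) = trace{Z^H M}; (51) states that its derivative w.r.t. Z* is M.
def f(Z):
    return np.trace(Z.conj().T @ M)

G = wirtinger_conj_grad(f, Z)
max_err = np.max(np.abs(G - M))
print(max_err)
```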

Applying the same steps to all the remaining terms gives

$$\begin{aligned} \frac{\partial \mathsf {trace}\left\{ \left( {\mathbf {T}}_i + {\mathbf {T}}_i{\mathbf {Z}}^H \right) ^H \mathsf {ODiag}\left\{ {\mathbf {Z}}{\mathbf {T}}_i{\mathbf {Z}}^H\right\} \right\} }{\partial {\mathbf {Z}}^*}&= \mathsf {ODiag}\left\{ {\mathbf {T}}_i + {\mathbf {T}}_i{\mathbf {Z}}^H \right\} ^H{\mathbf {Z}}{\mathbf {T}}_i \nonumber \\ \frac{\partial \mathsf {trace}\left\{ \left( {\mathbf {Z}}{\mathbf {T}}_i{\mathbf {Z}}^H\right) ^H\mathsf {ODiag}\left\{ {\mathbf {T}}_i + {\mathbf {Z}}{\mathbf {T}}_i \right\} \right\} }{\partial {\mathbf {Z}}^*}&= \mathsf {ODiag}\left\{ {\mathbf {T}}_i + {\mathbf {Z}}{\mathbf {T}}_i \right\} {\mathbf {Z}}{\mathbf {T}}_i^H \nonumber \\ \frac{\partial \mathsf {trace}\left\{ \left( {\mathbf {Z}}{\mathbf {T}}_i\right) ^H\mathsf {ODiag}\left\{ {\mathbf {T}}_i{\mathbf {Z}}^H \right\} \right\} }{\partial {\mathbf {Z}}^*}&= \mathsf {ODiag}\left\{ {\mathbf {T}}_i{\mathbf {Z}}^H \right\} {\mathbf {T}}_i^H + \mathsf {ODiag}\left\{ {\mathbf {Z}}{\mathbf {T}}_i \right\} ^H{\mathbf {T}}_i \nonumber \\ \frac{\partial \mathsf {trace}\left\{ \left( {\mathbf {Z}}{\mathbf {T}}_i\right) ^H\mathsf {ODiag}\left\{ {\mathbf {Z}}{\mathbf {T}}_i{\mathbf {Z}}^H \right\} \right\} }{\partial {\mathbf {Z}}^*}&= \mathsf {ODiag}\left\{ {\mathbf {Z}}{\mathbf {T}}_i{\mathbf {Z}}^H \right\} {\mathbf {T}}_i^H + \mathsf {ODiag}\left\{ {\mathbf {Z}}{\mathbf {T}}_i \right\} ^H{\mathbf {Z}}{\mathbf {T}}_i \nonumber \\ \frac{\partial \mathsf {trace}\left\{ \left( {\mathbf {Z}}{\mathbf {T}}_i{\mathbf {Z}}^H\right) ^H \mathsf {ODiag}\left\{ {\mathbf {T}}_i{\mathbf {Z}}^H\right\} \right\} }{\partial {\mathbf {Z}}^*}&= \mathsf {ODiag}\left\{ {\mathbf {T}}_i{\mathbf {Z}}^H \right\} {\mathbf {Z}}{\mathbf {T}}_i^H \nonumber \\&\quad + \mathsf {ODiag}\left\{ {\mathbf {Z}}{\mathbf {T}}_i{\mathbf {Z}}^H \right\} ^H {\mathbf {T}}_i \nonumber \\ \frac{\partial \mathsf {trace}\left\{ \left( {\mathbf {Z}}{\mathbf {T}}_i{\mathbf {Z}}^H\right) ^H\mathsf {ODiag}\left\{ {\mathbf {Z}}{\mathbf {T}}_i{\mathbf {Z}}^H \right\} \right\} }{\partial {\mathbf {Z}}^*}&= \mathsf {ODiag}\left\{ {\mathbf {Z}}{\mathbf {T}}_i{\mathbf {Z}}^H \right\} {\mathbf {Z}}{\mathbf {T}}_i^H \nonumber \\&\quad + \mathsf {ODiag}\left\{ {\mathbf {Z}}{\mathbf {T}}_i{\mathbf {Z}}^H \right\} ^H{\mathbf {Z}}{\mathbf {T}}_i\nonumber \\ \frac{\partial \mathsf {trace}\left\{ \left( {\mathbf {Z}}{\mathbf {T}}_i\right) ^H \mathsf {ODiag}\left\{ {\mathbf {T}}_i + {\mathbf {Z}}{\mathbf {T}}_i \right\} \right\} }{\partial {\mathbf {Z}}^*}&= \mathsf {ODiag}\left\{ {\mathbf {T}}_i + {\mathbf {Z}}{\mathbf {T}}_i \right\} {\mathbf {T}}_i^H \end{aligned}$$

(53)

and we then get all the terms composing the gradient of the criterion.
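Individual rows of (53) can be spot-checked against a numerical derivative. The sketch below does this for the first and third rows, assuming the Wirtinger convention $\partial f/\partial z^* = \tfrac{1}{2}(\partial f/\partial x + \mathrm {i}\,\partial f/\partial y)$ and an $\mathsf {ODiag}$ operator that zeroes the diagonal. The helper names and the random matrices standing in for ${\mathbf {T}}_i$ and ${\mathbf {Z}}$ are illustrative assumptions.

```python
import numpy as np

def odiag(M):
    """ODiag{.}: keep the off-diagonal part of M (assumed definition)."""
    return M - np.diag(np.diag(M))

def herm(M):
    """Conjugate transpose, written ^H in the text."""
    return M.conj().T

def wirtinger_conj_grad(f, Z, h=1e-6):
    """Central-difference estimate of df/dZ* (hypothetical helper)."""
    G = np.zeros(Z.shape, dtype=complex)
    for idx in np.ndindex(*Z.shape):
        E = np.zeros(Z.shape, dtype=complex)
        E[idx] = 1.0
        dfdx = (f(Z + h * E) - f(Z - h * E)) / (2 * h)
        dfdy = (f(Z + 1j * h * E) - f(Z - 1j * h * E)) / (2 * h)
        G[idx] = 0.5 * (dfdx + 1j * dfdy)
    return G

rng = np.random.default_rng(1)
n = 3
T = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
Z = 0.5 * (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))

# Row 1 of (53): trace{(T + T Z^H)^H ODiag{Z T Z^H}}.
def f_row1(Z):
    return np.trace(herm(T + T @ herm(Z)) @ odiag(Z @ T @ herm(Z)))

closed_row1 = herm(odiag(T + T @ herm(Z))) @ Z @ T

# Row 3 of (53): trace{(Z T)^H ODiag{T Z^H}}.
def f_row3(Z):
    return np.trace(herm(Z @ T) @ odiag(T @ herm(Z)))

closed_row3 = odiag(T @ herm(Z)) @ herm(T) + herm(odiag(Z @ T)) @ T

err_row1 = np.max(np.abs(wirtinger_conj_grad(f_row1, Z) - closed_row1))
err_row3 = np.max(np.abs(wirtinger_conj_grad(f_row3, Z) - closed_row3))
print(err_row1, err_row3)
```

The same pattern applies to the other rows by swapping in the corresponding trace expression and closed form.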

Appendix 2: Computation of the optimal stepsize for the initial criterion

We first recall the criterion (13):

$$\begin{aligned} {\mathcal {J}} ( {\mathbf {Z}}) = \sum _{i=1}^{N} \Vert \mathsf {ODiag}\{ \left( {\mathbf {I}}+ {\mathbf {Z}}\right) {\mathbf {T}}_{i} \left( {\mathbf {I}}+ {\mathbf {Z}}\right) ^{\ddagger } \} \Vert ^2 \end{aligned}$$

(54)

and we use the update of ${\mathbf {Z}}$ in (14)

$$\begin{aligned} {\mathbf {Z}}= - \mu \frac{\partial {\mathcal {J}}( {\mathbf {Z}}) }{\partial {\mathbf {Z}}^*} = \mu {\mathbf {F}} \end{aligned}$$

(55)

Substituting (55) into (54), we get

$$\begin{aligned} {\mathcal {J}} ( {\mathbf {Z}}) = \sum _{i=1}^{N}\Vert \mathsf {ODiag}\{ \left( {\mathbf {I}}+ \mu {\mathbf {F}}\right) {\mathbf {T}}_{i} \left( {\mathbf {I}}+ \mu {\mathbf {F}}\right) ^{\ddagger } \} \Vert ^2 \end{aligned}$$

(56)

Applying (45) and (48) to the equation above leads to

$$\begin{aligned} {\mathcal {J}} ( {\mathbf {Z}})&= \sum _{i=1}^{N} \mathsf {tr}\left\{ \left( \left( {\mathbf {I}}+ \mu {\mathbf {F}}\right) {\mathbf {T}}_{i} \left( {\mathbf {I}}+ \mu {\mathbf {F}}\right) ^{\ddagger }\right) ^H \right. \nonumber \\&\quad \, \times \left. \mathsf {ODiag}\{\left( {\mathbf {I}}+ \mu {\mathbf {F}}\right) {\mathbf {T}}_{i} \left( {\mathbf {I}}+ \mu {\mathbf {F}}\right) ^{\ddagger } \} \right\} \end{aligned}$$

(57)

Expanding $\left( {\mathbf {I}}+ \mu {\mathbf {F}}\right) {\mathbf {T}}_{i} \left( {\mathbf {I}}+ \mu {\mathbf {F}}\right) ^{\ddagger }$ gives a second-order polynomial in $\mu $:

$$\begin{aligned} \left( {\mathbf {I}}+ \mu {\mathbf {F}}\right) {\mathbf {T}}_{i} \left( {\mathbf {I}}+ \mu {\mathbf {F}}\right) ^{\ddagger } = {\mathbf {T}}_{i} + \mu \left( {\mathbf {F}}{\mathbf {T}}_{i} + {\mathbf {T}}_{i}{\mathbf {F}}^{\ddagger }\right) + \mu ^2{\mathbf {F}}{\mathbf {T}}_{i}{\mathbf {F}}^{\ddagger } \end{aligned}$$

(58)

Finally, expanding the matrix product inside the trace leads to the fourth-order polynomial in $\mu $ of (36), whose coefficients are given below

$$\begin{aligned} J_0&= {\mathbf {T}}_i^H \mathsf {ODiag}\{{\mathbf {T}}_i\} \nonumber \\ J_1&= {\mathbf {T}}_i^H \mathsf {ODiag}\{{\mathbf {F}}{\mathbf {T}}_i + {\mathbf {T}}_i{\mathbf {F}}^\ddagger \} + \left( {\mathbf {F}}{\mathbf {T}}_i + {\mathbf {T}}_i{\mathbf {F}}^\ddagger \right) ^H\mathsf {ODiag}\{{\mathbf {T}}_i\} \nonumber \\ J_2&= {\mathbf {T}}_i^H \mathsf {ODiag}\{{\mathbf {F}}{\mathbf {T}}_i{\mathbf {F}}^\ddagger \} + \left( {\mathbf {F}}{\mathbf {T}}_i{\mathbf {F}}^\ddagger \right) ^H\mathsf {ODiag}\{{\mathbf {T}}_i\} + \left( {\mathbf {F}}{\mathbf {T}}_i + {\mathbf {T}}_i{\mathbf {F}}^\ddagger \right) ^H\mathsf {ODiag}\{{\mathbf {F}}{\mathbf {T}}_i + {\mathbf {T}}_i{\mathbf {F}}^\ddagger \}\nonumber \\ J_3&= \left( {\mathbf {F}}{\mathbf {T}}_i{\mathbf {F}}^\ddagger \right) ^H\mathsf {ODiag}\{{\mathbf {F}}{\mathbf {T}}_i + {\mathbf {T}}_i{\mathbf {F}}^\ddagger \} + \left( {\mathbf {F}}{\mathbf {T}}_i + {\mathbf {T}}_i{\mathbf {F}}^\ddagger \right) ^H\mathsf {ODiag}\{{\mathbf {F}}{\mathbf {T}}_i{\mathbf {F}}^\ddagger \}\nonumber \\ J_4&= \left( {\mathbf {F}}{\mathbf {T}}_i{\mathbf {F}}^\ddagger \right) ^H\mathsf {ODiag}\{{\mathbf {F}}{\mathbf {T}}_i{\mathbf {F}}^\ddagger \} \end{aligned}$$

(59)
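The coefficients in (59) can be turned into a step-size computation as follows. This is a minimal sketch under several assumptions: it treats the hermitian case (so $\ddagger $ is the conjugate transpose), it takes the trace of each $J_k$ and sums over $i$ to obtain scalar coefficients, and it picks the step as the real root of the cubic derivative with the smallest criterion value. The matrix `F` is a random stand-in for the search direction, and all function names are hypothetical.

```python
import numpy as np

def odiag(M):
    """ODiag{.}: keep the off-diagonal part of M (assumed definition)."""
    return M - np.diag(np.diag(M))

def criterion(Z, Ts):
    """Criterion (54) in the hermitian case: sum_i ||ODiag{(I+Z) T_i (I+Z)^H}||^2."""
    B = np.eye(Z.shape[0]) + Z
    return sum(np.linalg.norm(odiag(B @ T @ B.conj().T), 'fro') ** 2 for T in Ts)

def quartic_coeffs(F, Ts):
    """Real coefficients c[0..4] such that J(mu * F) = sum_k c[k] * mu**k."""
    c = np.zeros(5)
    for T in Ts:
        # Matrix coefficients of mu^0, mu^1, mu^2 in the expansion (58).
        A = [T, F @ T + T @ F.conj().T, F @ T @ F.conj().T]
        for p in range(3):
            for q in range(3):
                # trace{A_p^H ODiag{A_q}} contributes to the power mu^(p+q).
                c[p + q] += np.trace(A[p].conj().T @ odiag(A[q])).real
    return c

def optimal_step(c):
    """Real stationary point of the quartic with the smallest value."""
    roots = np.roots([4 * c[4], 3 * c[3], 2 * c[2], c[1]])
    real_roots = roots[np.abs(roots.imag) < 1e-9].real
    values = np.polyval(c[::-1], real_roots)
    return real_roots[np.argmin(values)]

rng = np.random.default_rng(2)
n, N = 3, 4
Ts = []
for _ in range(N):
    X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    Ts.append(X + X.conj().T)          # Hermitian target matrices (test data)
F = 0.3 * (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))

c = quartic_coeffs(F, Ts)
mu_opt = optimal_step(c)
print(mu_opt, criterion(mu_opt * F, Ts), criterion(np.zeros((n, n)), Ts))
```

Taking the real part of each trace is harmless here because the terms with indices $(p,q)$ and $(q,p)$ are complex conjugates of each other, so their sum is real.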

Appendix 3: Computation of the gradient for approximated criteria

We detail here the computation of the gradient of the criterion (28) in the hermitian case, whose result is given in (32). Applying (45) to this criterion (with $\mathbf {E}_i=\mathbf {T}_i^{(1)}$ and $\mathbf {F}_i=\mathbf {T}_i^{(2)}$) gives

$$\begin{aligned} \mathcal {J}_{a,h} ( {\mathbf {Z}}) = \sum _{i=1}^{N} \mathsf {trace}\left\{ \left( {\mathbf {T}}_{i}^{(1)} + {\mathbf {Z}}{\mathbf {T}}_{i}^{(2)} + {\mathbf {T}}_{i}^{(2)}{\mathbf {Z}}^{H} \right) ^H \mathsf {ODiag}\left\{ {\mathbf {T}}_{i}^{(1)} + {\mathbf {Z}}{\mathbf {T}}_{i}^{(2)} + {\mathbf {T}}_{i}^{(2)}{\mathbf {Z}}^{H} \right\} \right\} \end{aligned}$$

(60)

Keeping all the partial derivatives that involve ${\mathbf {Z}}^*$ or ${\mathbf {Z}}^H$:

$$\begin{aligned} \partial&\mathsf {trace}\left\{ \left( {\mathbf {T}}_{i}^{(1)} + {\mathbf {T}}_{i}^{(2)}{\mathbf {Z}}^{H} \right) ^H \mathsf {ODiag}\left\{ {\mathbf {T}}_{i}^{(2)}{\mathbf {Z}}^{H} \right\} \right\} \qquad \partial \mathsf {trace}\left\{ \left( {\mathbf {Z}}{\mathbf {T}}_{i}^{(2)} \right) ^H \mathsf {ODiag}\left\{ {\mathbf {T}}_{i}^{(1)} + {\mathbf {Z}}{\mathbf {T}}_{i}^{(2)} \right\} \right\} \\ \partial&\mathsf {trace}\left\{ \left( {\mathbf {Z}}{\mathbf {T}}_{i}^{(2)} \right) ^H \mathsf {ODiag}\left\{ {\mathbf {T}}_{i}^{(2)}{\mathbf {Z}}^{H} \right\} \right\} \end{aligned}$$

Applying (48), (49) and (51) to the terms above

$$\begin{aligned} \frac{\partial \mathsf {trace}\left\{ \left( {\mathbf {Z}}{\mathbf {T}}_{i}^{(2)} \right) ^H \mathsf {ODiag}\left\{ {\mathbf {T}}_{i}^{(2)}{\mathbf {Z}}^{H} \right\} \right\} }{\partial {\mathbf {Z}}^*}&= \mathsf {ODiag}\left\{ {\mathbf {T}}_{i}^{(2)}{\mathbf {Z}}^{H}\right\} \left( {\mathbf {T}}_{i}^{(2)}\right) ^H \nonumber \\&\quad + \mathsf {ODiag}\left\{ {\mathbf {Z}}{\mathbf {T}}_{i}^{(2)}\right\} ^H{\mathbf {T}}_{i}^{(2)} \nonumber \\ \frac{\partial \mathsf {trace}\left\{ \left( {\mathbf {Z}}{\mathbf {T}}_{i}^{(2)} \right) ^H \mathsf {ODiag}\left\{ {\mathbf {T}}_{i}^{(1)} + {\mathbf {Z}}{\mathbf {T}}_{i}^{(2)} \right\} \right\} }{\partial {\mathbf {Z}}^*}&= \mathsf {ODiag}\left\{ {\mathbf {T}}_{i}^{(1)} + {\mathbf {Z}}{\mathbf {T}}_{i}^{(2)} \right\} \left( {\mathbf {T}}_{i}^{(2)}\right) ^H \nonumber \\ \frac{\partial \mathsf {trace}\left\{ \left( {\mathbf {T}}_{i}^{(1)} + {\mathbf {T}}_{i}^{(2)}{\mathbf {Z}}^{H} \right) ^H \mathsf {ODiag}\left\{ {\mathbf {T}}_{i}^{(2)}{\mathbf {Z}}^{H} \right\} \right\} }{\partial {\mathbf {Z}}^*}&= \mathsf {ODiag}\left\{ {\mathbf {T}}_{i}^{(1)} + {\mathbf {T}}_{i}^{(2)}{\mathbf {Z}}^{H}\right\} ^H {\mathbf {T}}_{i}^{(2)} \end{aligned}$$

(61)

leads to the expression given in (32). The gradient of (33) is obtained by discarding the terms of (61) that include ${\mathbf {Z}}$.
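Summing the three rows of (61) gives, for each $i$, $\mathsf {ODiag}\{{\mathbf {A}}_i\}({\mathbf {T}}_i^{(2)})^H + \mathsf {ODiag}\{{\mathbf {A}}_i\}^H{\mathbf {T}}_i^{(2)}$ with ${\mathbf {A}}_i = {\mathbf {T}}_i^{(1)} + {\mathbf {Z}}{\mathbf {T}}_i^{(2)} + {\mathbf {T}}_i^{(2)}{\mathbf {Z}}^H$. The sketch below compares this sum with a finite-difference derivative of the criterion (60). It assumes that this sum is what (32) collects (that equation is not reproduced here), and the helper names and random test matrices are illustrative.

```python
import numpy as np

def odiag(M):
    """ODiag{.}: keep the off-diagonal part of M (assumed definition)."""
    return M - np.diag(np.diag(M))

def approx_criterion(Z, T1s, T2s):
    """Criterion (60): sum_i ||ODiag{T1_i + Z T2_i + T2_i Z^H}||^2."""
    return sum(np.linalg.norm(odiag(T1 + Z @ T2 + T2 @ Z.conj().T), 'fro') ** 2
               for T1, T2 in zip(T1s, T2s))

def approx_gradient(Z, T1s, T2s):
    """Sum over i of the three rows of (61)."""
    G = np.zeros(Z.shape, dtype=complex)
    for T1, T2 in zip(T1s, T2s):
        A = odiag(T1 + Z @ T2 + T2 @ Z.conj().T)
        G += A @ T2.conj().T + A.conj().T @ T2
    return G

def wirtinger_conj_grad(f, Z, h=1e-6):
    """Central-difference estimate of df/dZ* (hypothetical helper)."""
    G = np.zeros(Z.shape, dtype=complex)
    for idx in np.ndindex(*Z.shape):
        E = np.zeros(Z.shape, dtype=complex)
        E[idx] = 1.0
        dfdx = (f(Z + h * E) - f(Z - h * E)) / (2 * h)
        dfdy = (f(Z + 1j * h * E) - f(Z - 1j * h * E)) / (2 * h)
        G[idx] = 0.5 * (dfdx + 1j * dfdy)
    return G

def random_hermitian(rng, n):
    X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return X + X.conj().T

rng = np.random.default_rng(3)
n, N = 3, 4
T1s = [random_hermitian(rng, n) for _ in range(N)]
T2s = [random_hermitian(rng, n) for _ in range(N)]
Z = 0.2 * (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))

G_closed = approx_gradient(Z, T1s, T2s)
G_numeric = wirtinger_conj_grad(lambda W: approx_criterion(W, T1s, T2s), Z)
grad_err = np.max(np.abs(G_closed - G_numeric))
print(grad_err)
```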

Appendix 4: Computation of the optimal stepsize for the approximated criterion

Applying the approximations to the criterion (13) first yields the criterion (28).

$$\begin{aligned} {\mathcal {J}}_{a} ( {\mathbf {Z}}) = \sum _{i=1}^{N} \Vert \mathsf {ODiag}\{ {\mathbf {T}}_{i}^{(1)} + {\mathbf {Z}}{\mathbf {T}}_{i}^{(2)} + {\mathbf {T}}_{i}^{(2)}{\mathbf {Z}}^{\ddagger } \} \Vert ^2 \end{aligned}$$

(62)

We follow the same procedure as for the exact criterion in Appendix 2. Replacing ${\mathbf {Z}}$ by (55) in (62) leads to

$$\begin{aligned} {\mathcal {J}}_{a} ( {\mathbf {Z}}) = \sum _{i=1}^{N} \Vert \mathsf {ODiag}\{ {\mathbf {T}}_{i}^{(1)} + \mu {\mathbf {F}}{\mathbf {T}}_{i}^{(2)} + {\mathbf {T}}_{i}^{(2)}\mu {\mathbf {F}}^{\ddagger } \} \Vert ^2 \end{aligned}$$

(63)

Then, using (45) and (48) in the equation above, we get

$$\begin{aligned} {\mathcal {J}}_{a} ( {\mathbf {Z}}) = \sum _{i=1}^{N} \mathsf {tr}\left\{ \left( {\mathbf {T}}_{i}^{(1)} + \mu {\mathbf {F}}{\mathbf {T}}_{i}^{(2)} + {\mathbf {T}}_{i}^{(2)}\mu {\mathbf {F}}^{\ddagger } \right) ^H \mathsf {ODiag}\left\{ {\mathbf {T}}_{i}^{(1)} + \mu {\mathbf {F}}{\mathbf {T}}_{i}^{(2)} + {\mathbf {T}}_{i}^{(2)}\mu {\mathbf {F}}^{\ddagger } \right\} \right\} \end{aligned}$$

(64)

Expanding the matrix product inside the trace leads to the second-order polynomial in $\mu $ of (37), whose coefficients are given below

$$\begin{aligned} J_{a,0}&= {{\mathbf {T}}_{i}^{(1)}}^H \mathsf {ODiag}\left\{ {\mathbf {T}}_{i}^{(1)} \right\} \nonumber \\ J_{a,1}&= {{\mathbf {T}}_{i}^{(1)}}^H \mathsf {ODiag}\left\{ {\mathbf {F}}{\mathbf {T}}_{i}^{(2)} + {\mathbf {T}}_{i}^{(2)}{\mathbf {F}}^\ddagger \right\} + \left( {\mathbf {F}}{\mathbf {T}}_{i}^{(2)} + {\mathbf {T}}_{i}^{(2)}{\mathbf {F}}^\ddagger \right) ^H\mathsf {ODiag}\left\{ {\mathbf {T}}_{i}^{(1)} \right\} \nonumber \\ J_{a,2}&= \left( {\mathbf {F}}{\mathbf {T}}_{i}^{(2)} + {\mathbf {T}}_{i}^{(2)}{\mathbf {F}}^\ddagger \right) ^H\mathsf {ODiag}\left\{ {\mathbf {F}}{\mathbf {T}}_{i}^{(2)} + {\mathbf {T}}_{i}^{(2)}{\mathbf {F}}^\ddagger \right\} \end{aligned}$$

(65)
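For a quadratic in $\mu $ with a positive leading coefficient, the minimiser has the closed form $\mu = -c_1/(2c_2)$. The sketch below builds scalar coefficients from (65) and applies that formula. It assumes the hermitian case, that each $J_{a,k}$ is traced and summed over $i$, and that the leading coefficient is strictly positive; `F` is a random stand-in for the search direction, and the function names are hypothetical.

```python
import numpy as np

def odiag(M):
    """ODiag{.}: keep the off-diagonal part of M (assumed definition)."""
    return M - np.diag(np.diag(M))

def approx_criterion(Z, T1s, T2s):
    """Criterion (62) in the hermitian case."""
    return sum(np.linalg.norm(odiag(T1 + Z @ T2 + T2 @ Z.conj().T), 'fro') ** 2
               for T1, T2 in zip(T1s, T2s))

def quadratic_coeffs(F, T1s, T2s):
    """Real coefficients c[0..2] such that J_a(mu * F) = c0 + c1*mu + c2*mu**2."""
    c = np.zeros(3)
    for T1, T2 in zip(T1s, T2s):
        A0 = T1
        A1 = F @ T2 + T2 @ F.conj().T
        c[0] += np.trace(A0.conj().T @ odiag(A0)).real
        c[1] += (np.trace(A0.conj().T @ odiag(A1))
                 + np.trace(A1.conj().T @ odiag(A0))).real
        c[2] += np.trace(A1.conj().T @ odiag(A1)).real
    return c

def random_hermitian(rng, n):
    X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return X + X.conj().T

rng = np.random.default_rng(4)
n, N = 3, 4
T1s = [random_hermitian(rng, n) for _ in range(N)]
T2s = [random_hermitian(rng, n) for _ in range(N)]
F = 0.3 * (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))

c = quadratic_coeffs(F, T1s, T2s)
mu_opt = -c[1] / (2 * c[2])            # vertex of the parabola, valid when c[2] > 0
print(mu_opt, approx_criterion(mu_opt * F, T1s, T2s))
```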

For the criterion (31), the only difference comes from the approximation in ${\mathbf {Z}}$ in (64). We then obtain only a first-order polynomial in $\mu $, whose coefficients are those that do not contain a double product in ${\mathbf {Z}}$, namely $b_{i,0}$ and $b_{i,1}$.

About this article

Cite this article

Trainini, T., Moreau, E. Relative gradient based algorithms for general joint diagonalization of complex matrices. Multidim Syst Sign Process 27, 275–293 (2016). https://doi.org/10.1007/s11045-015-0328-5

Received: 08 July 2014
Revised: 30 March 2015
Accepted: 04 April 2015
Published: 22 April 2015
Issue Date: January 2016
DOI: https://doi.org/10.1007/s11045-015-0328-5
