
Relative gradient based algorithms for general joint diagonalization of complex matrices

Multidimensional Systems and Signal Processing

Abstract

This article deals with the problem of joint diagonalization of hermitian and/or complex symmetric matrices. Within the framework of gradient algorithms, we develop various algorithms which are based on different levels of approximation of the classical diagonalization criterion. The algorithms are based on a multiplicative update and on the derivation of an optimal step-size. One of the algorithms is a generalization of DOMUNG to the complex case. Finally, in the blind source separation context, computer simulations illustrate the relative performances of some proposed algorithms in comparison to the true gradient one.





Correspondence to Tual Trainini.

Appendices

Appendix 1: Gradient computation

The calculation of the gradient of (13) in the hermitian case, formulated in (15), is detailed below. First, using the definition of the Frobenius norm

$$\begin{aligned} \Vert {\mathbf {M}}\Vert = \left( \mathsf {trace}\left\{ {\mathbf {M}}^H{\mathbf {M}}\right\} \right) ^{\frac{1}{2}} = \left( \sum _{i,j} |M_{i,j}|^2 \right) ^{\frac{1}{2}} \end{aligned}$$
(45)

we can rewrite the criterion

$$\begin{aligned} {\mathcal {J}}_{h} ( {\mathbf {Z}}) = \sum _{i=1}^{N} \mathsf {trace}\left\{ \left( {\mathbf {U}}_{i}^{H} \right) ^H\mathsf {ODiag} \left\{ {\mathbf {U}}_{i}^{H} \right\} \right\} \end{aligned}$$
(46)
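The step from the norm (45) to the trace form (46) relies on the fact that the diagonal of the left factor does not contribute to the trace. A minimal numerical check is sketched below in Python with NumPy, assuming that \(\mathsf {ODiag}\{\cdot \}\) sets the diagonal of its argument to zero; the helper name `odiag` and the random test matrix are illustrative and not taken from the article.

```python
import numpy as np

def odiag(m):
    """Return a copy of m with its diagonal set to zero (assumed meaning of ODiag)."""
    return m - np.diag(np.diag(m))

rng = np.random.default_rng(0)
n = 4
# Arbitrary complex test matrix standing in for U_i.
u = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

# Squared Frobenius norm of the off-diagonal part, following (45).
lhs = np.linalg.norm(odiag(u), "fro") ** 2
# Trace form used in (46).
rhs = np.trace(u.conj().T @ odiag(u))

print(lhs, rhs.real, abs(rhs.imag))
```

Both values should agree up to rounding, and the imaginary part of the trace should be negligible.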

One can develop the following matrix product

$$\begin{aligned} {\mathbf {U}}_{i}^{H} = {\mathbf {T}}_i + {\mathbf {Z}}{\mathbf {T}}_i + {\mathbf {T}}_i{\mathbf {Z}}^H + {\mathbf {Z}}{\mathbf {T}}_i{\mathbf {Z}}^H \end{aligned}$$
(47)

We now compute the partial derivatives of all the terms, keeping only those that involve \({\mathbf {Z}}^*\) or \({\mathbf {Z}}^H\), since the other partial derivatives with respect to \({\mathbf {Z}}^*\) are null

$$\begin{aligned} \begin{array}{ll} \partial \mathsf {trace}\left\{ \left( {\mathbf {T}}_i + {\mathbf {T}}_i{\mathbf {Z}}^H \right) ^H \mathsf {ODiag}\left\{ {\mathbf {T}}_i{\mathbf {Z}}^H \right\} \right\} &{} \partial \mathsf {trace}\left\{ \left( {\mathbf {T}}_i + {\mathbf {T}}_i{\mathbf {Z}}^H \right) ^H \mathsf {ODiag}\left\{ {\mathbf {Z}}{\mathbf {T}}_i{\mathbf {Z}}^H \right\} \right\} \\ \partial \mathsf {trace}\left\{ \left( {\mathbf {Z}}{\mathbf {T}}_i \right) ^H \mathsf {ODiag}\left\{ {\mathbf {T}}_i + {\mathbf {Z}}{\mathbf {T}}_i \right\} \right\} &{} \partial \mathsf {trace}\left\{ \left( {\mathbf {Z}}{\mathbf {T}}_i{\mathbf {Z}}^H \right) ^H \mathsf {ODiag}\left\{ {\mathbf {T}}_i + {\mathbf {Z}}{\mathbf {T}}_i \right\} \right\} \\ \partial \mathsf {trace}\left\{ \left( {\mathbf {Z}}{\mathbf {T}}_i \right) ^H \mathsf {ODiag}\left\{ {\mathbf {T}}_i{\mathbf {Z}}^H \right\} \right\} &{} \partial \mathsf {trace}\left\{ \left( {\mathbf {Z}}{\mathbf {T}}_i \right) ^H \mathsf {ODiag}\left\{ {\mathbf {Z}}{\mathbf {T}}_i{\mathbf {Z}}^H \right\} \right\} \\ \partial \mathsf {trace}\left\{ \left( {\mathbf {Z}}{\mathbf {T}}_i{\mathbf {Z}}^H \right) ^H \mathsf {ODiag}\left\{ {\mathbf {T}}_i{\mathbf {Z}}^H \right\} \right\} &{} \partial \mathsf {trace}\left\{ \left( {\mathbf {Z}}{\mathbf {T}}_i{\mathbf {Z}}^H \right) ^H \mathsf {ODiag}\left\{ {\mathbf {Z}}{\mathbf {T}}_i{\mathbf {Z}}^H \right\} \right\} \end{array} \end{aligned}$$

Applying the following property

$$\begin{aligned} \mathsf {trace}\left\{ {\mathbf {M}}\mathsf {ODiag}\left\{ {\mathbf {Q}}\right\} \right\} = \mathsf {trace}\left\{ \mathsf {ODiag}\left\{ {\mathbf {M}}\right\} {\mathbf {Q}}\right\} \end{aligned}$$
(48)

to the first term above, and also applying the cyclic property of the trace

$$\begin{aligned} \mathsf {tr}\left\{ {\mathbf {M}}{\mathbf {N}}{\mathbf {Q}}\right\} = \mathsf {tr}\left\{ {\mathbf {Q}}{\mathbf {M}}{\mathbf {N}}\right\} = \mathsf {tr}\left\{ {\mathbf {N}}{\mathbf {Q}}{\mathbf {M}}\right\} \end{aligned}$$
(49)

(where \({\mathbf {M}}\), \({\mathbf {N}}\) and \({\mathbf {Q}}\) are square matrices) we get

$$\begin{aligned} \mathsf {trace}\left\{ \left( {\mathbf {T}}_i + {\mathbf {T}}_i{\mathbf {Z}}^H \right) ^H \partial \mathsf {ODiag}\left\{ {\mathbf {T}}_i{\mathbf {Z}}^H\right\} \right\}&= \mathsf {trace}\left\{ \mathsf {ODiag}\left\{ {\mathbf {T}}_i + {\mathbf {T}}_i{\mathbf {Z}}^H \right\} ^H {\mathbf {T}}_i \partial {\mathbf {Z}}^H \right\} \nonumber \\&= \mathsf {trace}\left\{ \partial {\mathbf {Z}}^H \mathsf {ODiag}\left\{ {\mathbf {T}}_i + {\mathbf {T}}_i{\mathbf {Z}}^H \right\} ^H {\mathbf {T}}_i \right\} \end{aligned}$$
(50)
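The two trace properties (48) and (49) used to obtain (50) are easy to confirm numerically. The sketch below, in Python with NumPy, assumes that \(\mathsf {ODiag}\{\cdot \}\) zeroes the diagonal; the matrices `a`, `b` and `c` are random placeholders for \({\mathbf {M}}\), \({\mathbf {N}}\) and \({\mathbf {Q}}\).

```python
import numpy as np

def odiag(m):
    """Return a copy of m with its diagonal set to zero (assumed meaning of ODiag)."""
    return m - np.diag(np.diag(m))

rng = np.random.default_rng(1)
n = 5

def rand_c():
    """Random complex n x n matrix (illustrative test data)."""
    return rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

a, b, c = rand_c(), rand_c(), rand_c()

# Property (48): ODiag can be moved from one factor to the other inside the trace.
p48_lhs = np.trace(a @ odiag(c))
p48_rhs = np.trace(odiag(a) @ c)

# Property (49): the trace is invariant under cyclic permutations.
cyclic = [np.trace(a @ b @ c), np.trace(c @ a @ b), np.trace(b @ c @ a)]

print(abs(p48_lhs - p48_rhs), abs(cyclic[0] - cyclic[1]), abs(cyclic[0] - cyclic[2]))
```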

Finally, the definition

$$\begin{aligned} \frac{\partial \mathsf {trace}\left\{ {\mathbf {Z}}^H {\mathbf {M}}\right\} }{\partial {\mathbf {Z}}^*} = {\mathbf {M}} \end{aligned}$$
(51)

applied here leads to

$$\begin{aligned} \frac{\partial \mathsf {trace}\left\{ {\mathbf {Z}}^H \mathsf {ODiag}\left\{ {\mathbf {T}}_i + {\mathbf {T}}_i{\mathbf {Z}}^H \right\} ^H {\mathbf {T}}_i \right\} }{\partial {\mathbf {Z}}^*} = \mathsf {ODiag}\left\{ {\mathbf {T}}_i + {\mathbf {T}}_i{\mathbf {Z}}^H \right\} ^H {\mathbf {T}}_i \end{aligned}$$
(52)

Applying the same steps to all the remaining terms gives

$$\begin{aligned} \frac{\partial \mathsf {trace}\left\{ \left( {\mathbf {T}}_i + {\mathbf {T}}_i{\mathbf {Z}}^H \right) ^H \mathsf {ODiag}\left\{ {\mathbf {Z}}{\mathbf {T}}_i{\mathbf {Z}}^H\right\} \right\} }{\partial {\mathbf {Z}}^*}&= \mathsf {ODiag}\left\{ {\mathbf {T}}_i + {\mathbf {T}}_i{\mathbf {Z}}^H \right\} ^H{\mathbf {Z}}{\mathbf {T}}_i \nonumber \\ \frac{\partial \mathsf {trace}\left\{ \left( {\mathbf {Z}}{\mathbf {T}}_i{\mathbf {Z}}^H\right) ^H\mathsf {ODiag}\left\{ {\mathbf {T}}_i + {\mathbf {Z}}{\mathbf {T}}_i \right\} \right\} }{\partial {\mathbf {Z}}^*}&= \mathsf {ODiag}\left\{ {\mathbf {T}}_i + {\mathbf {Z}}{\mathbf {T}}_i \right\} {\mathbf {Z}}{\mathbf {T}}_i^H \nonumber \\ \frac{\partial \mathsf {trace}\left\{ \left( {\mathbf {Z}}{\mathbf {T}}_i\right) ^H\mathsf {ODiag}\left\{ {\mathbf {T}}_i{\mathbf {Z}}^H \right\} \right\} }{\partial {\mathbf {Z}}^*}&= \mathsf {ODiag}\left\{ {\mathbf {T}}_i{\mathbf {Z}}^H \right\} {\mathbf {T}}_i^H + \mathsf {ODiag}\left\{ {\mathbf {Z}}{\mathbf {T}}_i \right\} ^H{\mathbf {T}}_i \nonumber \\ \frac{\partial \mathsf {trace}\left\{ \left( {\mathbf {Z}}{\mathbf {T}}_i\right) ^H\mathsf {ODiag}\left\{ {\mathbf {Z}}{\mathbf {T}}_i{\mathbf {Z}}^H \right\} \right\} }{\partial {\mathbf {Z}}^*}&= \mathsf {ODiag}\left\{ {\mathbf {Z}}{\mathbf {T}}_i{\mathbf {Z}}^H \right\} {\mathbf {T}}_i^H + \mathsf {ODiag}\left\{ {\mathbf {Z}}{\mathbf {T}}_i \right\} ^H{\mathbf {Z}}{\mathbf {T}}_i \nonumber \\ \frac{\partial \mathsf {trace}\left\{ \left( {\mathbf {Z}}{\mathbf {T}}_i{\mathbf {Z}}^H\right) ^H \mathsf {ODiag}\left\{ {\mathbf {T}}_i{\mathbf {Z}}^H\right\} \right\} }{\partial {\mathbf {Z}}^*}&= \mathsf {ODiag}\left\{ {\mathbf {T}}_i{\mathbf {Z}}^H \right\} {\mathbf {Z}}{\mathbf {T}}_i^H \nonumber \\&\quad + \mathsf {ODiag}\left\{ {\mathbf {Z}}{\mathbf {T}}_i{\mathbf {Z}}^H \right\} ^H {\mathbf {T}}_i \nonumber \\ \frac{\partial \mathsf {trace}\left\{ \left( {\mathbf {Z}}{\mathbf {T}}_i{\mathbf {Z}}^H\right) ^H\mathsf 
{ODiag}\left\{ {\mathbf {Z}}{\mathbf {T}}_i{\mathbf {Z}}^H \right\} \right\} }{\partial {\mathbf {Z}}^*}&= \mathsf {ODiag}\left\{ {\mathbf {Z}}{\mathbf {T}}_i{\mathbf {Z}}^H \right\} {\mathbf {Z}}{\mathbf {T}}_i^H \nonumber \\&\quad + \mathsf {ODiag}\left\{ {\mathbf {Z}}{\mathbf {T}}_i{\mathbf {Z}}^H \right\} ^H{\mathbf {Z}}{\mathbf {T}}_i\nonumber \\ \frac{\partial \mathsf {trace}\left\{ \left( {\mathbf {Z}}{\mathbf {T}}_i\right) ^H \mathsf {ODiag}\left\{ {\mathbf {T}}_i + {\mathbf {Z}}{\mathbf {T}}_i \right\} \right\} }{\partial {\mathbf {Z}}^*}&= \mathsf {ODiag}\left\{ {\mathbf {T}}_i + {\mathbf {Z}}{\mathbf {T}}_i \right\} {\mathbf {T}}_i^H \end{aligned}$$
(53)

and we then get all the terms composing the gradient of the criterion.
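Summing the right-hand sides of (52) and (53) and regrouping them gives, for each \(i\), the compact expression \(\mathsf {ODiag}\{{\mathbf {U}}_i^H\}^H ({\mathbf {I}}+{\mathbf {Z}}){\mathbf {T}}_i + \mathsf {ODiag}\{{\mathbf {U}}_i^H\} ({\mathbf {I}}+{\mathbf {Z}}){\mathbf {T}}_i^H\), with \({\mathbf {U}}_i^H\) as in (47). The Python sketch below compares this regrouped form with a finite-difference estimate of \(\partial {\mathcal {J}}_h / \partial {\mathbf {Z}}^*\). The function names, the random hermitian test matrices and the step `eps` are assumptions made for illustration; the regrouping is ours and is not quoted from (15).

```python
import numpy as np

def odiag(m):
    """Return a copy of m with its diagonal set to zero (assumed meaning of ODiag)."""
    return m - np.diag(np.diag(m))

def criterion(z, ts):
    """J_h(Z) = sum_i ||ODiag{(I+Z) T_i (I+Z)^H}||^2."""
    w = np.eye(z.shape[0]) + z
    return sum(np.linalg.norm(odiag(w @ t @ w.conj().T), "fro") ** 2 for t in ts)

def gradient(z, ts):
    """Regrouped sum of the terms in (52)-(53): O^H W T + O W T^H, with W = I + Z."""
    w = np.eye(z.shape[0]) + z
    g = np.zeros_like(z)
    for t in ts:
        o = odiag(w @ t @ w.conj().T)
        g += o.conj().T @ w @ t + o @ w @ t.conj().T
    return g

def numerical_gradient(f, z, eps=1e-6):
    """Wirtinger derivative df/dZ* = (df/dx + 1j * df/dy) / 2 by central differences."""
    g = np.zeros_like(z)
    for idx in np.ndindex(*z.shape):
        e = np.zeros_like(z)
        e[idx] = 1.0
        d_re = (f(z + eps * e) - f(z - eps * e)) / (2 * eps)
        d_im = (f(z + 1j * eps * e) - f(z - 1j * eps * e)) / (2 * eps)
        g[idx] = 0.5 * (d_re + 1j * d_im)
    return g

rng = np.random.default_rng(2)
n, n_mat = 3, 4

def rand_c(scale=1.0):
    return scale * (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))

# Random hermitian matrices standing in for the T_i (illustrative data only).
ts = []
for _ in range(n_mat):
    a = rand_c()
    ts.append(a + a.conj().T)

z = rand_c(0.1)
g_formula = gradient(z, ts)
g_numeric = numerical_gradient(lambda x: criterion(x, ts), z)
print(np.max(np.abs(g_formula - g_numeric)))
```

The printed discrepancy should be at the level of the finite-difference error.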

Appendix 2: Computation of the optimal stepsize for the initial criterion

We first recall the formulation of the criterion (13):

$$\begin{aligned} {\mathcal {J}} ( {\mathbf {Z}}) = \sum _{i=1}^{N} \Vert \mathsf {ODiag}\{ \left( {\mathbf {I}}+ {\mathbf {Z}}\right) {\mathbf {T}}_{i} \left( {\mathbf {I}}+ {\mathbf {Z}}\right) ^{\ddagger } \} \Vert ^2 \end{aligned}$$
(54)

and we use the update of \({\mathbf {Z}}\) in (14)

$$\begin{aligned} {\mathbf {Z}}= - \mu \frac{\partial {\mathcal {J}}( {\mathbf {Z}}) }{\partial {\mathbf {Z}}^*} = \mu {\mathbf {F}} \end{aligned}$$
(55)

Substituting (55) into (54), we get

$$\begin{aligned} {\mathcal {J}} ( {\mathbf {Z}}) = \sum _{i=1}^{N}\Vert \mathsf {ODiag}\{ \left( {\mathbf {I}}+ \mu {\mathbf {F}}\right) {\mathbf {T}}_{i} \left( {\mathbf {I}}+ \mu {\mathbf {F}}\right) ^{\ddagger } \} \Vert ^2 \end{aligned}$$
(56)

Applying the definition (45) and the property (48) to the equation above leads to

$$\begin{aligned} {\mathcal {J}} ( {\mathbf {Z}})&= \sum _{i=1}^{N} \mathsf {tr}\left\{ \left( \left( {\mathbf {I}}+ \mu {\mathbf {F}}\right) {\mathbf {T}}_{i} \left( {\mathbf {I}}+ \mu {\mathbf {F}}\right) ^{\ddagger }\right) ^H \right. \nonumber \\&\quad \, \times \left. \mathsf {ODiag}\{\left( {\mathbf {I}}+ \mu {\mathbf {F}}\right) {\mathbf {T}}_{i} \left( {\mathbf {I}}+ \mu {\mathbf {F}}\right) ^{\ddagger } \} \right\} \end{aligned}$$
(57)

Expanding \(\left( {\mathbf {I}}+ \mu {\mathbf {F}}\right) {\mathbf {T}}_{i} \left( {\mathbf {I}}+ \mu {\mathbf {F}}\right) ^{\ddagger }\) gives a second-order polynomial in \(\mu \):

$$\begin{aligned} \left( {\mathbf {I}}+ \mu {\mathbf {F}}\right) {\mathbf {T}}_{i} \left( {\mathbf {I}}+ \mu {\mathbf {F}}\right) ^{\ddagger } = {\mathbf {T}}_{i} + \mu \left( {\mathbf {F}}{\mathbf {T}}_{i} + {\mathbf {T}}_{i}{\mathbf {F}}^{\ddagger }\right) + \mu ^2{\mathbf {F}}{\mathbf {T}}_{i}{\mathbf {F}}^{\ddagger } \end{aligned}$$
(58)

Finally, expanding the matrix product in the argument of the trace function leads to the fourth-order polynomial in \(\mu \) given in (36), where the coefficients are given below

$$\begin{aligned} J_0&= {\mathbf {T}}_i^H \mathsf {ODiag}\{{\mathbf {T}}_i\} \nonumber \\ J_1&= {\mathbf {T}}_i^H \mathsf {ODiag}\{{\mathbf {F}}{\mathbf {T}}_i + {\mathbf {T}}_i{\mathbf {F}}^\ddagger \} + \left( {\mathbf {F}}{\mathbf {T}}_i + {\mathbf {T}}_i{\mathbf {F}}^\ddagger \right) ^H\mathsf {ODiag}\{{\mathbf {T}}_i\} \nonumber \\ J_2&= {\mathbf {T}}_i^H \mathsf {ODiag}\{{\mathbf {F}}{\mathbf {T}}_i{\mathbf {F}}^\ddagger \} + \left( {\mathbf {F}}{\mathbf {T}}_i{\mathbf {F}}^\ddagger \right) ^H\mathsf {ODiag}\{{\mathbf {T}}_i\} + \left( {\mathbf {F}}{\mathbf {T}}_i + {\mathbf {T}}_i{\mathbf {F}}^\ddagger \right) ^H\mathsf {ODiag}\{{\mathbf {F}}{\mathbf {T}}_i + {\mathbf {T}}_i{\mathbf {F}}^\ddagger \}\nonumber \\ J_3&= \left( {\mathbf {F}}{\mathbf {T}}_i{\mathbf {F}}^\ddagger \right) ^H\mathsf {ODiag}\{{\mathbf {F}}{\mathbf {T}}_i + {\mathbf {T}}_i{\mathbf {F}}^\ddagger \} + \left( {\mathbf {F}}{\mathbf {T}}_i + {\mathbf {T}}_i{\mathbf {F}}^\ddagger \right) ^H\mathsf {ODiag}\{{\mathbf {F}}{\mathbf {T}}_i{\mathbf {F}}^\ddagger \}\nonumber \\ J_4&= \left( {\mathbf {F}}{\mathbf {T}}_i{\mathbf {F}}^\ddagger \right) ^H\mathsf {ODiag}\{{\mathbf {F}}{\mathbf {T}}_i{\mathbf {F}}^\ddagger \} \end{aligned}$$
(59)
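A hedged Python sketch of how the coefficients in (59) might be turned into a step size is given below. It assumes that the scalar coefficient of \(\mu ^k\) is the real part of the trace of \(J_k\) summed over \(i\), that \(\ddagger \) denotes the conjugate transpose in the hermitian case and the plain transpose in the complex symmetric case, and that the optimal step is the real critical point of the quartic with the smallest value. The function names and the random direction used for \({\mathbf {F}}\) are illustrative; in the algorithm \({\mathbf {F}}\) is the negative gradient as in (55).

```python
import numpy as np

def odiag(m):
    """Return a copy of m with its diagonal set to zero (assumed meaning of ODiag)."""
    return m - np.diag(np.diag(m))

def ddagger(m, hermitian=True):
    """Assumed meaning of the double-dagger: conjugate transpose or plain transpose."""
    return m.conj().T if hermitian else m.T

def quartic_coefficients(f, ts, hermitian=True):
    """Return a[0..4] with J(mu) = sum_k a[k] * mu**k, following (58)-(59)."""
    a = np.zeros(5)
    fd = ddagger(f, hermitian)
    for t in ts:
        # Matrix coefficients of mu^0, mu^1, mu^2 in (58).
        p = [t, f @ t + t @ fd, f @ t @ fd]
        for k in range(3):
            for l in range(3):
                a[k + l] += np.trace(p[k].conj().T @ odiag(p[l])).real
    return a

def optimal_step(a):
    """Real critical point of the quartic with the lowest criterion value."""
    poly = a[::-1]  # np.polyval expects the highest degree first
    crit = np.roots(np.polyder(poly))
    real = crit[np.abs(crit.imag) < 1e-8 * (1 + np.abs(crit))].real
    return real[np.argmin(np.polyval(poly, real))]

def criterion_along(mu, f, ts, hermitian=True):
    """Direct evaluation of (56) for comparison."""
    w = np.eye(f.shape[0]) + mu * f
    return sum(np.linalg.norm(odiag(w @ t @ ddagger(w, hermitian)), "fro") ** 2 for t in ts)

rng = np.random.default_rng(3)
n = 3

def rand_c(scale=1.0):
    return scale * (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))

ts = [m + m.conj().T for m in (rand_c() for _ in range(4))]  # hermitian test matrices
f = rand_c(0.3)  # placeholder search direction

a = quartic_coefficients(f, ts)
mu_opt = optimal_step(a)
print(mu_opt, np.polyval(a[::-1], mu_opt), a[0])
```

Evaluating the polynomial and `criterion_along` at the same \(\mu \) should give matching values, which is a convenient consistency check on the coefficients.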

Appendix 3: Computation of the gradient for approximated criteria

We detail here the computation of the gradient of the criterion (28) in the hermitian case, whose result is given in (32). Applying (45) to this criterion (with \(\mathbf {E}_i=\mathbf {T}_i^{(1)}\) and \(\mathbf {F}_i=\mathbf {T}_i^{(2)}\)) gives

$$\begin{aligned} \mathcal {J}_{a,h} ( {\mathbf {Z}}) = \sum _{i=1}^{N} \mathsf {trace}\left\{ \left( {\mathbf {T}}_{i}^{(1)} + {\mathbf {Z}}{\mathbf {T}}_{i}^{(2)} + {\mathbf {T}}_{i}^{(2)}{\mathbf {Z}}^{H} \right) ^H \mathsf {ODiag}\left\{ {\mathbf {T}}_{i}^{(1)} + {\mathbf {Z}}{\mathbf {T}}_{i}^{(2)} + {\mathbf {T}}_{i}^{(2)}{\mathbf {Z}}^{H} \right\} \right\} \end{aligned}$$
(60)

Keeping only the partial derivatives of the terms that involve \({\mathbf {Z}}^*\) or \({\mathbf {Z}}^H\):

$$\begin{aligned} \partial&\mathsf {trace}\left\{ \left( {\mathbf {T}}_{i}^{(1)} + {\mathbf {T}}_{i}^{(2)}{\mathbf {Z}}^{H} \right) ^H \mathsf {ODiag}\left\{ {\mathbf {T}}_{i}^{(2)}{\mathbf {Z}}^{H} \right\} \right\} \qquad \partial \mathsf {trace}\left\{ \left( {\mathbf {Z}}{\mathbf {T}}_{i}^{(2)} \right) ^H \mathsf {ODiag}\left\{ {\mathbf {T}}_{i}^{(1)} + {\mathbf {Z}}{\mathbf {T}}_{i}^{(2)} \right\} \right\} \\ \partial&\mathsf {trace}\left\{ \left( {\mathbf {Z}}{\mathbf {T}}_{i}^{(2)} \right) ^H \mathsf {ODiag}\left\{ {\mathbf {T}}_{i}^{(2)}{\mathbf {Z}}^{H} \right\} \right\} \end{aligned}$$

Applying (48), (49) and (51) to the terms above

$$\begin{aligned} \frac{\partial \mathsf {trace}\left\{ \left( {\mathbf {Z}}{\mathbf {T}}_{i}^{(2)} \right) ^H \mathsf {ODiag}\left\{ {\mathbf {T}}_{i}^{(2)}{\mathbf {Z}}^{H} \right\} \right\} }{\partial {\mathbf {Z}}^*}&= \mathsf {ODiag}\left\{ {\mathbf {T}}_{i}^{(2)}{\mathbf {Z}}^{H}\right\} \left( {\mathbf {T}}_{i}^{(2)}\right) ^H \nonumber \\&\quad + \mathsf {ODiag}\left\{ {\mathbf {Z}}{\mathbf {T}}_{i}^{(2)}\right\} ^H{\mathbf {T}}_{i}^{(2)} \nonumber \\ \frac{\partial \mathsf {trace}\left\{ \left( {\mathbf {Z}}{\mathbf {T}}_{i}^{(2)} \right) ^H \mathsf {ODiag}\left\{ {\mathbf {T}}_{i}^{(1)} + {\mathbf {Z}}{\mathbf {T}}_{i}^{(2)} \right\} \right\} }{\partial {\mathbf {Z}}^*}&= \mathsf {ODiag}\left\{ {\mathbf {T}}_{i}^{(1)} + {\mathbf {Z}}{\mathbf {T}}_{i}^{(2)} \right\} \left( {\mathbf {T}}_{i}^{(2)}\right) ^H \nonumber \\ \frac{\partial \mathsf {trace}\left\{ \left( {\mathbf {T}}_{i}^{(1)} + {\mathbf {T}}_{i}^{(2)}{\mathbf {Z}}^{H} \right) ^H \mathsf {ODiag}\left\{ {\mathbf {T}}_{i}^{(2)}{\mathbf {Z}}^{H} \right\} \right\} }{\partial {\mathbf {Z}}^*}&= \mathsf {ODiag}\left\{ {\mathbf {T}}_{i}^{(1)} + {\mathbf {T}}_{i}^{(2)}{\mathbf {Z}}^{H}\right\} ^H {\mathbf {T}}_{i}^{(2)} \end{aligned}$$
(61)

leads to the expression given in (32). The gradient of (33) is obtained by discarding the terms of (61) that contain \({\mathbf {Z}}\).
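The three derivatives in (61) can be summed and compared with a finite-difference estimate. The Python sketch below transcribes the terms of (61) directly; the matrices standing in for \({\mathbf {T}}_i^{(1)}\) and \({\mathbf {T}}_i^{(2)}\) are random placeholders, and the function names are assumptions introduced for this illustration.

```python
import numpy as np

def odiag(m):
    """Return a copy of m with its diagonal set to zero (assumed meaning of ODiag)."""
    return m - np.diag(np.diag(m))

def criterion_a(z, t1s, t2s):
    """J_{a,h}(Z) = sum_i ||ODiag{T1_i + Z T2_i + T2_i Z^H}||^2, as in (60)."""
    total = 0.0
    for t1, t2 in zip(t1s, t2s):
        total += np.linalg.norm(odiag(t1 + z @ t2 + t2 @ z.conj().T), "fro") ** 2
    return total

def gradient_a(z, t1s, t2s):
    """Sum over i of the three right-hand sides of (61)."""
    zh = z.conj().T
    g = np.zeros_like(z)
    for t1, t2 in zip(t1s, t2s):
        t2h = t2.conj().T
        g += odiag(t2 @ zh) @ t2h + odiag(z @ t2).conj().T @ t2  # first entry of (61)
        g += odiag(t1 + z @ t2) @ t2h                            # second entry
        g += odiag(t1 + t2 @ zh).conj().T @ t2                   # third entry
    return g

def numerical_gradient(f, z, eps=1e-6):
    """Wirtinger derivative df/dZ* = (df/dx + 1j * df/dy) / 2 by central differences."""
    g = np.zeros_like(z)
    for idx in np.ndindex(*z.shape):
        e = np.zeros_like(z)
        e[idx] = 1.0
        d_re = (f(z + eps * e) - f(z - eps * e)) / (2 * eps)
        d_im = (f(z + 1j * eps * e) - f(z - 1j * eps * e)) / (2 * eps)
        g[idx] = 0.5 * (d_re + 1j * d_im)
    return g

rng = np.random.default_rng(4)
n, n_mat = 3, 4

def rand_c(scale=1.0):
    return scale * (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))

t1s = [rand_c() for _ in range(n_mat)]  # placeholders for T_i^(1)
t2s = [rand_c() for _ in range(n_mat)]  # placeholders for T_i^(2)
z = rand_c(0.1)

ga_formula = gradient_a(z, t1s, t2s)
ga_numeric = numerical_gradient(lambda x: criterion_a(x, t1s, t2s), z)
print(np.max(np.abs(ga_formula - ga_numeric)))
```

Because the approximated criterion is quadratic in \({\mathbf {Z}}\), the central differences are exact up to rounding, so the two gradients should agree closely.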

Appendix 4: Computation of the optimal stepsize for the approximated criterion

Applying the approximations to the criterion (13) first leads to the criterion (28):

$$\begin{aligned} {\mathcal {J}}_{a} ( {\mathbf {Z}}) = \sum _{i=1}^{N} \Vert \mathsf {ODiag}\{ {\mathbf {T}}_{i}^{(1)} + {\mathbf {Z}}{\mathbf {T}}_{i}^{(2)} + {\mathbf {T}}_{i}^{(2)}{\mathbf {Z}}^{\ddagger } \} \Vert ^2 \end{aligned}$$
(62)

We follow the same principle as for the exact criterion. Replacing \({\mathbf {Z}}\) by (55) in (62) leads to

$$\begin{aligned} {\mathcal {J}}_{a} ( {\mathbf {Z}}) = \sum _{i=1}^{N} \Vert \mathsf {ODiag}\{ {\mathbf {T}}_{i}^{(1)} + \mu {\mathbf {F}}{\mathbf {T}}_{i}^{(2)} + {\mathbf {T}}_{i}^{(2)}\mu {\mathbf {F}}^{\ddagger } \} \Vert ^2 \end{aligned}$$
(63)

Then, using (45) and (48) in the equation above, we get

$$\begin{aligned} {\mathcal {J}}_{a} ( {\mathbf {Z}}) = \sum _{i=1}^{N} \mathsf {tr}\left\{ \left( {\mathbf {T}}_{i}^{(1)} + \mu {\mathbf {F}}{\mathbf {T}}_{i}^{(2)} + {\mathbf {T}}_{i}^{(2)}\mu {\mathbf {F}}^{\ddagger } \right) ^H \mathsf {ODiag}\left\{ {\mathbf {T}}_{i}^{(1)} + \mu {\mathbf {F}}{\mathbf {T}}_{i}^{(2)} + {\mathbf {T}}_{i}^{(2)}\mu {\mathbf {F}}^{\ddagger } \right\} \right\} \end{aligned}$$
(64)

Expanding the matrix product in the argument of the trace function leads to the second-order polynomial in \(\mu \) given in (37), where the coefficients are given below

$$\begin{aligned} J_{a,0}&= {{\mathbf {T}}_{i}^{(1)}}^H \mathsf {ODiag}\left\{ {\mathbf {T}}_{i}^{(1)} \right\} \nonumber \\ J_{a,1}&= {{\mathbf {T}}_{i}^{(1)}}^H \mathsf {ODiag}\left\{ {\mathbf {F}}{\mathbf {T}}_{i}^{(2)} + {\mathbf {T}}_{i}^{(2)}{\mathbf {F}}^\ddagger \right\} + \left( {\mathbf {F}}{\mathbf {T}}_{i}^{(2)} + {\mathbf {T}}_{i}^{(2)}{\mathbf {F}}^\ddagger \right) ^H\mathsf {ODiag}\left\{ {\mathbf {T}}_{i}^{(1)} \right\} \nonumber \\ J_{a,2}&= \left( {\mathbf {F}}{\mathbf {T}}_{i}^{(2)} + {\mathbf {T}}_{i}^{(2)}{\mathbf {F}}^\ddagger \right) ^H\mathsf {ODiag}\left\{ {\mathbf {F}}{\mathbf {T}}_{i}^{(2)} + {\mathbf {T}}_{i}^{(2)}{\mathbf {F}}^\ddagger \right\} \end{aligned}$$
(65)
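For a second-order polynomial the minimizing step has a closed form. The Python sketch below builds scalar coefficients from (65), assuming that each one is the real part of the trace of \(J_{a,k}\) summed over \(i\) and that \(\ddagger \) is the conjugate transpose (hermitian case), and then takes \(\mu = -a_1 / (2 a_2)\), which is valid when \(a_2 > 0\). The names `quadratic_coefficients`, `criterion_along` and the random test data are illustrative, not from the article.

```python
import numpy as np

def odiag(m):
    """Return a copy of m with its diagonal set to zero (assumed meaning of ODiag)."""
    return m - np.diag(np.diag(m))

def quadratic_coefficients(f, t1s, t2s):
    """Return a[0..2] with J_a(mu) = a[0] + a[1]*mu + a[2]*mu**2, following (65)."""
    a = np.zeros(3)
    fh = f.conj().T  # hermitian case assumed for the double-dagger
    for t1, t2 in zip(t1s, t2s):
        p1 = f @ t2 + t2 @ fh  # matrix multiplying mu in (63)
        a[0] += np.trace(t1.conj().T @ odiag(t1)).real
        a[1] += (np.trace(t1.conj().T @ odiag(p1)) + np.trace(p1.conj().T @ odiag(t1))).real
        a[2] += np.trace(p1.conj().T @ odiag(p1)).real
    return a

def criterion_along(mu, f, t1s, t2s):
    """Direct evaluation of (63) for comparison."""
    total = 0.0
    for t1, t2 in zip(t1s, t2s):
        v = t1 + mu * f @ t2 + mu * t2 @ f.conj().T
        total += np.linalg.norm(odiag(v), "fro") ** 2
    return total

rng = np.random.default_rng(5)
n, n_mat = 3, 4

def rand_c(scale=1.0):
    return scale * (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))

t1s = [rand_c() for _ in range(n_mat)]  # placeholders for T_i^(1)
t2s = [rand_c() for _ in range(n_mat)]  # placeholders for T_i^(2)
f = rand_c(0.3)                         # placeholder search direction

qa = quadratic_coefficients(f, t1s, t2s)
mu_a = -qa[1] / (2.0 * qa[2])  # vertex of the parabola, assuming qa[2] > 0
print(mu_a, criterion_along(mu_a, f, t1s, t2s), qa[0])
```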

For the criterion (31), the only difference comes from the further approximation with respect to \({\mathbf {Z}}\) in (64). We then obtain only a first-order polynomial in \(\mu \), whose coefficients are those that do not contain a double product in \({\mathbf {Z}}\), namely \(b_{i,0}\) and \(b_{i,1}\).


Trainini, T., Moreau, E. Relative gradient based algorithms for general joint diagonalization of complex matrices. Multidim Syst Sign Process 27, 275–293 (2016). https://doi.org/10.1007/s11045-015-0328-5
