Trace ratio optimization with an application to multi-view learning

  • Full Length Paper
  • Series A
Mathematical Programming

Abstract

A trace ratio optimization problem over the Stiefel manifold is investigated from the perspectives of both theory and numerical computation. Necessary conditions in the form of a nonlinear eigenvalue problem with eigenvector dependency (NEPv) are established. A numerical method based on the self-consistent field (SCF) iteration with a postprocessing step is designed to solve the NEPv, and the method is proved to be always convergent. As an application to multi-view subspace learning, a new framework and its instantiated concrete models are proposed and demonstrated on real-world data sets. Numerical results demonstrate the efficiency of the proposed numerical methods and the effectiveness of the new orthogonal multi-view subspace learning models.
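To make the SCF idea concrete, here is a minimal sketch for the classical trace ratio problem, max tr(V^T A V) / tr(V^T B V) subject to V^T V = I_k, with A symmetric and B symmetric positive definite. This is an illustrative simplification and an assumption on our part: it is not the model, the NEPv, or the postprocessing step of the paper, and the function name `trace_ratio_scf` and all parameters are hypothetical. Each step freezes the current ratio rho, forms H = A - rho B, and takes the eigenvectors of the k largest eigenvalues of H as the next iterate.

```python
import numpy as np


def trace_ratio_scf(A, B, k, tol=1e-10, max_iter=200, seed=0):
    """SCF sketch for max tr(V^T A V) / tr(V^T B V) subject to V^T V = I_k.

    Assumes A is symmetric and B is symmetric positive definite.
    Illustrative only; not the algorithm of the paper.
    """
    n = A.shape[0]
    rng = np.random.default_rng(seed)
    # Random starting point with orthonormal columns (a point on the Stiefel manifold).
    V, _ = np.linalg.qr(rng.standard_normal((n, k)))
    rho = np.trace(V.T @ A @ V) / np.trace(V.T @ B @ V)
    history = [rho]
    for _ in range(max_iter):
        # Freeze rho and solve the linear symmetric eigenproblem for A - rho * B.
        # eigh returns eigenvalues in ascending order, so the last k columns
        # span the invariant subspace of the k largest eigenvalues.
        _, Q = np.linalg.eigh(A - rho * B)
        V = Q[:, -k:]
        rho_new = np.trace(V.T @ A @ V) / np.trace(V.T @ B @ V)
        history.append(rho_new)
        converged = abs(rho_new - rho) <= tol * max(1.0, abs(rho_new))
        rho = rho_new
        if converged:
            break
    return V, rho, history


# Small synthetic example (random data, for illustration only).
rng = np.random.default_rng(1)
n, k = 30, 3
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                  # symmetric
N = rng.standard_normal((n, n))
B = N @ N.T + n * np.eye(n)        # symmetric positive definite
V, rho, history = trace_ratio_scf(A, B, k)
print("final ratio:", rho, "after", len(history) - 1, "SCF steps")
```

For this simplified problem the ratio cannot decrease from one step to the next: the new V maximizes tr(V^T (A - rho B) V), which is zero at the previous iterate, so tr(V^T A V) >= rho tr(V^T B V) at the new one. The recorded `history` lets one check this numerically.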

Data Availability Statement

All data used are publicly available online.

Code Availability

Available upon request.

Notes

  1. This is a classical inequality. A quick proof goes as follows. Suppose \(\beta >0\) (otherwise the inequality clearly holds). Let \(x=\widehat{\beta }/\beta \). It suffices to show \((1-\theta )+\theta x\ge x^{\theta }\) for all \(x\ge 0\). Since \(x^{\theta }\) is concave for \(0<\theta <1\), the curve of \(x^{\theta }\) as a function of x is at or below its tangent line at \(x=1\) and hence \(x^{\theta }\le 1+\theta (x-1)\), as was to be shown.

  2. By convention, when \(r=k\), W is a null matrix and the term \(U_{(:,r+1:k)}WV_{(:,r+1:k)}^{{{\,\mathrm{T}\,}}}\) disappears from (3.7) altogether.

  3. \(\{\phi _{\theta }(\{P_s^{(i)}\}_{s=1}^v)\}_{i=0}^{\infty }\) is guaranteed to converge under the Gauss–Seidel-style updating by Theorem 5.2(b).
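The inequality that Note 1 refers to is not reproduced on this page. Assuming it is the weighted arithmetic-geometric mean inequality \((1-\theta )\beta +\theta \widehat{\beta }\ge \beta ^{1-\theta }\widehat{\beta }^{\theta }\) for \(\beta ,\widehat{\beta }\ge 0\) and \(0<\theta <1\) (our reading of the substitution \(x=\widehat{\beta }/\beta \), not a statement taken from the paper), the tangent-line argument of Note 1 can be written out as follows, with \(g\) an auxiliary symbol introduced here:

```latex
\begin{align*}
g(x) &= x^{\theta}, \qquad g'(x) = \theta x^{\theta-1}, \qquad
g''(x) = \theta(\theta-1)\,x^{\theta-2} < 0 \quad \text{for } x>0, \\
x^{\theta} &\le g(1) + g'(1)\,(x-1) = (1-\theta) + \theta x
\quad \text{for } x \ge 0, \\
\beta^{1-\theta}\,\widehat{\beta}^{\theta}
 &= \beta\, x^{\theta}
 \le \beta\,\bigl[(1-\theta) + \theta x\bigr]
 = (1-\theta)\,\beta + \theta\,\widehat{\beta},
 \qquad x = \widehat{\beta}/\beta,\ \beta > 0.
\end{align*}
```

When \(\beta =0\) the right-hand side reduces to \(\theta \widehat{\beta }\ge 0\) while the left-hand side is zero, which is the case Note 1 sets aside as clear.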

Acknowledgements

The authors wish to thank the two anonymous referees for their constructive suggestions that greatly improved the presentation of this paper.

Funding

Research was supported in part by United States National Science Foundation DMS-1719620 and DMS-2009689, and by the National Natural Science Foundation of China NSFC-12071332.

Author information

Contributions

LW, L-HZ, and R-CL: all authors contributed equally.

Corresponding author

Correspondence to Ren-Cang Li.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Wang is supported in part by NSF DMS-2009689. Zhang is supported in part by the National Natural Science Foundation of China NSFC-12071332. Li is supported in part by NSF DMS-1719620 and DMS-2009689.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, L., Zhang, LH. & Li, RC. Trace ratio optimization with an application to multi-view learning. Math. Program. 201, 97–131 (2023). https://doi.org/10.1007/s10107-022-01900-w

  • DOI: https://doi.org/10.1007/s10107-022-01900-w
