Abstract
We consider the problem of jointly estimating multiple related zero-mean Gaussian distributions from data. We propose to jointly estimate their covariance matrices using Laplacian regularized stratified model fitting, which includes loss and regularization terms for each covariance matrix, and also a term that encourages the different covariance matrices to be close. This method ‘borrows strength’ from the neighboring covariances to improve each estimate. With well-chosen hyper-parameters, such models can perform very well, especially in the low-data regime. We propose a distributed method that scales to large problems, and illustrate the efficacy of the method with examples in finance, radar signal processing, and weather forecasting.
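To make the three ingredients named in the abstract concrete (a per-stratum loss, a per-stratum regularizer, and a term coupling neighboring strata), the following is a minimal sketch of how such an objective could be evaluated. It is an illustration under stated assumptions, not the authors' implementation: it assumes each stratum is parameterized by a symmetric precision (inverse covariance) matrix, uses the Gaussian negative log-likelihood as the loss, uses a trace penalty with hypothetical weight `gamma` as the local regularizer, and takes the Laplacian term to be half the weighted sum of squared Frobenius distances over graph edges. All function and parameter names are hypothetical.

```python
import numpy as np

def gaussian_nll(theta, S, n):
    """Negative log-likelihood (up to an additive constant) of n zero-mean
    Gaussian samples with empirical covariance S, as a function of the
    precision matrix theta (assumed symmetric). Returns inf if theta is
    not positive definite."""
    try:
        chol = np.linalg.cholesky(theta)  # fails unless theta is positive definite
    except np.linalg.LinAlgError:
        return np.inf
    logdet = 2.0 * np.sum(np.log(np.diag(chol)))
    return n * (np.trace(S @ theta) - logdet)

def laplacian_penalty(thetas, edges):
    """Assumed form of the coupling term: (1/2) * sum over edges (i, j, w)
    of w * ||theta_i - theta_j||_F^2."""
    return 0.5 * sum(
        w * np.linalg.norm(thetas[i] - thetas[j], "fro") ** 2
        for i, j, w in edges
    )

def stratified_objective(thetas, covs, counts, edges, gamma=0.0):
    """Sum of per-stratum loss and (assumed) trace regularization,
    plus the Laplacian term that pulls neighboring strata together."""
    local = sum(
        gaussian_nll(t, S, n) + gamma * np.trace(t)
        for t, S, n in zip(thetas, covs, counts)
    )
    return local + laplacian_penalty(thetas, edges)

if __name__ == "__main__":
    I = np.eye(2)
    # Two strata joined by one edge of weight 1; identical parameters incur
    # no coupling penalty, while differing parameters do.
    print(stratified_objective([I, I], [I, I], [10, 10], [(0, 1, 1.0)]))
    print(laplacian_penalty([I, 2 * I], [(0, 1, 1.0)]))
```

Larger edge weights push neighboring estimates closer together, which is how a stratum with few samples can borrow strength from its neighbors; with all weights set to zero the objective separates into independent per-stratum problems.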
Acknowledgements
Jonathan Tuck is supported by the Stanford Graduate Fellowship in Science and Engineering. The authors thank Muralidhar Rangaswamy and Peter Stoica for helpful comments on an early draft of this paper.
Ethics declarations
Funding
J. Tuck is supported by the Stanford Graduate Fellowship.
Conflicts of interest
The authors declare a possible conflict of interest: S. Boyd is an author of this paper and an editor of this journal.
Availability of data and material
All data is made available at www.github.com/cvxgrp/strat_models.
Code availability
All code is made available at www.github.com/cvxgrp/strat_models.
About this article
Cite this article
Tuck, J., Boyd, S. Fitting Laplacian regularized stratified Gaussian models. Optim Eng 23, 895–915 (2022). https://doi.org/10.1007/s11081-021-09611-5
DOI: https://doi.org/10.1007/s11081-021-09611-5