Abstract
The EM algorithm is a widely used methodology for penalized likelihood estimation. Provable monotonicity and convergence are the hallmarks of the EM algorithm, and these properties are well established for smooth likelihood and smooth penalty functions. However, many relaxed versions of variable selection penalties are not smooth. In this paper, we introduce a new class of space alternating penalized Kullback proximal extensions of the EM algorithm for nonsmooth likelihood inference. We show that the cluster points of the new method are stationary points even when they lie on the boundary of the parameter set. We illustrate the new class of algorithms on two problems: model selection for finite mixtures of regressions, and sparse image reconstruction.
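To convey the flavor of penalized EM with a nonsmooth penalty, the following is a minimal illustrative sketch, not the paper's algorithm: for an l1-penalized Gaussian (least-squares) likelihood with a linear observation model y = H·theta + noise, an EM-style surrogate maximization reduces the nonsmooth M-step to soft-thresholding. The function names and the choice of model are assumptions made for illustration only.

```python
import numpy as np

def soft_threshold(x, t):
    # Proximal operator of the l1 penalty: exact solution of the
    # nonsmooth M-step for a separable quadratic surrogate.
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def penalized_em_sparse(y, H, lam, n_iter=200):
    """Illustrative EM-style iteration for l1-penalized least squares.
    Each step maximizes a quadratic surrogate of the penalized
    log-likelihood; the surrogate maximizer is a soft-threshold."""
    theta = np.zeros(H.shape[1])
    alpha = np.linalg.norm(H, 2) ** 2  # Lipschitz constant of the smooth part
    for _ in range(n_iter):
        # Surrogate (E-step analogue): gradient step on the smooth likelihood
        z = theta + H.T @ (y - H @ theta) / alpha
        # M-step analogue: exact maximization of surrogate plus l1 penalty
        theta = soft_threshold(z, lam / alpha)
    return theta
```

For H equal to the identity the iteration converges in one step to the soft-thresholded data, which is the exact penalized maximum likelihood estimate in that special case.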
Chrétien, S., Hero, A. & Perdry, H. Space alternating penalized Kullback proximal point algorithms for maximizing likelihood with nondifferentiable penalty. Ann Inst Stat Math 64, 791–809 (2012). https://doi.org/10.1007/s10463-011-0333-x