Abstract
Mixtures-of-experts (MoE) models and their maximum likelihood estimation (MLE) via the EM algorithm have been thoroughly studied in the statistics and machine learning literature. They are now the subject of growing investigation in the context of modeling with high-dimensional predictors via regularized MLE. We examine MoE models with a Gaussian gating network, for clustering and regression, and propose an \(\ell_1\)-regularized MLE to encourage sparse models and to deal with the high-dimensional setting. We develop an EM-Lasso algorithm for parameter estimation and use a BIC-like criterion to select the model hyperparameters, including the sparsity tuning parameters. Experiments on simulated data show the good performance of the proposed regularized MLE compared with the standard MLE via the EM algorithm.
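For concreteness, the following display sketches the penalized log-likelihood that such an \(\ell_1\)-regularized MLE maximizes. The precise penalty structure shown here, with \(\ell_1\) penalties on the expert regression coefficients \(\boldsymbol{\beta}_k\) and on the gating mean vectors \(\boldsymbol{\mu}_k\) governed by tuning parameters \(\lambda\) and \(\gamma\), is an illustrative assumption consistent with the abstract, not a verbatim statement of the chapter's objective:
\[
\mathrm{PL}(\boldsymbol{\theta}) = \sum_{i=1}^{n} \log \sum_{k=1}^{K} \pi_k(\boldsymbol{x}_i; \boldsymbol{w})\, \mathcal{N}\!\left(y_i;\, \beta_{k0} + \boldsymbol{x}_i^{\top}\boldsymbol{\beta}_k,\, \sigma_k^2\right) \;-\; \lambda \sum_{k=1}^{K} \lVert \boldsymbol{\beta}_k \rVert_1 \;-\; \gamma \sum_{k=1}^{K} \lVert \boldsymbol{\mu}_k \rVert_1,
\]
where the Gaussian gating network is
\[
\pi_k(\boldsymbol{x}; \boldsymbol{w}) = \frac{\alpha_k\, \mathcal{N}(\boldsymbol{x};\, \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)}{\sum_{l=1}^{K} \alpha_l\, \mathcal{N}(\boldsymbol{x};\, \boldsymbol{\mu}_l, \boldsymbol{\Sigma}_l)}, \qquad \alpha_k > 0, \quad \sum_{k=1}^{K} \alpha_k = 1.
\]
Under this sketch, EM-Lasso keeps the standard E-step (posterior responsibilities of the \(K\) experts under the current parameters), while the M-step replaces the closed-form updates of \(\boldsymbol{\beta}_k\) and \(\boldsymbol{\mu}_k\) with coordinate-wise soft-thresholding (Lasso) updates; the remaining parameters (\(\alpha_k\), \(\boldsymbol{\Sigma}_k\), \(\sigma_k^2\)) keep their usual weighted updates.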
Acknowledgments
This research is supported by the Ethel Raybould Fellowship (University of Queensland), ANR SMILES (ANR-18-CE40-0014), and Région Normandie RIN AStERiCs.
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Chamroukhi, F., Lecocq, F., Nguyen, H.D. (2019). Regularized Estimation and Feature Selection in Mixtures of Gaussian-Gated Experts Models. In: Nguyen, H. (ed.) Statistics and Data Science. RSSDS 2019. Communications in Computer and Information Science, vol. 1150. Springer, Singapore. https://doi.org/10.1007/978-981-15-1960-4_3
DOI: https://doi.org/10.1007/978-981-15-1960-4_3
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-1959-8
Online ISBN: 978-981-15-1960-4