Abstract
In finite mixtures of regression models, the errors of each regression component are typically assumed to be normally distributed. Although this assumption is theoretically and computationally convenient, it often produces inefficient and undesirable estimates, which undermines the applicability of the model, particularly in the presence of outliers. To mitigate these defects, we propose using nonparametric Gaussian scale mixture distributions for the component error distributions. This reduces the risk of misspecification and yields robust estimators. We study the identifiability of the proposed model and develop a feasible estimation algorithm. Simulation studies and a real data analysis are presented to demonstrate the performance of the proposed method.
Acknowledgements
This paper is based on a part of Sangkon Oh’s doctoral thesis. The authors wish to thank the Associate Editor and two referees for their valuable comments and suggestions. The research of Byungtae Seo is supported by the National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) (NRF-2022R1A2C1006462).
Ethics declarations
Conflict of interest
The authors have declared no conflict of interest.
Appendix: Proof of Theorem 1
In (3), suppose that there exist \(\tilde{{\varvec{\theta }}}= (\tilde{\pi }_1, \tilde{{\varvec{\beta }}}_1, \ldots , \tilde{\pi }_J, \tilde{{\varvec{\beta }}}_J)\) and \(\tilde{{\varvec{Q}}}= (\tilde{Q}_1, \ldots , \tilde{Q}_J)\) satisfying
The characteristic function of the left-hand side of (11) is
where \(i=\sqrt{-1}\) and \(\psi _{Q_k}(t)=\int \exp (-t^2\sigma ^2/2)dQ_k(\sigma )\), \(k=1,\ldots ,K\). The characteristic function of the right-hand side of (11) can be similarly represented as
where \(\psi _{\tilde{Q}_j}(t)=\int \exp (-t^2\sigma ^2/2)d\tilde{Q}_j(\sigma )\), \(j=1,2,\ldots ,J\).
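As an illustration of the function \(\psi _{Q}(t)=\int \exp (-t^2\sigma ^2/2)dQ(\sigma )\) used in the proof, the following Python sketch evaluates it for a discrete mixing distribution and compares it with the empirical characteristic function of simulated scale-mixture errors. The two-point mixing distribution, the sample size, and the helper names `psi` and `empirical_cf` are assumptions made for this illustration only; they are not taken from the paper.

```python
import numpy as np

def psi(t, sigmas, weights):
    """psi_Q(t) = sum_m w_m * exp(-t^2 * sigma_m^2 / 2) for a discrete Q."""
    t = np.atleast_1d(np.asarray(t, dtype=float))
    sigmas = np.asarray(sigmas, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Rows index t, columns index the support points of Q.
    return np.exp(-0.5 * np.outer(t**2, sigmas**2)) @ weights

def empirical_cf(t, sample):
    """Empirical characteristic function: mean of exp(i * t * e) over the sample."""
    t = np.atleast_1d(np.asarray(t, dtype=float))
    return np.exp(1j * np.outer(t, sample)).mean(axis=1)

# Hypothetical two-point mixing distribution Q (illustrative values only).
sigmas = np.array([0.5, 3.0])
weights = np.array([0.8, 0.2])

# Draw errors e = sigma * z with sigma ~ Q and z ~ N(0, 1).
rng = np.random.default_rng(0)
n = 200_000
scale = rng.choice(sigmas, size=n, p=weights)
errors = scale * rng.standard_normal(n)

t_grid = np.array([0.0, 0.5, 1.0, 2.0])
exact = psi(t_grid, sigmas, weights)
approx = empirical_cf(t_grid, errors)
max_gap = np.max(np.abs(approx - exact))

print("psi_Q(t):    ", exact)
print("empirical cf:", approx.real)
print("largest gap: ", max_gap)
```

Because the errors are symmetric about zero, the characteristic function is real, and \(\psi _{Q}(0)=1\) for any mixing distribution \(Q\); the proof uses this fact when it lets \(t \rightarrow 0\).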
Then the following equality must hold:
Because \({{\varvec{\beta }}}_{k_1}\ne {{\varvec{\beta }}}_{k_2}\) if \(k_1\ne k_2\), there exists an open set \(U =\{{{\varvec{u}}}\in \mathbf {\mathcal {X}} |\eta ({\varvec{u}};{{\varvec{\beta }}}_{k_1}) \ne \eta ({\varvec{u}};{{\varvec{\beta }}}_{k_2})\}\), where \(\mathbf {\mathcal {X}}\) is the support of \({{\varvec{X}}}\). For any fixed \(k^{*} \in \{1, 2, \ldots , K \}\), multiplying both sides of equation (12) by \(\exp (-i\eta ({{\varvec{u}}};{{\varvec{\beta }}}_{k^*})t)\), we have
for all \({{\varvec{u}}}\in U\).
Assume that \(\eta ({{\varvec{u}}};\tilde{{\varvec{\beta }}}_j) - \eta ({{\varvec{u}}};{{\varvec{\beta }}}_{k^*}) \ne 0\) for all \(j = 1, \dots , J\), and let H be an open subset of \(\{{{\varvec{u}}}\in U| \eta ({{\varvec{u}}};\tilde{{\varvec{\beta }}}_j) \ne \eta ({{\varvec{u}}};{{\varvec{\beta }}}_{k^*}), j=1,\ldots ,J \}\). Then, for \({{\varvec{u}}}\in H\), (13) should hold. Because \(\eta (\cdot )\) is a continuous function, there exist open sets V and \(\tilde{V}\) in \(\mathbb {R}\) satisfying
for all \(v_k \in V\) for \(k \in \{1,2,\ldots , K \} {\setminus } \{k^*\}\) and \(\tilde{v}_j \in \tilde{V}\) for \(j \in \{1,2,\ldots , J \}\). Because V and \(\tilde{V}\) are open sets, there also exists \(\xi >0\) such that
holds for all \(c\in (1-\xi ,1+\xi )\). Because the exponential function is analytic on the whole complex plane, if (15) holds for \(c\in (1-\xi ,1+\xi )\), (15) should also hold for all \(-\infty<c<\infty\). Hence, the following equality should also hold:
for all \(T>0\). Letting \(T\rightarrow \infty\), we conclude that \(\pi _{k^{*}} \psi _{Q_{k^{*}}}(t)=0\), because \(\lim _{T\rightarrow \infty }\frac{1}{2T}\int _{-T}^T\exp (ic\lambda )dc=0\) for any \(\lambda \ne 0\). Further, because \(\psi _{Q_{k^{*}}}(t) \rightarrow 1\) as \(t \rightarrow 0\), letting \(t \rightarrow 0\) gives \(\pi _{k^*}=0\), which contradicts the assumption that \(\pi _{k^*}>0\). Therefore, there must exist \(j^{*} \in \{1, 2, \ldots, J \}\) such that \(\eta ({\varvec{u}};\tilde{{\varvec{\beta }}}_{j^{*}}) = \eta ({\varvec{u}};{{\varvec{\beta }}}_{k^{*}})\), implying \(\tilde{{\varvec{\beta }}}_{j^{*}}={{\varvec{\beta }}}_{k^{*}}\). Accordingly, (16) yields \(\pi _{k^{*}} \psi _{Q_{k^{*}}}(t) = \tilde{\pi }_{j^{*}} \psi _{\tilde{Q}_{j^{*}}}(t)\). Letting \(t \rightarrow 0\), we have \(\pi _{k^{*}} = \tilde{\pi }_{j^{*}}\). Thus, \(\psi _{Q_{k^{*}}}(t) = \psi _{\tilde{Q}_{j^{*}}}(t)\), which in turn implies that \(Q_{k^{*}} = \tilde{Q}_{j^{*}}\) by the uniqueness of the Laplace transform.
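The averaging limit used in this step can be checked numerically. The sketch below approximates \(\frac{1}{2T}\int _{-T}^T\exp (ic\lambda )dc\) with a trapezoidal rule and compares it with the closed form \(\sin (T\lambda )/(T\lambda )\), which is bounded by \(1/(T|\lambda |)\) and therefore vanishes as \(T\rightarrow \infty\) when \(\lambda \ne 0\). The function names, the grid size, and the value \(\lambda = 0.7\) are assumptions chosen for illustration.

```python
import numpy as np

def averaged_exponential(lam, T, num=200_001):
    """Trapezoidal approximation of (1 / 2T) * integral_{-T}^{T} exp(i c lam) dc."""
    c = np.linspace(-T, T, num)
    values = np.exp(1j * c * lam)
    dc = c[1] - c[0]
    # Trapezoidal rule: full weight on interior nodes, half weight on the endpoints.
    integral = dc * (values.sum() - 0.5 * (values[0] + values[-1]))
    return integral / (2.0 * T)

def closed_form(lam, T):
    """Exact value of the average: sin(T lam) / (T lam), or 1 when lam == 0."""
    if lam == 0:
        return 1.0
    return np.sin(T * lam) / (T * lam)

lam = 0.7  # hypothetical nonzero frequency
for T in (10.0, 100.0, 1000.0):
    numeric = averaged_exponential(lam, T)
    print(T, abs(numeric), abs(closed_form(lam, T)), 1.0 / (T * lam))

# With lam = 0 the integrand is identically 1, so the average does not vanish.
print(abs(averaged_exponential(0.0, 50.0)))
```

The case \(\lambda = 0\) is the reason the term \(\pi _{k^{*}} \psi _{Q_{k^{*}}}(t)\) survives the averaging while every term with a nonzero frequency disappears.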
Repeating the prior argument, we obtain \(\pi _{k^*} = \tilde{\pi }_{j^*}\), \({{\varvec{\beta }}}_{k^*} = \tilde{{{\varvec{\beta }}}}_{j^*}\), \(Q_{k^*} = \tilde{Q}_{j^*}\) for \(k^*,j^* \in \{ 1,\ldots ,K \}\) and \(K \le J\). However, if \(K < J\) holds, then some of the \(\tilde{\pi }_j\), \(j = 1,\ldots ,J\), must be 0, which contradicts \(\tilde{\pi }_j > 0, j = 1,\ldots ,J\). Therefore, K and J must be the same.
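The displayed equations that the proof refers to by number are not reproduced in this copy. As a reading aid, the block below writes out one chain of displays that is consistent with the definitions of \(\psi _{Q_k}\) and \(\psi _{\tilde{Q}_j}\) and with the steps described in the text. It is a sketch under stated assumptions, not the authors' typeset equations: the labels (A) to (D) are introduced here, and \(v_k\) and \(\tilde{v}_j\) are assumed to stand for \(\eta ({\varvec{u}};{\varvec{\beta }}_k)-\eta ({\varvec{u}};{\varvec{\beta }}_{k^*})\) and \(\eta ({\varvec{u}};\tilde{{\varvec{\beta }}}_j)-\eta ({\varvec{u}};{\varvec{\beta }}_{k^*})\), respectively.

```latex
% (A) Equality of the two characteristic functions (the step cited as (12)).
\sum_{k=1}^{K} \pi_k \exp\{ i \eta(\varvec{x}; \varvec{\beta}_k) t \} \psi_{Q_k}(t)
  = \sum_{j=1}^{J} \tilde{\pi}_j \exp\{ i \eta(\varvec{x}; \tilde{\varvec{\beta}}_j) t \} \psi_{\tilde{Q}_j}(t)

% (B) After multiplying both sides by \exp\{-i \eta(\varvec{u}; \varvec{\beta}_{k^*}) t\}
%     (the step cited as (13)), with v_k and \tilde{v}_j as assumed above.
\pi_{k^*} \psi_{Q_{k^*}}(t)
  + \sum_{k \ne k^*} \pi_k \exp( i v_k t ) \psi_{Q_k}(t)
  = \sum_{j=1}^{J} \tilde{\pi}_j \exp( i \tilde{v}_j t ) \psi_{\tilde{Q}_j}(t)

% (C) Rescaling the frequencies by c, first for c near 1 and then, by
%     analyticity, for all real c (the step cited as (15)).
\pi_{k^*} \psi_{Q_{k^*}}(t)
  + \sum_{k \ne k^*} \pi_k \exp( i c v_k t ) \psi_{Q_k}(t)
  = \sum_{j=1}^{J} \tilde{\pi}_j \exp( i c \tilde{v}_j t ) \psi_{\tilde{Q}_j}(t)

% (D) Averaging (C) over c in [-T, T] (the step cited as (16)).
\pi_{k^*} \psi_{Q_{k^*}}(t)
  + \sum_{k \ne k^*} \pi_k \psi_{Q_k}(t) \, \frac{1}{2T} \int_{-T}^{T} \exp( i c v_k t ) \, dc
  = \sum_{j=1}^{J} \tilde{\pi}_j \psi_{\tilde{Q}_j}(t) \, \frac{1}{2T} \int_{-T}^{T} \exp( i c \tilde{v}_j t ) \, dc

% Closed form of the averaged exponential for \lambda \ne 0.
\frac{1}{2T} \int_{-T}^{T} \exp( i c \lambda ) \, dc = \frac{\sin(T \lambda)}{T \lambda}
  \longrightarrow 0 \quad \text{as } T \to \infty
```

Under this reading, every averaged term in (D) with a nonzero frequency vanishes as \(T\rightarrow \infty\) for \(t \ne 0\), while a term with \(\tilde{v}_{j^*}=0\) keeps its full weight. That is how the averaging isolates \(\pi _{k^{*}} \psi _{Q_{k^{*}}}(t)\) on the left and \(\tilde{\pi }_{j^{*}} \psi _{\tilde{Q}_{j^{*}}}(t)\) on the right.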
About this article
Cite this article
Oh, S., Seo, B. Semiparametric mixture of linear regressions with nonparametric Gaussian scale mixture errors. Adv Data Anal Classif 18, 5–31 (2024). https://doi.org/10.1007/s11634-023-00570-6
DOI: https://doi.org/10.1007/s11634-023-00570-6