Semiparametric mixture of linear regressions with nonparametric Gaussian scale mixture errors

  • Regular Article
  • Published in Advances in Data Analysis and Classification

Abstract

In finite mixtures of regression models, a normality assumption is typically adopted for the errors of each regression component. Though this common assumption is theoretically and computationally convenient, it often produces inefficient and undesirable estimates, which undermines the applicability of the model, particularly in the presence of outliers. To mitigate these defects, we propose using nonparametric Gaussian scale mixture distributions for the component error distributions. In this way, we can lessen the risk of misspecification and obtain robust estimators. In this paper, we study the identifiability of the proposed model and develop a feasible estimation algorithm. Numerical studies, including simulations and a real data analysis, are also presented to demonstrate the performance of the proposed method.


Acknowledgements

This paper is based on a part of Sangkon Oh’s doctoral thesis. The authors wish to thank the Associate Editor and two referees for their valuable comments and suggestions. The research of Byungtae Seo is supported by the National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) (NRF-2022R1A2C1006462).

Author information

Corresponding author

Correspondence to Byungtae Seo.

Ethics declarations

Conflict of interest

The authors have declared no conflict of interest.

Appendix: Proof of Theorem 1

In (3), suppose that there exist \(\tilde{{\varvec{\theta }}}= (\tilde{\pi }_1, \tilde{{\varvec{\beta }}}_1, \ldots , \tilde{\pi }_J, \tilde{{\varvec{\beta }}}_J)\) and \(\tilde{{\varvec{Q}}}= (\tilde{Q}_1, \ldots , \tilde{Q}_J)\) satisfying

$$\begin{aligned} \sum _{k=1}^{K} \pi _k \int \frac{1}{\sigma } \phi \left( \frac{y - \eta ({{\varvec{x}}};{{\varvec{\beta }}}_k)}{\sigma }\right) dQ_k(\sigma ) = \sum _{j=1}^{J} \tilde{\pi }_j \int \frac{1}{\sigma } \phi \left( \frac{y - \eta ({{\varvec{x}}};\tilde{{\varvec{\beta }}}_j)}{\sigma }\right) d\tilde{Q}_j(\sigma ). \end{aligned}$$
(11)

The characteristic function of the left hand side of (11) is

$$\begin{aligned} \psi _{Y \mid {{\varvec{X}}}} (t)&= E_{Y \mid {{\varvec{X}}}} [\exp (iyt)] \\&= \sum _{k=1}^{K} \pi _k \int \int \exp (iyt) \frac{1}{\sigma } \phi \left( \frac{y - \eta ({{\varvec{x}}};{{\varvec{\beta }}}_k)}{\sigma }\right) dy dQ_k(\sigma ) \\&=\sum _{k=1}^{K} \pi _k \exp (i\eta ({{\varvec{x}}};{{\varvec{\beta }}}_k)t) \psi _{Q_k}(t), \end{aligned}$$

where \(i=\sqrt{-1}\) and \(\psi _{Q_k}(t)=\int \exp (-t^2\sigma ^2/2)dQ_k(\sigma )\), \(k=1,\ldots ,K\). The characteristic function of the right hand side of (11) can be similarly represented as

$$\begin{aligned} \tilde{\psi }_{Y \mid {{\varvec{X}}}} (t) = \sum _{j=1}^{J} \tilde{\pi }_j \exp (i\eta ({{\varvec{x}}};\tilde{{{\varvec{\beta }}}}_j)t) \psi _{\tilde{Q}_j}(t), \end{aligned}$$

where \(\psi _{\tilde{Q}_j}(t)=\int \exp (-t^2\sigma ^2/2)d\tilde{Q}_j(\sigma )\), \(j=1,2,\ldots ,J\).

Then the following equality must hold:

$$\begin{aligned} \sum _{k=1}^{K} \pi _k \exp (i\eta ({{\varvec{x}}};{{\varvec{\beta }}}_k)t) \psi _{Q_k}(t) = \sum _{j=1}^{J} \tilde{\pi }_j \exp (i\eta ({{\varvec{x}}};\tilde{{\varvec{\beta }}}_j)t) \psi _{\tilde{Q}_j}(t). \end{aligned}$$
(12)
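
The characteristic function representation behind (12) can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes a linear predictor \(\eta ({{\varvec{x}}};{{\varvec{\beta }}}) = {{\varvec{x}}}^\top {{\varvec{\beta }}}\) and takes each \(Q_k\) to be a discrete distribution with finitely many support points. All function names and parameter values (`mixture_density`, `cf_closed_form`, `cf_numeric`, `X0`, `PIS`, `BETAS`, `SIGMAS`, `WEIGHTS`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def normal_pdf(z):
    """Standard normal density phi(z)."""
    return np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)


def mixture_density(y, x, pis, betas, sigmas, weights):
    """Conditional density of y given x, as on the left-hand side of (11).

    Assumptions (illustrative only): eta(x; beta) = x @ beta, and each Q_k
    is discrete with support sigmas[k] and masses weights[k].
    """
    y = np.asarray(y, dtype=float)
    dens = np.zeros_like(y)
    for pi_k, beta_k, s_k, w_k in zip(pis, betas, sigmas, weights):
        eta = float(x @ beta_k)
        for s, w in zip(s_k, w_k):
            dens += pi_k * w * normal_pdf((y - eta) / s) / s
    return dens


def cf_closed_form(t, x, pis, betas, sigmas, weights):
    """sum_k pi_k * exp(i * eta_k * t) * psi_{Q_k}(t) for a discrete Q_k."""
    total = 0.0 + 0.0j
    for pi_k, beta_k, s_k, w_k in zip(pis, betas, sigmas, weights):
        eta = float(x @ beta_k)
        psi_q = np.sum(np.asarray(w_k) * np.exp(-0.5 * t ** 2 * np.asarray(s_k) ** 2))
        total += pi_k * np.exp(1j * eta * t) * psi_q
    return total


def cf_numeric(t, x, pis, betas, sigmas, weights, lo=-60.0, hi=60.0, n=200001):
    """E[exp(i*t*Y) | x] approximated by a Riemann sum over a fine grid."""
    y = np.linspace(lo, hi, n)
    dy = y[1] - y[0]
    f = mixture_density(y, x, pis, betas, sigmas, weights)
    return np.sum(np.exp(1j * t * y) * f) * dy


# Hypothetical two-component example.
X0 = np.array([1.0, 0.5])                                # intercept and one covariate
PIS = [0.4, 0.6]                                         # mixing proportions
BETAS = [np.array([0.0, 2.0]), np.array([3.0, -1.0])]    # regression coefficients
SIGMAS = [[0.5, 3.0], [1.0]]                             # support points of Q_1, Q_2
WEIGHTS = [[0.8, 0.2], [1.0]]                            # masses of Q_1, Q_2

if __name__ == "__main__":
    for t in (0.3, 1.0, 2.0):
        a = cf_numeric(t, X0, PIS, BETAS, SIGMAS, WEIGHTS)
        b = cf_closed_form(t, X0, PIS, BETAS, SIGMAS, WEIGHTS)
        print(t, abs(a - b))
```

With a discrete \(Q_k\) the integral \(\int \exp (-t^2\sigma ^2/2)dQ_k(\sigma )\) reduces to a finite weighted sum, which is why the closed form needs no quadrature. The grid limits in `cf_numeric` are an assumption suited to the scales above and would need widening for heavier-tailed choices of \(Q_k\).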

Because \({{\varvec{\beta }}}_{k_1}\ne {{\varvec{\beta }}}_{k_2}\) if \(k_1\ne k_2\), there exists an open set \(U =\{{{\varvec{u}}}\in \mathbf {\mathcal {X}} |\eta ({\varvec{u}};{{\varvec{\beta }}}_{k_1}) \ne \eta ({\varvec{u}};{{\varvec{\beta }}}_{k_2})\}\), where \(\mathbf {\mathcal {X}}\) is the support of \({{\varvec{X}}}\). For any fixed \(k^{*} \in \{1, 2, \cdots , K \}\), multiplying both sides of equation (12) by \(\exp (-i\eta ({{\varvec{u}}};{{\varvec{\beta }}}_{k^*})t)\), we have

$$\begin{aligned}&\pi _{k^{*}} \psi _{Q_{k^{*}}}(t) + \sum _{k \ne k^{*}}^{} \pi _k \exp (i(\eta ({{\varvec{u}}};{{\varvec{\beta }}}_k) - \eta ({{\varvec{u}}};{{\varvec{\beta }}}_{k^*}))t) \psi _{Q_k}(t) \nonumber \\&\quad = \sum _{j=1}^{J} \tilde{\pi }_j \exp (i({\eta }({{\varvec{u}}};\tilde{{\varvec{\beta }}}_j) - \eta ({{\varvec{u}}};{{\varvec{\beta }}}_{k^*}))t) \psi _{\tilde{Q}_j}(t) \end{aligned}$$
(13)

for all \({{\varvec{u}}}\in U\).

Assume that \(\eta ({{\varvec{u}}};\tilde{{\varvec{\beta }}}_j) - \eta ({{\varvec{u}}};{{\varvec{\beta }}}_{k^*}) \ne 0\) for all \(j = 1, \dots , J\), and let H be an open subset of \(\{{{\varvec{u}}}\in U| \eta ({{\varvec{u}}};\tilde{{\varvec{\beta }}}_j) \ne \eta ({{\varvec{u}}};{{\varvec{\beta }}}_{k^*}), j=1,\ldots ,J \}\). Then (13) holds for all \({{\varvec{u}}}\in H\). Because \(\eta (\cdot )\) is a continuous function, there exist open sets V and \(\tilde{V}\) in \(\mathbb {R}\) satisfying

$$\begin{aligned} \pi _{k^{*}} \psi _{Q_{k^{*}}}(t) + \sum _{k \ne k^{*}}^{} \pi _k \exp (iv_kt) \psi _{Q_k}(t) = \sum _{j=1}^{J} \tilde{\pi }_j \exp (i\tilde{v}_jt) \psi _{\tilde{Q}_j}(t), \end{aligned}$$
(14)

for all \(v_k \in V\) for \(k \in \{1,2,\ldots , K \} {\setminus } \{k^*\}\) and \(\tilde{v}_j \in \tilde{V}\) for \(j \in \{1,2,\ldots , J \}\). Because V and \(\tilde{V}\) are open sets, there also exists \(\xi >0\) such that

$$\begin{aligned} \pi _{k^{*}} \psi _{Q_{k^{*}}}(t) + \sum _{k \ne k^{*}}^{} \pi _k \exp (icv_kt) \psi _{Q_k}(t) = \sum _{j=1}^{J} \tilde{\pi }_j \exp (ic\tilde{v}_jt) \psi _{\tilde{Q}_j}(t) \end{aligned}$$
(15)

holds for all \(c\in (1-\xi ,1+\xi )\). Because the exponential function is analytic on the whole complex plane, if (15) holds for \(c\in (1-\xi ,1+\xi )\), (15) should also hold for all \(-\infty<c<\infty\). Hence, the following equality should also hold:

$$\begin{aligned}&\pi _{k^{*}} \psi _{Q_{k^{*}}}(t) + \sum _{k \ne k^{*}}^{} \pi _k \Bigg \{ \frac{1}{2T}\int _{-T}^T\exp (icv_kt)dc\Bigg \} \psi _{Q_k}(t) \nonumber \\&\quad = \sum _{j=1}^{J} \tilde{\pi }_j \Bigg \{ \frac{1}{2T}\int _{-T}^T \exp (ic\tilde{v}_jt)dc \Bigg \} \psi _{\tilde{Q}_j}(t), \end{aligned}$$
(16)

for all \(T>0\). Letting \(T\rightarrow \infty\) with \(t \ne 0\) fixed, we can conclude \(\pi _{k^{*}} \psi _{Q_{k^{*}}}(t)=0\) from \(\lim _{T\rightarrow \infty }\frac{1}{2T}\int _{-T}^T\exp (ic\lambda )dc=0\) for any \(\lambda \ne 0\). Further, letting \(t \rightarrow 0\) and using \(\psi _{Q_{k^{*}}}(t) \rightarrow 1\) implies \(\pi _{k^*}=0\), which contradicts the assumption that \(\pi _{k^*}>0\). Therefore, there must exist \(j^{*} \in \{1, 2, \ldots, J \}\) such that \(\eta ({\varvec{u}};\tilde{{\varvec{\beta }}}_{j^{*}}) = \eta ({\varvec{u}};{{\varvec{\beta }}}_{k^{*}})\), implying \(\tilde{{\varvec{\beta }}}_{j^{*}}={{\varvec{\beta }}}_{k^{*}}\). Accordingly, (16) yields \(\pi _{k^{*}} \psi _{Q_{k^{*}}}(t) = \tilde{\pi }_{j^{*}} \psi _{\tilde{Q}_{j^{*}}}(t)\). Letting \(t \rightarrow 0\), we have \(\pi _{k^{*}} = \tilde{\pi }_{j^{*}}\). Thus, \(\psi _{Q_{k^{*}}}(t) = \psi _{\tilde{Q}_{j^{*}}}(t)\) is obtained, which in turn implies that \(Q_{k^{*}} = \tilde{Q}_{j^{*}}\) by the uniqueness of the Laplace transform.
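
The averaging limit used in passing from (16) to the conclusion can be written out explicitly. As a short worked computation (a sketch of the standard calculation, with \(\lambda\) standing for a nonzero phase such as \(v_kt\) or \(\tilde{v}_jt\)):

$$\begin{aligned} \frac{1}{2T}\int _{-T}^{T}\exp (ic\lambda )\,dc = \frac{\exp (iT\lambda ) - \exp (-iT\lambda )}{2iT\lambda } = \frac{\sin (T\lambda )}{T\lambda }, \qquad \left| \frac{\sin (T\lambda )}{T\lambda }\right| \le \frac{1}{T|\lambda |} \rightarrow 0 \quad (T\rightarrow \infty ). \end{aligned}$$

For \(\lambda = 0\) the average equals 1 for every \(T\), so only the terms whose phase vanishes survive the limit. This is what isolates \(\pi _{k^{*}} \psi _{Q_{k^{*}}}(t)\) on the left and the matching \(j^{*}\) term on the right.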

Repeating this argument for each \(k^* \in \{ 1,\ldots ,K \}\), we obtain \(\pi _{k^*} = \tilde{\pi }_{j^*}\), \({{\varvec{\beta }}}_{k^*} = \tilde{{{\varvec{\beta }}}}_{j^*}\), \(Q_{k^*} = \tilde{Q}_{j^*}\) for a matching index \(j^*\), and hence \(K \le J\). However, if \(K < J\) held, the matched weights would already sum to one, so some of the \(\tilde{\pi }_j\), \(j = 1,\ldots ,J\), would have to be 0, contradicting \(\tilde{\pi }_j > 0, j = 1,\ldots ,J\). Therefore, K and J must be the same.

About this article

Cite this article

Oh, S., Seo, B. Semiparametric mixture of linear regressions with nonparametric Gaussian scale mixture errors. Adv Data Anal Classif 18, 5–31 (2024). https://doi.org/10.1007/s11634-023-00570-6

  • DOI: https://doi.org/10.1007/s11634-023-00570-6
