Robust Mixture Regression Using Mixture of Different Distributions

  • Conference paper
  • First Online:
Recent Advances in Robust Statistics: Theory and Applications

Abstract

In this paper, we examine mixture regression models based on mixtures of different types of distributions. In particular, we consider two-component mixtures of normal and t distributions, and of skew t and skew normal distributions. We obtain the maximum likelihood (ML) estimators of the parameters of interest using the expectation-maximization (EM) algorithm, and we present a simulation study and real data examples to illustrate the performance of the proposed estimators.



Acknowledgments

The authors would like to thank the referees and the editor for their constructive comments and suggestions, which have considerably improved this work. The first author would like to thank the Higher Education Council of Turkey for providing financial support for her Ph.D. study at Ankara University. The second author would like to thank the European Commission-JRC and the Indian Statistical Institute for providing financial support to attend ICORS 2015 in Kolkata, India.

Correspondence to Fatma Zehra Doğru.

Appendix

To obtain the conditional expectation of the complete data log-likelihood function given in (8), the following conditional expectations must be calculated given \(y_j\) and the current parameter estimate \(\varvec{\hat{\varTheta }}=({\varvec{\hat{\beta }}}_{1},\hat{\sigma }_{1}^{2},{\varvec{\hat{\beta }}}_{2},\hat{\sigma }_{2}^{2},\hat{\nu })\):

$$\begin{aligned} \hat{z}_{j}= & {} E(z_{j}|y_{j},\varvec{\hat{\varTheta }}) =\frac{ \hat{w}\phi ( y_{j};\mathbf {x}_{j}^{{\prime }} \hat{{\varvec{\beta }}}_{1},\hat{\sigma }_{1}^{2}) }{ \hat{w}\phi ( y_{j};\mathbf {x}_{j}^{{\prime }} \hat{{\varvec{\beta }}}_{1},\hat{\sigma }_{1}^{2}) +( 1-\hat{w}) f_{t}( y_{j};\mathbf {x}_{j}^{{\prime }}\hat{{\varvec{\beta }}}_{2},\hat{\sigma }_{2}, \hat{\nu }) } , \end{aligned}$$
(36)
$$\begin{aligned} \hat{u}_{1j}= & {} E( u_{j}|y_{j},\varvec{\hat{\varTheta }}) =\frac{ \hat{\nu }+1}{\hat{\nu }+\left( \left( y_{j}-\mathbf {x} _{j}^{^{\prime }}{\varvec{\hat{\beta }}}_{2}\right) /\hat{\sigma } _{2}\right) ^{2}} , \end{aligned}$$
(37)
$$\begin{aligned} \hat{u}_{2j}= & {} E(\log u_{j}|y_{j},\varvec{\hat{\varTheta }}) =DG\left( \frac{\hat{\nu }+1}{2}\right) -\log \left( \frac{\hat{\nu }}{2} +\frac{\left( y_{j}-\mathbf {x}_{j}^{{\prime }}{\varvec{\hat{\beta }}}_{2}\right) ^{2}}{2\hat{\sigma }_{2}^{2}}\right) . \end{aligned}$$
(38)

These conditional expectations will be used in the EM algorithm given in Sect. 2.1.
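As a minimal sketch, the E-step quantities (36)-(38) for the normal-t mixture can be evaluated with NumPy and SciPy. The function name, argument layout, and test values are illustrative, not from the chapter; the formulas follow the equations above, with \(f_t\) the t density with scale \(\hat\sigma_2\) and \(DG\) the digamma function.

```python
import numpy as np
from scipy.stats import norm, t
from scipy.special import digamma


def e_step_normal_t(y, X, w, beta1, sigma1, beta2, sigma2, nu):
    """E-step quantities (36)-(38) for the two-component normal-t
    mixture regression model (illustrative sketch)."""
    r2 = y - X @ beta2                      # residuals of the t component
    # component densities: normal with scale sigma1, t with scale sigma2
    comp1 = w * norm.pdf(y, loc=X @ beta1, scale=sigma1)
    comp2 = (1 - w) * t.pdf(r2 / sigma2, df=nu) / sigma2
    z_hat = comp1 / (comp1 + comp2)         # Eq. (36): posterior weight
    u1_hat = (nu + 1) / (nu + (r2 / sigma2) ** 2)          # Eq. (37)
    u2_hat = digamma((nu + 1) / 2) - np.log(
        nu / 2 + r2 ** 2 / (2 * sigma2 ** 2))              # Eq. (38)
    return z_hat, u1_hat, u2_hat
```

These arrays feed directly into the M-step weighted least-squares updates of the component regression coefficients.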

Similarly, to obtain the conditional expectation of the complete data log-likelihood function given in (21), the following expectations must be computed given \(y_j\) and the current parameter estimate \(\varvec{\hat{\varTheta }}=({\varvec{\hat{\beta }}}_{1},\hat{\sigma }_{1}^{2},\hat{\lambda }_{1},\hat{\nu },{\varvec{\hat{\beta }}}_{2},\hat{\sigma }_{2}^{2},\hat{\lambda }_{2})\):

$$\begin{aligned} \hat{z}_{j}= & {} E(z_{j}|y_{j},\varvec{\hat{\varTheta }}) =\frac{ \hat{w}f_{ST}(y_{j};\mathbf {x}_{j}^{{\prime }}\hat{{\varvec{\beta }}}_{1},\hat{\sigma }_{1}^{2},\hat{\lambda }_{1},\hat{\nu }) }{\hat{w}f_{ST}( y_{j};\mathbf {x}_{j}^{{\prime }}\hat{{\varvec{\beta }}}_{1},\hat{\sigma }_{1}^{2},\hat{\lambda }_{1},\hat{\nu }) +( 1-\hat{w})f_{SN}(y_{j};\mathbf {x}_{j}^{{\prime }}\hat{{\varvec{\beta }}}_{2},\hat{\sigma }_{2}^{2},\hat{\lambda }_{2}) } , \end{aligned}$$
(39)
$$\begin{aligned} \hat{s}_{1j}= & {} E(z_{j}\tau _{j}|y_{j},\varvec{\hat{\varTheta }}) = \hat{z}_{j}\bigg ( \frac{\hat{\nu }+1}{\hat{\eta }_{1j}^{2}+ \hat{\nu }}\bigg ) \frac{T_{\hat{\nu }+3}\bigg ( \hat{M}_{j}\sqrt{ \frac{\hat{\nu }+3}{\hat{\nu }+1}}\bigg ) }{T_{\hat{\nu } +1}(\hat{M}_{j}) } , \end{aligned}$$
(40)
$$\begin{aligned} \hat{s}_{2j}= & {} E(z_{j}\gamma _{j}\tau _{j}|y_{j},\varvec{\hat{\varTheta }} ) =\frac{\hat{\delta }_{\lambda _{1}}( y_{j}-\mathbf {x}_{j}^{^{\prime }}\hat{{\varvec{\beta }}} _{1}) \hat{s}_{1j}}{\hat{\sigma }_{1}} +\frac{\hat{z}_{j}\sqrt{1-\hat{\delta } _{\lambda _{1}}^{2}}}{\pi \hat{\sigma }_{1}\hat{f}(y_{j})}\bigg ( \frac{\hat{\eta }_{1j}^{2}}{\hat{\nu }(1-\hat{\delta }_{\lambda _{1}}^{2})}+1\bigg ) ^{-(\frac{\hat{\nu }}{2}+1) }, \end{aligned}$$
(41)
$$\begin{aligned} \hat{s}_{3j}= & {} E(z_{j}\gamma _{j}^{2}\tau _{j}|y_{j},\varvec{\hat{\varTheta }}) =\hat{\delta }_{\lambda _{1}}^{2}\bigg ( \frac{y_{j}- \mathbf {x}_{j}^{^{\prime }}\hat{{\varvec{\beta }}}_{1}}{\hat{ \sigma }_{1}}\bigg ) ^{2}\hat{s}_{1j}+\hat{z}_{j}\bigg \{ (1-\hat{ \delta }_{\lambda _{1}}^{2}) \nonumber \\+ & {} \frac{\hat{\delta }_{\lambda _{1}}( y_{j}-\mathbf {x} _{j}^{^{\prime }}\hat{{\varvec{\beta }}}_{1}) \sqrt{1-\hat{ \delta }_{\lambda _{1}}^{2}}}{\pi \hat{\sigma }_{1}^{2}\hat{f}(y_{j}) }\bigg ( \frac{\hat{\eta }_{1j}^{2}}{\hat{\nu }(1-\hat{\delta } _{\lambda _{1}}^{2})}+1\bigg ) ^{-(\frac{\hat{\nu }}{2}+1)}\bigg \} , \end{aligned}$$
(42)
$$\begin{aligned} \hat{s}_{4j}= & {} E( z_{j}\log (\tau _{j})|y_{j},\varvec{\hat{\varTheta }} ) =\hat{z}_{j}\bigg \{ DG\bigg ( \frac{\hat{\nu }+1}{2}\bigg ) -\log \bigg ( \frac{\hat{\eta }_{1j}^{2}+\hat{\nu }}{2}\bigg ) \nonumber \\+ & {} \bigg ( \frac{\hat{\nu }+1}{\hat{\eta }_{1j}^{2}+\hat{\nu }} \bigg ) \left( \frac{T_{\hat{\nu }+3}\bigg ( \hat{\lambda }_{1} \hat{\eta }_{1j}\sqrt{\frac{\hat{\nu }+3}{\hat{\nu }+\hat{ \eta }_{1j}^{2}}}\bigg ) }{T_{\hat{\nu }+1}\bigg ( \hat{\lambda }_{1} \hat{\eta }_{1j}\sqrt{\frac{\hat{\nu }+1}{\hat{\nu }+\hat{ \eta }_{1j}^{2}}}\bigg ) }-1\right) \nonumber \\+ & {} \frac{\hat{\lambda }_{1}\hat{\eta }_{1j}(\hat{\eta } _{1j}^{2}-1)}{\sqrt{(\hat{\nu }+1)(\hat{\nu }+\hat{\eta } _{1j}^{2})^{3}}}\frac{t_{\hat{\nu }+1}\bigg ( \hat{\lambda }_{1} \hat{\eta }_{1j}\sqrt{\frac{\hat{\nu }+1}{\hat{\nu }+\hat{ \eta }_{1j}^{2}}}\bigg ) }{T_{\hat{\nu }+1}\bigg ( \hat{\lambda }_{1} \hat{\eta }_{1j}\sqrt{\frac{\hat{\nu }+1}{\hat{\nu }+\hat{ \eta }_{1j}^{2}}}\bigg ) } \nonumber \\+ & {} \frac{1}{T_{\hat{\nu }+1}\bigg ( \hat{\lambda }_{1}\hat{ \eta }_{1j}\sqrt{\frac{\hat{\nu }+1}{\hat{\nu }+\hat{\eta } _{1j}^{2}}}\bigg ) }\int \limits _{-\infty }^{\hat{M}_{j}}g_{\hat{\nu } }(x)t_{\hat{\nu }+1}(x)dx\bigg \} , \end{aligned}$$
(43)
$$\begin{aligned} \hat{t}_{1j}= & {} E( \gamma _{j}|y_{j},\varvec{\hat{\varTheta }}) = \hat{\delta }_{\lambda _{2}}\hat{\eta }_{2j}+\sqrt{1-\hat{\delta }_{\lambda _{2}}^{2}}\frac{\phi \left( \hat{\lambda }_{2}\hat{\eta } _{2j}\right) }{\Phi \left( \hat{\lambda }_{2}\hat{\eta }_{2j}\right) } , \end{aligned}$$
(44)
$$\begin{aligned} \hat{t}_{2j}= & {} E(\gamma _{j}^{2}|y_{j},\varvec{\hat{\varTheta }}) =1-\hat{\delta }_{\lambda _{2}}^{2}+\hat{\delta }_{\lambda _{2}} \hat{\eta }_{2j}\hat{t}_{1j} , \end{aligned}$$
(45)

where

$$\begin{aligned} \hat{\eta }_{1j}= & {} \frac{(y_{j}-\mathbf {x}_{j}^{{\prime }}\hat{{\varvec{\beta }}}_{1})}{\hat{\sigma }_{1}},\quad \hat{\delta }_{\lambda _{1}}=\frac{\hat{\lambda }_{1}}{\sqrt{1+\hat{\lambda }_{1}^{2}}}, \\ \hat{\eta }_{2j}= & {} \frac{(y_{j}-\mathbf {x}_{j}^{{\prime }}\hat{{\varvec{\beta }}}_{2})}{\hat{\sigma }_{2}},\quad \hat{\delta }_{\lambda _{2}}=\frac{\hat{\lambda }_{2}}{\sqrt{1+\hat{\lambda }_{2}^{2}}},\quad \hat{M}_{j} =\hat{\lambda }_{1}\hat{\eta }_{1j}\sqrt{\frac{\hat{\nu }+1}{\hat{\nu }+\hat{\eta }_{1j}^{2}}},\\ g_{\hat{\nu }} (x)= & {} DG\bigg (\frac{\hat{\nu }+2}{2}\bigg )-DG\bigg (\frac{\hat{\nu }+1}{2}\bigg )-\log \bigg (1+\frac{x^2}{\hat{\nu }+1}\bigg )+\frac{x^2(\hat{\nu }+1)-\hat{\nu }-1}{(\hat{\nu }+1)(\hat{\nu }+1+x^2)},\\ \hat{f} (y_{j})= & {} \hat{w}\frac{2}{\hat{\sigma }_{1}}t_{\hat{\nu }}(\hat{\eta }_{1j})T_{\hat{\nu }+1}(\hat{M}_{j})+(1-\hat{w})\frac{2}{\hat{\sigma }_{2}}\phi (\hat{\eta }_{2j})\Phi (\hat{\lambda }_{2}\hat{\eta }_{2j}). \end{aligned}$$

These conditional expectations will be used in the EM algorithm given in Sect. 2.2.
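The simpler of these quantities can likewise be sketched in code. The function below computes the helper quantities \(\hat\eta_{1j}, \hat\eta_{2j}, \hat\delta_{\lambda_1}, \hat\delta_{\lambda_2}, \hat{M}_j\), the mixture density \(\hat{f}(y_j)\), and the closed-form expectations (39), (44), and (45); the heavier expressions (40)-(43) are omitted. The function name and test values are illustrative, and \(\hat{M}_j\) uses the standard skew t form \(\hat\lambda_1\hat\eta_{1j}\sqrt{(\hat\nu+1)/(\hat\nu+\hat\eta_{1j}^2)}\).

```python
import numpy as np
from scipy.stats import norm, t


def skew_t_sn_quantities(y, X, w, beta1, sigma1, lam1, nu,
                         beta2, sigma2, lam2):
    """Selected E-step quantities for the two-component skew t /
    skew normal mixture regression model (illustrative sketch)."""
    eta1 = (y - X @ beta1) / sigma1
    eta2 = (y - X @ beta2) / sigma2
    delta2 = lam2 / np.sqrt(1 + lam2 ** 2)
    M = lam1 * eta1 * np.sqrt((nu + 1) / (nu + eta1 ** 2))
    # component densities: skew t (first) and skew normal (second)
    f_st = (2 / sigma1) * t.pdf(eta1, df=nu) * t.cdf(M, df=nu + 1)
    f_sn = (2 / sigma2) * norm.pdf(eta2) * norm.cdf(lam2 * eta2)
    f_mix = w * f_st + (1 - w) * f_sn
    z_hat = w * f_st / f_mix                                 # Eq. (39)
    t1_hat = delta2 * eta2 + np.sqrt(1 - delta2 ** 2) * (
        norm.pdf(lam2 * eta2) / norm.cdf(lam2 * eta2))       # Eq. (44)
    t2_hat = 1 - delta2 ** 2 + delta2 * eta2 * t1_hat        # Eq. (45)
    return z_hat, t1_hat, t2_hat, f_mix
```

A full EM implementation would add the truncated-moment terms (40)-(43), including the one-dimensional numerical integral in (43).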


Copyright information

© 2016 Springer India

Cite this paper

Doğru, F.Z., Arslan, O. (2016). Robust Mixture Regression Using Mixture of Different Distributions. In: Agostinelli, C., Basu, A., Filzmoser, P., Mukherjee, D. (eds) Recent Advances in Robust Statistics: Theory and Applications. Springer, New Delhi. https://doi.org/10.1007/978-81-322-3643-6_4
