Mixtures of multivariate contaminated normal regression models

Mazza, Angelo; Punzo, Antonio

doi:10.1007/s00362-017-0964-y

Mixtures of multivariate contaminated normal regression models

Regular Article
Published: 13 November 2017

Volume 61, pages 787–822, (2020)
Cite this article

Statistical Papers Aims and scope Submit manuscript

926 Accesses
28 Citations
Explore all metrics

Abstract

Mixtures of regression models (MRMs) are widely used to investigate the relationship between variables coming from several unknown latent homogeneous groups. Usually, the conditional distribution of the response in each mixture component is assumed to be (multivariate) normal (MN-MRM). To robustify the approach with respect to possible elliptical heavy-tailed departures from normality, due to the presence of mild outliers, the multivariate contaminated normal MRM is here introduced. In addition to the parameters of the MN-MRM, each mixture component has a parameter controlling the proportion of outliers and one specifying the degree of contamination with respect to the response variable(s). Crucially, these parameters do not have to be specified a priori, adding flexibility to our approach. Furthermore, once the model is estimated and the observations are assigned to the groups, a finer intra-group classification in typical points and (mild) outliers, can be directly obtained. Identifiability conditions are provided, an expectation-conditional maximization algorithm is outlined for parameter estimation, and various implementation and operational issues are discussed. Properties of the estimators of the regression coefficients are evaluated through Monte Carlo experiments and compared with other procedures. The performance of this novel family of models is also illustrated on artificial and real data, with particular emphasis to the application in allometric studies.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Seemingly unrelated clusterwise linear regression

Article 12 August 2019

Parsimonious Mixtures of Seemingly Unrelated Contaminated Normal Regression Models

Robust mixture regression modeling based on scale mixtures of skew-normal distributions

Article 19 July 2015

References

Aitken AC (1926) A series formula for the roots of algebraic and transcendental equations. Proc R Soc Edinb 45(1):14–22
Article MATH Google Scholar
Aitkin M, Wilson GT (1980) Mixture models, outliers, and the EM algorithm. Technometrics 22(3):325–331
Article MATH Google Scholar
Andrews JL, McNicholas PD (2011) Extending mixtures of multivariate $t$-factor analyzers. Stat Comput 21(3):361–373
Article MathSciNet MATH Google Scholar
Andrews JL, McNicholas PD, Subedi S (2011) Model-based classification via mixtures of multivariate $t$-distributions. Comput Stat Data Anal 55:520–529
Article MathSciNet MATH Google Scholar
Baek J, McLachlan GJ (2011) Mixtures of common $t$-factor analyzers for clustering high-dimensional microarray data. Bioinformatics 27(9):1269–1276
Article Google Scholar
Bagnato L, Punzo A (2013) Finite mixtures of unimodal beta and gamma densities and the $k$-bumps algorithm. Comput Stat 28(4):1571–1597
Article MathSciNet MATH Google Scholar
Bagnato L, Punzo A, Zoia MG (2017) The multivariate leptokurtic-normal distribution and its application in model-based clustering. Can J Stat 45(1):95–119
Article MathSciNet Google Scholar
Bai X, Yao W, Boyer JE (2012) Robust fitting of mixture regression models. Comput Stat Data Anal 56(7):2347–2359
Article MathSciNet MATH Google Scholar
Banfield JD, Raftery AE (1993) Model-based Gaussian and non-Gaussian clustering. Biometrics 49(3):803–821
Article MathSciNet MATH Google Scholar
Berkane M, Bentler PM (1988) Estimation of contamination parameters and identification of outliers in multivariate data. Sociol Methods Res 17(1):55–64
Article Google Scholar
Berta P, Ingrassia S, Punzo A, Vittadini G (2016) Multilevel cluster-weighted models for the evaluation of hospitals. METRON 74(3):275–292
Article MathSciNet MATH Google Scholar
Biernacki C, Celeux G, Govaert G (2000) Assessing a mixture model for clustering with the integrated completed likelihood. IEEE Trans Pattern Anal Mach Intell 22(7):719–725
Article Google Scholar
Biernacki C, Celeux G, Govaert G (2003) Choosing starting values for the EM algorithm for getting the highest likelihood in multivariate Gaussian mixture models. Comput Stat Data Anal 41(3–4):561–575
Article MathSciNet MATH Google Scholar
Böhning D (1999) Computer Assisted Analysis of Mixtures and Applications: Meta Analysis, Disease Mapping, and Others, Chapman & Hall/CRC Monographs on Statistics & Applied Probability, vol 81. Taylor & Francis
Böhning D, Dietz E, Schaub R, Schlattmann P, Lindsay B (1994) The distribution of the likelihood ratio for mixtures of densities from the one-parameter exponential family. Ann Inst Stat Math 46(2):373–388
Article MATH Google Scholar
Browne RP, Subedi S, McNicholas PD (2013) Constrained optimization for a subset of the Gaussian parsimonious clustering models. http://arxiv.org/abs/1306.5824
Campbell NA, Mahon RJ (1974) A multivariate study of variation in two species of rock crab of genus Leptograpsus. Aust J Zool 22(3):417–425
Article Google Scholar
Celeux G, Govaert G (1995) Gaussian parsimonious clustering models. Pattern Recogn 28(5):781–793
Article Google Scholar
Celeux G, Hurn M, Robert CP (2000) Computational and inferential difficulties with mixture posterior distributions. J Am Stat Assoc 95(451):957–970
Article MathSciNet MATH Google Scholar
Clarke BR, Davidson T, Hammarstrand R (2017) A comparison of the $l_2$ minimum distance estimator and the em-algorithm when fitting $k$-component univariate normal mixtures. Stat Papers pp 1–20 https://doi.org/10.1007/s00362-016-0747-x
Cuesta-Albertos JA, Gordaliza A, Matrán C (1997) Trimmed $k$-means: an attempt to robustify quantizers. Ann Stat 25(2):553–576
Article MathSciNet MATH Google Scholar
Dang UJ, McNicholas PD (2015) Families of parsimonious finite mixtures of regression models. In: Morlini I, Minerva T, Vichi M (eds) Advances in Statistical Models for Data Analysis. Studies in Classification, Data Analysis and Knowledge Organization. Springer, Switzerland pp 73–84
Dang UJ, Browne RP, McNicholas PD (2015) Mixtures of multivariate power exponential distributions. Biometrics 71(4):1081–1089
Article MathSciNet MATH Google Scholar
Dang UJ, Punzo A, McNicholas PD, Ingrassia S, Browne RP (2017) Multivariate response and parsimony for Gaussian cluster-weighted models. J Classif 34(1):4–34
Article MathSciNet MATH Google Scholar
Davies L, Gather U (1993) The identification of multiple outliers. J Am Stat Assoc 88(423):782–792
Article MathSciNet MATH Google Scholar
Dayton CM, Macready GB (1988) Concomitant-variable latent-class models. J Am Stat Assoc 83(401):173–178
Article MathSciNet Google Scholar
Dempster A, Laird N, Rubin D (1977) Maximum likelihood from incomplete data via the EM algorithm. J Roy Stat Soc: Ser B 39(1):1–38
MathSciNet MATH Google Scholar
Depraetere N, Vandebroek M (2014) Order selection in finite mixtures of linear regressions. Stat Pap 55(3):871–911
Article MATH Google Scholar
DeSarbo WS, Cron WL (1988) A maximum likelihood methodology for clusterwise linear regression. J Classif 5(2):249–282
Article MathSciNet MATH Google Scholar
Fraley C, Raftery AE, Murphy TB, Scrucca L (2012) mclust version 4 for R: Normal mixture modeling for model-based clustering, classification, and density estimation. Technical report 597, Department of Statistics, University of Washington, Seattle, Washington
Frühwirth-Schnatter S (2006) Finite mixture and Markov switching models. Springer, New York
MATH Google Scholar
Galimberti G, Soffritti G (2014) A multivariate linear regression analysis using finite mixtures of $t$ distributions. Comput Stat Data Anal 71:138–150
Article MathSciNet MATH Google Scholar
García-Escudero LA, Gordaliza A, Mayo-Iscar A, San Martín R (2010) Robust clusterwise linear regression through trimming. Comput Stat Data Anal 54(12):3057–3069
Article MathSciNet MATH Google Scholar
Golam Kibria BM, Safiul Haq M (1999) The multivariate linear model with multivariate $t$ and intra-class covariance structure. Stat Pap 40(3):263–276
Article MathSciNet MATH Google Scholar
Gómez E, Gómez-Viilegas MA, Marin JM (1998) A multivariate generalization of the power exponential family of distributions. Commun Stat Theory Methods 27(3):589–600
Article MathSciNet MATH Google Scholar
Greselin F, Punzo A (2013) Closed likelihood ratio testing procedures to assess similarity of covariance matrices. Am Stat 67(3):117–128
Article MathSciNet Google Scholar
Grün B, Leisch F (2008) FlexMix version 2: Finite mixtures with concomitant variables and varying and constant parameters. J Stat Softw 28(4):1–35
Article Google Scholar
Hartigan JA (1985) Statistical theory in clustering. J Classif 2(1):63–76
Article MathSciNet Google Scholar
Hastie T, Tibshirani R (1996) Discriminant analysis by Gaussian mixtures. J Roy Stat Soc 58(1):155–176
MathSciNet MATH Google Scholar
Hennig C (2000) Identifiablity of models for clusterwise linear regression. J Classif 17(2):273–296
Article MathSciNet MATH Google Scholar
Hennig C (2004) Breakdown points for maximum likelihood estimators of location-scale mixtures. Ann Stat 32(4):1313–1340
Article MathSciNet MATH Google Scholar
Ingrassia S (2004) A likelihood-based constrained algorithm for multivariate normal mixture models. Stat Methods Appl 13(2):151–166
Article MathSciNet Google Scholar
Ingrassia S, Punzo A (2016) Decision boundaries for mixtures of regressions. J Korean Stat Soc 45(2):295–306
Article MathSciNet MATH Google Scholar
Ingrassia S, Rocci R (2007) Constrained monotone em algorithms for finite mixture of multivariate Gaussians. Comput Stat Data Anal 51(11):5339–5351
Article MathSciNet MATH Google Scholar
Ingrassia S, Minotti SC, Punzo A (2014) Model-based clustering via linear cluster-weighted models. Comput Stat Data Anal 71:159–182
Article MathSciNet MATH Google Scholar
Ingrassia S, Punzo A, Vittadini G, Minotti SC (2015) The generalized linear mixed cluster-weighted model. J Classif 32(1):85–113
Article MathSciNet MATH Google Scholar
Jiang W, Tanner MA (1999) Hierarchical mixtures-of-experts for exponential family regression models: approximation and maximum likelihood estimation. Ann Stat 27(3):987–1011
Article MathSciNet MATH Google Scholar
Karlis D, Xekalaki E (2003) Choosing initial values for the EM algorithm for finite mixtures. Comput Stat Data Anal 41(3–4):577–590
Article MathSciNet MATH Google Scholar
Karlsson M, Laitila T (2014) Finite mixture modeling of censored regression models. Stat Pap 55(3):627–642
Article MathSciNet MATH Google Scholar
Klingenberg CP (1996) Multivariate allometry. Advances in Morphometrics. Springer, New York pp 23–49
Knoebel BR, Burkhart HE (1991) A bivariate distribution approach to modeling forest diameter distributions at two points in time. Biometrics 47(1):241–253
Article Google Scholar
Lachos VH, Angolini T, Abanto-Valle CA (2011) On estimation and local influence analysis for measurement errors models under heavy-tailed distributions. Stat Pap 52(3):567–590
Article MathSciNet MATH Google Scholar
Lamont AE, Vermunt JK, Van Horn ML (2016) Regression mixture models: Does modeling the covariance between independent variables and latent classes improve the results? Multivar Behav Res 51(1):35–52
Article Google Scholar
Lange KL, Little RJA, Taylor JMG (1989) Robust statistical modeling using the $t$ distribution. J Am Stat Assoc 84(408):881–896
MathSciNet Google Scholar
Leisch F (2004) FlexMix: A general framework for finite mixture models and latent class regression in R. J Stat Softw 11(8):1–18
Article Google Scholar
Little RJA (1988) Robust estimation of the mean and covariance matrix from data with missing values. Appl Stat 37(1):23–38
Article MATH Google Scholar
Maruotti A, Punzo A (2017) Model-based time-varying clustering of multivariate longitudinal data with covariates and outliers. Comput Stat Data Anal 113(4):475–496
Article MathSciNet MATH Google Scholar
Mazza A, Punzo A, Ingrassia S (2015) flexCWM: Flexible Cluster-Weighted Modeling. http://cran.r-project.org/web/packages/flexCWM/index.html
Mazza A, Punzo A, Ingrassia S (2018) flexCWM. A flexible framework for cluster-weighted models. J Stat Softw pp 1–29
McLachlan G, Krishnan T (2007) The EM algorithm and extensions, Wiley Series in Probability and Statistics, vol 382, 2nd edn. Wiley, New York
McLachlan GJ, Peel D (2000) Finite Mixture Models. Wiley, New York
Book MATH Google Scholar
McNicholas PD (2010) Model-based classification using latent Gaussian mixture models. J Stat Plan Inference 140(5):1175–1181
Article MathSciNet MATH Google Scholar
McNicholas PD, Subedi S (2012) Clustering gene expression time course data using mixtures of multivariate $t$-distributions. J Stat Plan Inference 142(5):1114–1127
Article MathSciNet MATH Google Scholar
McNicholas PD, Murphy TB, McDaid AF, Frost D (2010) Serial and parallel implementations of model-based clustering via parsimonious Gaussian mixture models. Comput Stat Data Anal 54(3):711–723
Article MathSciNet MATH Google Scholar
Meng XL, Rubin DB (1993) Maximum likelihood estimation via the ECM algorithm: a general framework. Biometrika 80(2):267–278
Article MathSciNet MATH Google Scholar
Neykov N, Filzmoser P, Dimova R, Neytchev P (2007) Robust fitting of mixtures using the trimmed likelihood estimator. Comput Stat Data Anal 52(1):299–308
Article MathSciNet MATH Google Scholar
Niu X, Li P, Zhang P (2016) Testing homogeneity in a scale mixture of normal distributions. Stat Pap 57(2):499–516
Article MathSciNet MATH Google Scholar
Peel D, McLachlan GJ (2000) Robust mixture modelling using the $t$ distribution. Stat Comput 10(4):339–348
Article Google Scholar
Punzo A (2014) Flexible mixture modeling with the polynomial Gaussian cluster-weighted model. Stat Model 14(3):257–291
Article MathSciNet Google Scholar
Punzo A, Ingrassia S (2015) Parsimonious generalized linear Gaussian cluster-weighted models. In: Morlini I, Minerva T, Vichi M (eds). Advances in Statistical Models for Data Analysis. Studies in Classification, Data Analysis and Knowledge Organization. Springer International Publishing, Switzerland, pp 201–209
Punzo A, Ingrassia S (2016) Clustering bivariate mixed-type data via the cluster-weighted model. Comput Stat 31(3):989–1013
Article MathSciNet MATH Google Scholar
Punzo A, Maruotti A (2016) Clustering multivariate longitudinal observations: The contaminated Gaussian hidden Markov model. J Comput Gr Stat 25(4):1097–1116
Article MathSciNet Google Scholar
Punzo A, McNicholas PD (2016) Parsimonious mixtures of multivariate contaminated normal distributions. Biometr J 58(6):1506–1537
Article MathSciNet MATH Google Scholar
Punzo A, McNicholas PD (2017) Robust clustering in regression analysis via the contaminated Gaussian cluster-weighted model. J Classif 34(2):249–293
Article MathSciNet MATH Google Scholar
Punzo A, Browne RP, McNicholas PD (2016) Hypothesis testing for mixture model selection. J Stat Comput Simul 86(14):2797–2818
Article MathSciNet Google Scholar
Punzo A, Bagnato L, Maruotti A (2017) Compound unimodal distributions for insurance losses. Insur: Math Econ. https://doi.org/10.1016/j.insmatheco.2017.10.007
Punzo A, Mazza A, McNicholas PD (2018) ContaminatedMixt: An R package for fitting parsimonious mixtures of multivariate contaminated normal distributions. J Stat Softw pp 1–25
Qin LX, Self SG (2006) The clustering of regression models method with applications in gene expression data. Biometrics 62(2):526–533
Article MathSciNet MATH Google Scholar
R Core Team (2013) R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, http://www.R-project.org/
Ritter G (2015) Robust cluster analysis and variable selection. CRC Press, Baco Raton, CRC Monographs on Statistics & Applied Probability. Chapman & Hall/
MATH Google Scholar
Rousseeuw PJ, Driessen KV (1999) A fast algorithm for the minimum covariance determinant estimator. Technometrics 41(3):212–223
Article Google Scholar
Schreuder HT, Hafley WL (1977) A useful bivariate distribution for describing stand structure of tree heights and diameters. Biometrics 33(3):471–478
Article Google Scholar
Schwarz G (1978) Estimating the dimension of a model. Ann Stat 6(2):461–464
Article MathSciNet MATH Google Scholar
Seo B, Kim D (2012) Root selection in normal mixture models. Comput Stat Data Anal 56(8):2454–2470
Article MathSciNet MATH Google Scholar
Skrondal A, Rabe-Hesketh S (2004) Generalized Latent Variable Modeling: Multilevel, Longitudinal, and Structural Equation Models. Interdisciplinary Statistics. Taylor & Francis, Baco Raton
Book MATH Google Scholar
Song W, Yao W, Xing Y (2014) Robust mixture regression model fitting by Laplace distribution. Comput Stat Data Anal 71:128–137
Article MathSciNet MATH Google Scholar
Stephens M (2000) Dealing with label switching in mixture models. J Royal Stat Soc B 62(4):795–809
Article MathSciNet MATH Google Scholar
Subedi S, Punzo A, Ingrassia S, McNicholas PD (2013) Clustering and classification via cluster-weighted factor analyzers. Adv Data Anal Classif 7(1):5–40
Article MathSciNet MATH Google Scholar
Subedi S, Punzo A, Ingrassia S, McNicholas PD (2015) Cluster-weighted $t$-factor analyzers for robust model-based clustering and dimension reduction. Stat Methods Appl 24(4):623–649
Article MathSciNet MATH Google Scholar
Tukey JW (1960) A survey of sampling from contaminated distributions. In: Olkin I (ed) Contributions to probability and statistics: essays in honor of Harold Hotelling, Stanford studies in mathematics and statistics. Stanford University Press, California, pp 448–485
Google Scholar
Wedel M, Kamakura W (2001) Market segmentation: Conceptual and methodological foundations, 2nd edn. Kluwer Academic Publishers, Boston
MATH Google Scholar
Yao W (2012) Model based labeling for mixture models. Stat Comput 22(2):337–347
Article MathSciNet MATH Google Scholar
Yao W, Lindsay BG (2009) Bayesian mixture labeling by highest posterior density. J Am Stat Assoc 104(486):758–767
Article MathSciNet MATH Google Scholar
Yao W, Wei Y, Yu C (2014) Robust mixture regression using the $t$-distribution. Comput Stat Data Anal 71:116–127
Article MathSciNet MATH Google Scholar

Download references

Acknowledgements

The authors acknowledge the financial support from the grant “Finite mixture and latent variable models for causal inference and analysis of socio-economic data” (FIRB 2012-Futuro in Ricerca) funded by the Italian Government (RBFR12SHVV).

Author information

Authors and Affiliations

Department of Economics and Business, University of Catania, Corso Italia 55, 95129, Catania, Italy
Angelo Mazza & Antonio Punzo

Authors

Angelo Mazza
View author publications
You can also search for this author in PubMed Google Scholar
Antonio Punzo
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Antonio Punzo.

Appendix A: Updates in the first CM-step

The estimates of $\pi _j, {\varvec{B}}_j, \varvec{\varSigma }_j$, and $\alpha _j, j=1,\ldots ,k$, at the $\left( r+1\right) $th first CM-step of the ECM algorithm, require the maximization of

$$\begin{aligned} Q\left( \varvec{\vartheta }_1|\varvec{\vartheta }^{\left( r\right) }\right) =Q_1\left( \varvec{\pi }|\varvec{\vartheta }^{\left( r\right) }\right) +Q_2\left( \varvec{\alpha }|\varvec{\vartheta }^{\left( r\right) }\right) +Q_3\left( {\varvec{B}},\varvec{\varSigma }|\varvec{\vartheta }^{\left( r\right) }\right) , \end{aligned}$$

(14)

where

$$\begin{aligned} Q_1\left( \varvec{\pi }|\varvec{\vartheta }^{\left( r\right) }\right)&=\sum _{i=1}^{n}\sum _{j=1}^{k}z_{ij}^{\left( r\right) }\ln \pi _j,\\ Q_2\left( \varvec{\alpha }|\varvec{\vartheta }^{\left( r\right) }\right)&=\sum _{i=1}^{n}\sum _{j=1}^{k}z_{ij}^{\left( r\right) }\left[ u_{ij}^{\left( r\right) }\ln \alpha _j+\left( 1-u_{ij}^{\left( r\right) }\right) \ln \left( 1-\alpha _j\right) \right] ,\\ Q_3\left( {\varvec{B}},\varvec{\varSigma }|\varvec{\vartheta }^{\left( r\right) }\right)&=-\frac{1}{2}\sum _{i=1}^n\sum _{j=1}^k\Biggl \{z_{ij}^{\left( r\right) }\ln \left| \varvec{\varSigma }_j\right| \\&\quad +z_{ij}^{\left( r\right) }\left( u_{ij}^{\left( r\right) }+\frac{1-u_{ij}^{\left( r\right) }}{\eta _j^{\left( r\right) }}\right) \delta \left( {\varvec{x}}_i,\varvec{\mu }\left( {\varvec{x}}_i;{\varvec{B}}_j\right) ;\varvec{\varSigma }_j\right) \Biggr \}. \end{aligned}$$

Terms which are independent by the parameters of interest have been removed from $Q_3$. As the three terms on the right-hand side of (14) have zero cross-derivatives, they can be maximized separately.

1.1 A.1 Update of $\varvec{\pi }$

The maximum of $Q_1\left( \varvec{\pi }|\varvec{\vartheta }^{\left( r\right) }\right) $ with respect to $\varvec{\pi }$, subject to the constraints on those parameters, is obtained by maximizing the augmented function

$$\begin{aligned} \sum _{i=1}^n\sum _{j=1}^kz_{ij}^{\left( r\right) }\ln \pi _j-\lambda \left( \sum _{j=1}^k\pi _j-1\right) , \end{aligned}$$

(15)

where $\lambda $ is a Lagrangian multiplier. Setting the derivative of equation (15) with respect to $\pi _j$ equal to zero and solving for $\pi _j$ yields

$$\begin{aligned} \pi _j^{\left( r+1\right) }=\displaystyle \displaystyle \sum _{i=1}^nz_{ij}^{\left( r\right) }\Big /n. \end{aligned}$$

1.2 A.2 Update of $\varvec{\alpha }_{{\varvec{Y}}}$

The updates for $\varvec{\alpha }$ can be obtained through the first partial derivatives

$$\begin{aligned} \frac{\partial Q_2\left( \varvec{\alpha }|\varvec{\vartheta }^{\left( r\right) }\right) }{\partial \alpha _j} = \frac{\displaystyle \sum _{i=1}^n z_{ij}^{\left( r\right) }u_{ij}^{\left( r\right) }-\alpha _j\sum _{i=1}^n z_{ij}^{\left( r\right) }}{\alpha _j\left( 1-\alpha _j\right) }, \qquad j=1,\ldots ,k. \end{aligned}$$

(16)

Equating (16) to zero yields

$$\begin{aligned} \alpha _j^{\left( r+1\right) }=\displaystyle \sum _{i=1}^n z_{ij}^{\left( r\right) }u_{ij}^{\left( r\right) }\Bigg /\displaystyle \sum _{i=1}^n z_{ij}^{\left( r\right) }, \qquad j=1,\ldots ,k. \end{aligned}$$

1.3 A.3 Update of ${\varvec{B}}$ and $\varvec{\varSigma }_{{\varvec{Y}}}$

Using properties of trace and transpose, the updates for ${\varvec{B}}$ can be obtained through the first partial derivatives

$$\begin{aligned}&\frac{\partial Q_3\left( {\varvec{B}},\varvec{\varSigma }_{{\varvec{Y}}}|\varvec{\vartheta }^{\left( r\right) }\right) }{ \partial {\varvec{B}}_j'} \nonumber \\&\quad = \frac{\partial \left\{ -\displaystyle \frac{1}{2}\displaystyle \sum _{i=1}^nz_{ij}^{\left( r\right) }\left( u_{ij}^{\left( r\right) }+\frac{1-u_{ij}^{\left( r\right) }}{\eta _j^{\left( r\right) }}\right) \left[ {\varvec{y}}_i-\varvec{\mu }\left( {\varvec{x}}_i;{\varvec{B}}_j\right) \right] '\varvec{\varSigma }_j^{-1}\left[ {\varvec{y}}_i-\varvec{\mu }\left( {\varvec{x}}_i;{\varvec{B}}_j\right) \right] \right\} }{\partial {\varvec{B}}_j'}\nonumber \\&\quad = \frac{\partial \left[ -\displaystyle \frac{1}{2}\displaystyle \sum _{i=1}^nz_{ij}^{\left( r\right) }\left( u_{ij}^{\left( r\right) }+\frac{1-u_{ij}^{\left( r\right) }}{\eta _j^{\left( r\right) }}\right) \left( -{\varvec{y}}_i'\varvec{\varSigma }_j^{-1}{\varvec{B}}_j'{\varvec{x}}_i^*-{\varvec{x}}_i^{*'}{\varvec{B}}_j\varvec{\varSigma }_j^{-1}{\varvec{y}}_i+{\varvec{x}}_i^{*'}{\varvec{B}}_j\varvec{\varSigma }_j^{-1}{\varvec{B}}_j'{\varvec{x}}_i^*\right) \right] }{\partial {\varvec{B}}_j'} \nonumber \\&\quad = \frac{\partial \left\{ \displaystyle \frac{1}{2}\displaystyle \sum _{i=1}^nz_{ij}^{\left( r\right) }\left( u_{ij}^{\left( r\right) }+\frac{1-u_{ij}^{\left( r\right) }}{\eta _j^{\left( r\right) }}\right) \left[ \text{ tr }\left( {\varvec{B}}_j'{\varvec{x}}_i^*{\varvec{y}}_i'\varvec{\varSigma }_j^{-1}\right) +\text{ tr }\left( \left( \varvec{\varSigma }_j^{-1}{\varvec{y}}_i{\varvec{x}}_i^{*'}\right) '{\varvec{B}}_j'\right) -\text{ tr }\left( {\varvec{B}}_j'{\varvec{x}}_i^*{\varvec{x}}_i^{*'}{\varvec{B}}_j\varvec{\varSigma }_j^{-1}\right) \right] \right\} }{\partial {\varvec{B}}_j'}\nonumber \\&\quad =\displaystyle \frac{1}{2}\displaystyle \sum _{i=1}^nz_{ij}^{\left( r\right) }\left( u_{ij}^{\left( r\right) }+\frac{1-u_{ij}^{\left( r\right) }}{\eta _j^{\left( r\right) }}\right) \left( 2\varvec{\varSigma }_j^{-1}{\varvec{y}}_i{\varvec{x}}_i^{*'}-2\varvec{\varSigma }_j^{-1}{\varvec{B}}_j'{\varvec{x}}_i^*{\varvec{x}}_i^{*'}\right) , \qquad j=1,\ldots ,k. \end{aligned}$$

(17)

Equating (17) to the null matrix yields

$$\begin{aligned} {\varvec{B}}_j^{\left( r+1\right) }= & {} \left[ \sum _{i=1}^n z_{ij}^{\left( r\right) }\left( u_{ij}^{\left( r\right) }+\frac{1-u_{ij}^{\left( r\right) }}{\eta _j^{\left( r\right) }}\right) {\varvec{x}}_i^*{\varvec{x}}_i^{*'}\right] ^{-1}\\&\left[ \sum _{i=1}^n z_{ij}^{\left( r\right) }\left( u_{ij}^{\left( r\right) }+\frac{1-u_{ij}^{\left( r\right) }}{\eta _j^{\left( r\right) }}\right) {\varvec{x}}_i^*{\varvec{y}}_i\right] , \qquad j=1,\ldots ,k. \end{aligned}$$

Finally, the updates for $\varvec{\varSigma }$ can be obtained through the first partial derivatives

$$\begin{aligned} \frac{\partial Q_3\left( {\varvec{B}},\varvec{\varSigma }|\varvec{\vartheta }^{\left( r\right) }\right) }{\partial \varvec{\varSigma }_j^{-1}}= & {} \displaystyle \frac{1}{2}\displaystyle \sum _{i=1}^n z_{ij}^{\left( r\right) } \left\{ \varvec{\varSigma }_j+ \left( u_{ij}^{\left( r\right) }+\displaystyle \frac{1-u_{ij}^{\left( r\right) }}{ \eta _j^{\left( r\right) }}\right) \right. \nonumber \\&\left. \left[ {\varvec{y}}_i-\varvec{\mu }\left( {\varvec{x}}_i;{\varvec{B}}_j^{\left( r+1\right) }\right) \right] \right. \nonumber \\&\left. \left[ {\varvec{y}}_i-\varvec{\mu }\left( {\varvec{x}}_i;{\varvec{B}}_j^{\left( r+1\right) }\right) \right] '\right\} , \qquad j=1,\ldots ,k. \end{aligned}$$

(18)

Equating (18) to the null matrix yields

$$\begin{aligned} \varvec{\varSigma }_j^{\left( r+1\right) }= & {} \frac{1}{n_j^{\left( r\right) }}\sum _{i=1}^nz_{ij}^{\left( r\right) } \left( u_{ij}^{\left( r\right) }+\frac{1-u_{ij}^{\left( r\right) }}{\eta _j^{\left( r \right) }}\right) \\&\left[ {\varvec{y}}_i-\displaystyle \varvec{\mu }\left( {\varvec{x}}_i;{\varvec{B}}_j^{\left( r+1\right) }\right) \right] \left[ {\varvec{y}}_i-\displaystyle \varvec{\mu }\left( {\varvec{x}}_i;{\varvec{B}}_j^{\left( r+1\right) }\right) \right] ', \qquad j=1,\ldots ,k. \end{aligned}$$

Rights and permissions

Reprints and permissions

About this article

Cite this article

Mazza, A., Punzo, A. Mixtures of multivariate contaminated normal regression models. Stat Papers 61, 787–822 (2020). https://doi.org/10.1007/s00362-017-0964-y

Download citation

Received: 02 April 2016
Revised: 16 October 2017
Published: 13 November 2017
Issue Date: April 2020
DOI: https://doi.org/10.1007/s00362-017-0964-y

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Mixtures of multivariate contaminated normal regression models

Abstract

Access this article

Similar content being viewed by others

Seemingly unrelated clusterwise linear regression

Parsimonious Mixtures of Seemingly Unrelated Contaminated Normal Regression Models

Robust mixture regression modeling based on scale mixtures of skew-normal distributions

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Appendix A: Updates in the first CM-step

1.1 A.1 Update of \(\varvec{\pi }\)

1.2 A.2 Update of \(\varvec{\alpha }_{{\varvec{Y}}}\)

1.3 A.3 Update of \({\varvec{B}}\) and \(\varvec{\varSigma }_{{\varvec{Y}}}\)

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Mixtures of multivariate contaminated normal regression models

Abstract

Access this article

Similar content being viewed by others

Seemingly unrelated clusterwise linear regression

Parsimonious Mixtures of Seemingly Unrelated Contaminated Normal Regression Models

Robust mixture regression modeling based on scale mixtures of skew-normal distributions

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Appendix A: Updates in the first CM-step

Appendix A: Updates in the first CM-step

1.1 A.1 Update of \(\varvec{\pi }\)

1.2 A.2 Update of \(\varvec{\alpha }_{{\varvec{Y}}}\)

1.3 A.3 Update of \({\varvec{B}}\) and \(\varvec{\varSigma }_{{\varvec{Y}}}\)

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation