
A robust factor analysis model based on the canonical fundamental skew-t distribution

  • Regular Article
  • Statistical Papers

Abstract

Traditional factor analysis, which rests on the assumption of multivariate normality, has been extended by assuming that the unobserved factors and errors jointly follow the restricted multivariate skew-t (rMST) distribution. However, the rMST distribution can only characterise skewness that is concentrated in a single direction. This paper introduces a more flexible robust factor analysis model based on the broader canonical fundamental skew-t (CFUST) distribution, called the CFUSTFA model. The proposed model can account for more complex forms of skewness acting in multiple directions. An efficient alternating expectation conditional maximization (AECM) algorithm, constructed under several reduced complete-data spaces, is developed to compute maximum likelihood (ML) estimates of the parameters. To assess the variability of the parameter estimates, we present an information-based approach to approximating the asymptotic covariance matrix of the ML estimators. The effectiveness and applicability of the proposed techniques are demonstrated through the analysis of simulated and real datasets.
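
To make the idea of skewness in multiple directions concrete, the following is a minimal simulation sketch and not the authors' implementation. It assumes the usual stochastic representation of a CFUST variate, a skewness matrix times a half-normal vector plus a normal vector, both divided by the square root of a shared gamma variable. The function name `simulate_skew_t_factor_data`, the argument names, and the choice of identity scale for the factors are all assumptions made for illustration; the paper's exact parameterisation (for example, any centring of the factors) may differ.

```python
import numpy as np

def simulate_skew_t_factor_data(n, mu, B, Delta, D, nu, seed=0):
    """Simulate y = mu + B f + e with skewed, heavy-tailed factors (illustrative).

    mu    : (p,) location vector
    B     : (p, q) factor loading matrix
    Delta : (q, r) skewness matrix; r > 1 allows several skewing directions,
            r = 1 gives a single direction (the restricted case)
    D     : (p,) positive error variances
    nu    : degrees of freedom
    """
    rng = np.random.default_rng(seed)
    p, q = B.shape
    r = Delta.shape[1]

    # Shared mixing variable W ~ Gamma(nu/2, rate = nu/2) gives t-type tails.
    w = rng.gamma(shape=nu / 2.0, scale=2.0 / nu, size=n)
    scale = 1.0 / np.sqrt(w)[:, None]

    half_normal = np.abs(rng.standard_normal((n, r)))  # |U0|, drives skewness
    normal_part = rng.standard_normal((n, q))          # symmetric part of f
    errors = rng.standard_normal((n, p)) * np.sqrt(D)  # diagonal error scale

    # Assumed representation: f = (Delta |U0| + U1) / sqrt(W).
    f = scale * (half_normal @ Delta.T + normal_part)
    # Errors share W with the factors, so (f, e) are jointly heavy-tailed.
    y = mu + f @ B.T + scale * errors
    return y, f

if __name__ == "__main__":
    B = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [0.3, 0.7]])
    Delta = np.array([[2.0, 0.0], [0.0, 1.5]])  # two skewing directions
    y, f = simulate_skew_t_factor_data(
        n=1000, mu=np.zeros(4), B=B, Delta=Delta, D=np.full(4, 0.5), nu=10.0
    )
    print(y.shape, f.mean(axis=0))
```

Setting `Delta` to a single column reduces the sketch to one skewing direction, which is the limitation of the rMST-based model that the abstract describes.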

Acknowledgements

The authors are grateful to the Co-Editors, the Associate Editor and two anonymous referees for their valuable comments and constructive suggestions, which greatly improved the content of this paper. This project was partially supported by the Ministry of Science and Technology of Taiwan under Grant Nos. 109-2118-M-005-005-MY3 and 110-2118-M-006-006-MY3.

Author information

Corresponding author

Correspondence to Wan-Lun Wang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (pdf 89 KB)

About this article

Cite this article

Lin, TI., Chen, IA. & Wang, WL. A robust factor analysis model based on the canonical fundamental skew-t distribution. Stat Papers 64, 367–393 (2023). https://doi.org/10.1007/s00362-022-01318-8

  • DOI: https://doi.org/10.1007/s00362-022-01318-8
