A fuzzy approach to robust regression clustering

  • Regular Article
Advances in Data Analysis and Classification

Abstract

A new robust fuzzy regression clustering method is proposed. We estimate the coefficients of a linear regression model in each unknown cluster. The method aims to achieve robustness by trimming a fixed proportion of observations. Assignments to clusters are fuzzy: an observation can contribute to the estimates in more than one cluster. We describe general criteria for tuning the method. The proposed method seems to be robust with respect to different types of contamination.
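The ingredients named in the abstract (one linear model per cluster, fuzzy memberships, trimming of a fixed proportion) can be illustrated with a rough sketch. This is not the authors' algorithm: it assumes a fuzzy c-means style membership update on squared residuals, a fuzzifier `m`, and trimming by largest fuzzy loss, and the function name `trimmed_fuzzy_regression` and all parameter names are hypothetical.

```python
import numpy as np

def trimmed_fuzzy_regression(X, y, k=2, alpha=0.1, m=2.0, n_iter=50, seed=0):
    """Illustrative trimmed fuzzy clusterwise regression (assumed scheme, not the paper's)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    A = np.column_stack([np.ones(n), X])       # design matrix with intercept
    U = rng.dirichlet(np.ones(k), size=n)      # random initial memberships, rows sum to 1
    n_keep = n - int(np.floor(alpha * n))      # number of observations left untrimmed
    for _ in range(n_iter):
        # Weighted least squares in each cluster, with weights = membership ** m
        betas = []
        for j in range(k):
            sw = np.sqrt(U[:, j] ** m)
            beta, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
            betas.append(beta)
        B = np.array(betas)                    # shape (k, number of coefficients)
        # Squared residuals of every observation under every cluster's model
        R2 = (y[:, None] - A @ B.T) ** 2 + 1e-12
        # Fuzzy c-means style update: smaller residual gives larger membership
        inv = R2 ** (-1.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
        # Trimming: observations with the largest fuzzy loss get zero membership
        loss = (U ** m * R2).sum(axis=1)
        mask = np.zeros(n, dtype=bool)
        mask[np.argsort(loss)[:n_keep]] = True
        U[~mask] = 0.0
    return B, U, mask
```

In this sketch, trimmed observations carry zero weight in every cluster, so they cannot influence any regression fit, while the remaining observations contribute to all `k` fits in proportion to their memberships. The random initialization means different seeds may lead to different local solutions.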

Acknowledgments

The authors are grateful to three referees and the Associate Editor for several constructive suggestions. This research was partially supported by the Spanish Ministerio de Economía y Competitividad, Grant MTM2014-56235-C2-1-P, and by Consejería de Educación de la Junta de Castilla y León, Grant VA212U13.

Author information

Corresponding author

Correspondence to Francesco Dotto.

Electronic supplementary material

Supplementary material 1 (pdf 553 KB)

Cite this article

Dotto, F., Farcomeni, A., García-Escudero, L.A. et al. A fuzzy approach to robust regression clustering. Adv Data Anal Classif 11, 691–710 (2017). https://doi.org/10.1007/s11634-016-0271-9

  • DOI: https://doi.org/10.1007/s11634-016-0271-9
