Abstract
In recent decades, functional data analysis has attracted the attention of many researchers in mathematics and statistics. For this reason, both the theory and applications have proliferated in the literature. Much of classical statistics has been rewritten in functional terms to handle data that are or can be represented by functions through appropriate smoothing operations. Within the framework of supervised and unsupervised classification, numerous techniques have been proposed to identify homogeneous groups of functional data based on different possible metrics and semi-metrics depending on the specific context. A limitation of these techniques is that they always lead to crisp-type groupings. Recently, in fuzzy set theory, many classification methods have been proposed to obtain non-crisp groupings so that the researcher is not forced to assign a statistical unit to a single group in a unique way. Following this approach, it is possible to carry out a classification that contemplates the possibility that a statistical unit belongs to different groups at the same time with different degrees of membership. The objective of this article is to propose a fuzzy functional unsupervised classification algorithm that takes into account both the functional and the fuzzy approach in order to identify similar patterns of functional data. After presenting the method, a possible application is proposed using the health composite indicator concerning the Italian regions in the period 2010–2015. The final aim of this work is to provide professionals with a tool capable of monitoring the risks of health imbalances at the national level, identifying similar behaviours at the local level but embracing the uncertainty that the fuzzy functional classification preserves in results.
Acknowledgements
Dr Maturo and Dr Ferguson are supported by a grant from the Health Research Board, Ireland: EIA-2017-017.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants performed by any of the authors.
Additional information
Communicated by M. Squillante.
About this article
Cite this article
Maturo, F., Ferguson, J., Di Battista, T. et al. A fuzzy functional k-means approach for monitoring Italian regions according to health evolution over time. Soft Comput 24, 13741–13755 (2020). https://doi.org/10.1007/s00500-019-04505-2
DOI: https://doi.org/10.1007/s00500-019-04505-2