Fuzzy data analysis and classification

D’Urso, Pierpaolo; Gil, María Ángeles

doi:10.1007/s11634-017-0304-z

Special issue in memoriam of Professor Lotfi A. Zadeh, father of fuzzy logic

Editorial
Published: 16 November 2017

Volume 11, pages 645–657, (2017)

Advances in Data Analysis and Classification

In analyzing and classifying data from a statistical perspective, fuzzy sets and logic have become a valuable tool either to model and handle imprecise data or to establish flexible techniques to deal with precise data.

From the very beginning of his 52-year-old theory, Professor Zadeh highlighted that “Probability theory/statistics and fuzzy logic should be viewed as complementary rather than competitive,” and he anticipated and encouraged the materialization of this complementarity. Nowadays, this assertion is a reality, as shown by the many related papers, specialized conferences, special sessions and tracks in general conferences, and so on.

This special issue started in 2015, with the 50th anniversary of the seminal paper on fuzzy sets by Zadeh (1965), aiming to collect a sample of research papers on current trends in the combination of fuzzy sets/logic and data analysis/classification.

When this special issue was almost ready for publication, Zadeh unfortunately passed away at age 96 (February 4, 1921–September 6, 2017). We wish to dedicate this special issue to Professor Zadeh, as a modest part of the many tributes that he will receive, and with the intention of showing that fuzzy sets/logic and data analysis/classification can certainly work in synergy.

1 Introduction

This issue is a contribution from a well-known, rather young journal in a scientific field that has sometimes called fuzzy theory into question, especially when the theory was first introduced. However, Zadeh was constantly trying to build bridges between statistics/probability and fuzzy logic. We had thought that this was a cordial and gentle way to combat criticisms, in accordance with his well-known friendly attitude, by demonstrating the potential benefits of collaboration between both fields. But we were ‘probably’ wrong. Zadeh’s fondness and affection towards Probability and Statistics were neither incidental nor temporary, but went back a long way.

To illustrate the last assertion, in his 2015 paper (Zadeh 2015) Professor Zadeh looks back over his research career and says: “... The early years in my academic career coincided with the birth of the age of computers and information. It was an exciting period, spurred by competition between the United States and the Soviet Union. At Columbia, my research was focused on system theory and information systems. Probability theory had a position of centrality in my work. My first paper, published in 1949 in the Journal of Applied Physics, was entitled, Probability criterion for the design of servomechanisms. My second paper, published also in the Journal of Applied Physics in 1950, was entitled, An extension of Wiener’s theory of prediction. I had a close relationship with the Department of Mathematical Statistics and its Chair, Herbert Robbins, a brilliant mathematician, who became my best friend...” Indeed, the titles of many of Zadeh’s first publications involve unequivocally statistical terms such as stochastic operators, correlation functions, prediction, time-series, and so on. The same applies to several communications presented at meetings of the American Mathematical Society.

Nevertheless, as often happens with new theories and approaches, there were more than two decades in which probabilistic and statistical journals scarcely published papers on fuzzy material, apart from some occasional debates. In this respect, as outlined in Belohlavek et al. (2017) and Ross et al. (2002), one should at least mention the following:

First, the debate during the Conference on the Calculus of Uncertainty in Artificial Intelligence and Expert Systems held in 1984. At this conference, Lotfi Zadeh, as supporter of fuzzy logic, Dennis Lindley, as supporter of subjective probability, Glenn Shafer, as supporter of evidence theory, and David Spiegelhalter, as supporter of applications of expert systems, acted as invited speakers. This debate was mostly gathered in Issue 1 of Volume 2 of the journal Statistical Science (1987). Zadeh could not send his written presentation in due time, but he often referred in different papers to what was discussed there.
Secondly, the invited paper with discussions by Laviolette et al. (1995) in the journal Technometrics, where Laviolette et al. reviewed some basic ideas in fuzzy theory and offered what they considered to be simpler alternatives based on traditional probability and statistical theory; Zadeh’s corresponding discussion (Zadeh 1995) argued that probability theory by itself is not sufficient for dealing with uncertainty and imprecision in real-world settings, and that allowing both theories to coexist is much more effective and reasonable.
Finally, another interesting debate is the one around the invited paper with discussions by Singpurwalla and Booker (2004) in the Journal of the American Statistical Association, in which the authors argued that probability theory has a sufficiently rich structure for incorporating fuzzy sets within its framework, and that probability and fuzzy set theories can work in concert. Zadeh’s discussion of this last paper (Zadeh 2004) suggested restructuring probability theory by shifting its foundations from bivalent to fuzzy logic.

As we have already commented, the controversy has nowadays substantially diminished, and ideas of reconciliation and coexistence of fuzzy and probability/statistics theories, together with the development of hybrid models and methods, have been thriving. Kruse et al. (2013) have pointed out that “... We must differentiate between ‘fuzzy’ data analysis and ‘fuzzy data’ analysis. The former deals with the analysis of classical data using methods based on fuzzy set theory. These methods, e.g., fuzzy clustering or fuzzy regression analysis, have been used successfully in lots of industrial applications. The second approach tries to analyze fuzzy data by using statistical methods...” In other words, Fuzzy Data Analysis and Classification studies are mainly focussed

either on developing concepts, results and methods to deal with classical (non-fuzzy) data, where fuzziness is involved in the construction of the analysis/classification procedures,
or on developing/extending concepts, results and methods concerning data analysis and classification of fuzzy-valued data,
or on both.

2 On the fuzzy analysis and the fuzzy classification of non-fuzzy/standard data

The development of fuzzy approaches to classify ‘crisp’ data started soon after the formalization of fuzzy sets (Zadeh 1965). In fact, Zadeh along with Bellman and Kalaba were the first to suggest fuzzy sets as a theoretical basis for developing clustering algorithms (Bellman et al. 1966). Some of the most influential pioneering works on the subject are, among others, those by Ruspini (1969, 1970), Tamura et al. (1971), Dunn (1973, 1974), Bezdek (1973, 1974, 1980), and Bezdek et al. (1984), which have inspired both applications and many further methodologies. At present, this is one of the most successful topics involving Fuzzy Sets and Statistical theories. The number of research papers on it is unquestionably growing [among the most recent ones see, for instance, the approaches in Liu et al. (2013), Gong et al. (2014), Yamashita and Mayekawa (2015), Ruan et al. (2016), and Nguyen-Trang and Vo-Van (2017)], and it often appears either combined with or supporting other data analysis problems.
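To make the idea of fuzzy clustering of crisp data concrete, the following is a minimal sketch of the alternating center/membership updates usually associated with fuzzy c-means. It is an illustration under stated assumptions, not the formulation of any particular paper cited above: the function name `fuzzy_c_means`, the fuzzifier value `m = 2`, the random initialization, and the toy data are all choices made here for the example.

```python
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Illustrative fuzzy c-means for points given as tuples of floats.

    Returns (centers, u), where u[i][k] is the membership degree of
    point i in cluster k and every row of u sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Assumption: random initial memberships, normalized row by row.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by u**m.
        centers = []
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            centers.append(tuple(
                sum(w[i] * points[i][d] for i in range(n)) / total
                for d in range(dim)))

        # Membership update from squared Euclidean distances.
        shift = 0.0
        for i in range(n):
            d2 = [sum((points[i][d] - centers[k][d]) ** 2 for d in range(dim))
                  for k in range(c)]
            if min(d2) == 0.0:
                # The point coincides with one or more centers.
                zeros = [k for k in range(c) if d2[k] == 0.0]
                new = [1.0 / len(zeros) if k in zeros else 0.0
                       for k in range(c)]
            else:
                inv = [dist ** (-1.0 / (m - 1.0)) for dist in d2]
                total = sum(inv)
                new = [v / total for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new, u[i])))
            u[i] = new

        if shift < tol:  # memberships have stabilized
            break
    return centers, u


# Toy data: two well-separated groups on the real line.
points = [(0.0,), (0.1,), (0.2,), (5.0,), (5.1,), (5.2,)]
centers, u = fuzzy_c_means(points, c=2)
for point, row in zip(points, u):
    print(point, [round(v, 3) for v in row])
```

Unlike a crisp partition, each point receives a degree of membership in every cluster, so borderline observations are not forced into a single class.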

In more detail, useful references to the extensive literature on fuzzy clustering (from both theoretical and applied points of view) can be found in the chapter on fuzzy clustering by D’Urso (2016), the seminal monograph by Bezdek (1981), the books by Jain and Dubes (1988), De Oliveira and Pedrycz (2007), and Miyamoto et al. (2008) and, e.g., the following journals: Fuzzy Sets and Systems, IEEE Transactions on Fuzzy Systems, Information Sciences, Pattern Recognition, Applied Soft Computing, Soft Computing, Advances in Data Analysis and Classification, Computational Statistics and Data Analysis, Chemometrics and Intelligent Laboratory Systems, Pattern Recognition Letters, etc.

As remarked by D’Urso (2017a), there are different uncertainty-based clustering methods that can be considered extensions of, variants of and alternatives to fuzzy clustering for non-fuzzy/standard data, like

possibilistic clustering [see, for instance, Krishnapuram and Keller (1993)],
shadowed clustering [see, for instance, Pedrycz (1998)],
rough sets-based clustering [see, for instance, Lingras and West (2004)],
intuitionistic fuzzy clustering [see, for instance, Hung et al. (2004)],
evidential clustering, credal clustering or belief clustering [see, for instance, Denoeux and Masson (2004a)],
credibilistic clustering [see, for instance, Zhou et al. (2007)],
type-2 fuzzy clustering [see, for instance, Hwang and Rhee (2007)],
neutrosophic clustering [see, for instance, Shan et al. (2012)],
hesitant fuzzy clustering [see, for instance, Chen et al. (2013)],
interval-based fuzzy clustering [see, for instance, Silva et al. (2015)],
picture fuzzy clustering [see, for instance, Son (2015)].

Fuzzy approaches to analyzing crisp/standard data have not been developed as exhaustively as fuzzy clustering approaches for the same data, and they appeared several years after fuzzy sets were introduced. Among them, one can highlight

the fuzzy linear regression ideas between non-fuzzy input and output data, by considering the problem as a linear programming one [see, for instance, the first formulation by Tanaka et al. (1982) and Tanaka and Watada (1988)],
fuzzy testing of hypotheses, testing of fuzzy hypotheses, and fuzzy estimation regarding non-fuzzy parameters on the basis of non-fuzzy data [see, for instance, Watanabe and Imaizumi (1993), Arnold (1998), Buckley (2004), Hryniewicz (2006), Parchami et al. (2009)],
fuzzy statistical quality control [see, for instance, Grzegorzewski and Hryniewicz (2000)],
statistical decision problems with fuzzy utilities/losses [see, for instance, Gil and Jain (1992), Gil and López-Díaz (1996)].

3 On the analysis and classification of fuzzy data

On the other hand, approaches to classify fuzzy-valued data are becoming a challenging topic. Among the first published approaches one should mention those by Esogbue (1986), Hathaway et al. (1996), and Pedrycz et al. (1998) and, among the recent ones, see those by Coppi et al. (2012), D’Urso and De Giovanni (2014), D’Urso et al. (2015a), Ansari et al. (2017) and Ferraro and Giordani (2017).

The analysis of fuzzy-valued data is also a topic that has received increasing attention over the years. Some of the methodologies developed to analyze fuzzy data take a descriptive view and do not refer to models associated with the probabilistic framework. Nevertheless, most of the methodologies are based on modeling the random mechanisms generating fuzzy data within a probabilistic setting. In this respect, we can mention, among others, the following:

The methodologies based on the notion of fuzzy information system, introduced by Okuda et al. (1978), who consider the available information from a classical random experiment associated with a real-valued random variable to be fuzzy [i.e., they consider an epistemic viewpoint in accordance with the distinction made by Couso and Dubois (2014)] and assume that this available information constitutes a fuzzy partition (in Ruspini’s sense; Ruspini 1970) of the sample space of the variable, with probabilities based on Zadeh’s probabilistic definition of fuzzy events (Zadeh 1968). Some data analysis developments using this model can be seen, for instance, in Gil et al. (1988), Gil (1992) and, more recently, Denoeux (2011).
The methodologies based on the notion of fuzzy random variable, introduced by Kwakernaak (1978, 1979) and later formalized by Kruse and Meyer (1987). As Kruse et al. (2013) pointed out, this deep and wide data analysis with vague data was a fruit of the encouragement by Professors Lotfi Zadeh and Heinz Skala [editor of the Series Theory and Decision Library of the D. Reidel Publishing Co., see, e.g., Skala (1975)], which was mostly prompted by the development by Kruse and Meyer of useful fuzzy methods and a software tool for statistical applications for Siemens AG. The model refers to the epistemic perspective, and a fuzzy random variable is viewed as the fuzzy perception of an original non-fuzzy random variable. Statistical developments with fuzzy data coming from the fuzzy perception of real-valued ones are mainly based on propagating the associated imprecision to the distribution function, parameters, etc., through Zadeh’s extension principle (Zadeh 1975a, b, c). It should be remarked that, albeit based on fuzzy information, statistical conclusions with Kruse and Meyer’s fuzzy random variables always concern the original random variable and its parameters. Among the studies based on Kruse and Meyer’s fuzzy random variables one can refer, for instance, to Kruse (1984, 1987), Grzegorzewski (1998), Wang (2004), and Wu (2005).
The methodologies based on the notion of random fuzzy sets, introduced by Féron (1979), to some extent anticipated by Fréchet (1948), and later formalized by Puri and Ralescu (1985, 1986). The model, which was initially coined as fuzzy random variables, refers to a kind of ontic perspective, since a random fuzzy set (or random fuzzy number if values are fuzzy numbers) is viewed as a mapping associating experimental outcomes with fuzzy values in a Borel-measurable way, so that the induced distribution associated with the random fuzzy set is immediate, the stochastic independence between random fuzzy sets is also trivially induced, and so on. It should be remarked that, in contrast to Kruse and Meyer’s approach, statistical conclusions with Puri and Ralescu’s random fuzzy sets always concern the fuzzy-valued random element and the parameters associated with its induced distribution. An interesting distinctive feature of the statistical methodology based on this approach to generating fuzzy data is that most of the classical ideas in data analysis can be immediately preserved without needing to either define or adapt them expressly. Among the statistical developments involving this approach one can refer, for instance, to Bandemer and Näther (1992) and Näther (1997, 2006) and, more recently, Blanco-Fernández et al. (2014a, b).
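The first of the three models above can be illustrated with a small numerical sketch. It assumes a hypothetical experiment (a fair die whose outcome is only reported as "low", "medium" or "high"); the membership values and the function names `is_ruspini_partition` and `fuzzy_event_probability` are invented for this example. It checks that the fuzzy observations form a fuzzy partition in Ruspini's sense (memberships sum to 1 at every outcome) and computes the probability of each fuzzy event as the expected value of its membership function, following Zadeh's definition.

```python
from fractions import Fraction as F

# Hypothetical underlying experiment: a fair six-sided die.
p = {x: F(1, 6) for x in range(1, 7)}

# Hypothetical fuzzy observations; the membership degrees are illustrative.
membership = {
    "low":    {1: F(1), 2: F(3, 4), 3: F(1, 4), 4: F(0), 5: F(0), 6: F(0)},
    "medium": {1: F(0), 2: F(1, 4), 3: F(3, 4), 4: F(3, 4), 5: F(1, 4), 6: F(0)},
    "high":   {1: F(0), 2: F(0), 3: F(0), 4: F(1, 4), 5: F(3, 4), 6: F(1)},
}


def is_ruspini_partition(membership, outcomes):
    """True if the membership degrees sum to 1 at every outcome."""
    return all(sum(mu[x] for mu in membership.values()) == 1
               for x in outcomes)


def fuzzy_event_probability(mu, p):
    """Probability of a fuzzy event: expected value of its membership."""
    return sum(mu[x] * p[x] for x in p)


probs = {name: fuzzy_event_probability(mu, p)
         for name, mu in membership.items()}

print(is_ruspini_partition(membership, p))
for name, value in probs.items():
    print(name, value)
```

Because the memberships sum to 1 at every outcome, the probabilities of the fuzzy observations also sum to 1, so they define a proper distribution over the fuzzy information system.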

Other developments and approaches can be found in the literature (for instance, those by Viertl (2006), Grzegorzewski and Szymanowski (2014), etc.). Among the scientific journals publishing papers on the topic, one can mention those indicated in Sect. 2 for the fuzzy clustering and analysis of standard data.

4 Additional related literature

It is pertinent to state at this point that some other procedures have been suggested in the literature to classify non-standard data in a fuzzy manner, and some of them have been gathered in Table 1 [see also D’Urso (2016)].

Table 1 Some relevant references on fuzzy clustering of non-standard data


In a similar way, Table 2 collects some of the most relevant references on the methodological statistical studies on the analysis of fuzzy data (see also D’Urso 2017b).

Table 2 Some relevant references on the analysis of fuzzy data


5 On this special issue

The papers in this special issue are to be considered as a sample of recent advances in data analysis and classification involving fuzziness, illustrating the need to take advantage of other topics like Fuzzy Logic to enrich and widen statistical methodologies. Although small samples are not usually informative enough from a statistical perspective, and this special issue is certainly a very small sample, we trust that readers can get a flavour of some of the current trends.

The first four papers in the issue concern ‘fuzzy’ data analysis or classification, that is, fuzzy methods applied to standard data:

The paper “A fuzzy approach to robust regression clustering”, by Dotto, Farcomeni, García-Escudero and Mayo-Iscar, proposes a fuzzy regression clustering method based on a maximum likelihood approach, designed so that the method is resistant to data contamination.
The paper “A novel method for forecasting time series based on fuzzy logic and visibility graph”, by Zhang, Ashuri and Deng, presents a new proposal for forecasting time series, which is based on fuzzy logic, visibility graph and link prediction.
The paper “Fuzzy rule based classification systems for big data with MapReduce: granularity analysis”, by Fernández, del Río, Bawakid and Herrera, aims to discuss the effect of the granularity level and the number of selected Maps on the performance of the Chi-Fuzzy Rule Based Classification Systems with a MapReduce approach for big data.
The paper “On ill-conceived initialization in archetypal analysis”, by Suleman, addresses the problem of initialization and the performance of fuzzy clustering by means of an archetypal analysis.

The last two papers in the issue concern ‘fuzzy data’ analysis or classification, that is, the analysis of imprecise data:

The paper “Robust scale estimators for fuzzy data”, by de la Rosa de Sáa, Lubiano, Sinova and Filzmoser, regards the introduction of some robust location-based scale measures/estimates for random fuzzy numbers, along with the analysis of their robustness.
The paper “Parametric classification with soft labels using the evidential EM algorithm. Linear discriminant analysis vs logistic regression”, by Denoeux, Quost and Li, analyzes the problem of partially supervised classification when learning instances are labeled by means of Dempster–Shafer mass functions (which include fuzzy sets as a particular case).

References

Ansari ZA, Sattar SA, Babu AV (2017) A fuzzy neural network based framework to discover user access patterns from web log data. Adv Data Anal Classif 11(3):519–546
Arnold BF (1998) Testing fuzzy hypotheses with crisp data. Fuzzy Sets Syst 94(3):323–333
Aşan Z, Greenacre M (2011) Biplots of fuzzy coded data. Fuzzy Sets Syst 183:57–71
Auephanwiriyakul S, Keller JM (2002) Analysis and efficient implementation of a linguistic fuzzy c-means. IEEE Trans Fuzzy Syst 10:563–582
Bandemer H, Näther W (1992) Fuzzy data analysis. Springer, Dordrecht
Bellman RE, Kalaba R, Zadeh LA (1966) Abstraction and pattern classification. J Math Anal Appl 13:1–7
Belohlavek R, Dauben JW, Klir GJ (2017) Fuzzy logic and mathematics. A historical perspective. Oxford University Press, New York
Berlinger J, Hüllermeier E (2007) Fuzzy clustering of parallel data streams. In: De Oliveira and Pedrycz (2007), pp 333–352
Bezdek JC (1973) Cluster validity with fuzzy sets. Cybern Syst/J Cybern 3(3):58–73
Bezdek JC (1974) Numerical taxonomy with fuzzy sets. J Math Biol 1(1):57–71
Bezdek JC (1980) Convergence theorem for the fuzzy ISODATA clustering algorithms. IEEE Trans Pattern Anal Mach Intell 2(1):1–8
Bezdek JC (1981) Pattern recognition with fuzzy objective function algorithms. Plenum Press, New York
Bezdek JC, Ehrlich R, Full W (1984) FCM—the fuzzy c-means clustering-algorithm. Comput Geosci 10(2–3):191–203
Blanco-Fernández A, Casals MR, Colubi A, Corral N, García-Bárzana M, Gil MA, González-Rodríguez G, López MT, Lubiano MA, Montenegro M, Ramos-Guajardo AB, de la Rosa de Sáa S, Sinova B (2014a) A distance-based statistical analysis of fuzzy number-valued data. Int J Approx Reason 55(7):1487–1501
Blanco-Fernández A, Casals MR, Colubi A, Corral N, García-Bárzana M, Gil MA, González-Rodríguez G, López MT, Lubiano MA, Montenegro M, Ramos-Guajardo AB, de la Rosa de Sáa S, Sinova B (2014b) Rejoinder on “A distance-based statistical analysis of fuzzy number-valued data”. Int J Approx Reason 55(7):1601–1605
Buckley JJ (2004) Fuzzy statistics. Studies in fuzziness and soft computing series 149. Springer, Berlin
Calcagnì A, Lombardi L, Pascali E (2016) A dimension reduction technique for two-mode non-convex fuzzy data. Soft Comput 20:749–762
Cappelli C, D’Urso P, Di Iorio F (2013) Change point analysis for imprecise time series. Fuzzy Sets Syst 225:23–38
Celminš A (1987) Multidimensional least-squares fitting of fuzzy models. Math Model 9:669–690
Chen N, Xu Z, Xia M (2013) Correlation coefficients of hesitant fuzzy sets and their applications to clustering analysis. Appl Math Model 37:2197–2211
Colubi A, González-Rodríguez G, Gil MA, Trutschnig W (2011) Nonparametric criteria for supervised classification of fuzzy data. Int J Approx Reason 52:1272–1282
Coppi R, D’Urso P (2002) Fuzzy k-means clustering models for triangular fuzzy time trajectories. Stat Methods Appl 11(1):21–40
Coppi R, D’Urso P (2003) Three-way fuzzy clustering models for LR fuzzy time trajectories. Comput Stat Data Anal 43:149–177
Coppi R, D’Urso P (2006) Fuzzy unsupervised classification of multivariate time trajectories with the Shannon entropy regularization. Comput Stat Data Anal 50:1452–1477
Coppi R, D’Urso P, Giordani P, Santoro A (2006) Least squares estimation of a linear regression model with LR fuzzy response. Comput Stat Data Anal 51:267–286
Coppi R, D’Urso P, Giordani P (2010) A fuzzy clustering model for multivariate spatial time series. J Classif 27:54–88
Coppi R, D’Urso P, Giordani P (2012) Fuzzy and possibilistic clustering for fuzzy data. Comput Stat Data Anal 56(4):915–927
Couso I, Dubois D (2014) Statistical reasoning with set-valued information: ontic vs. epistemic views. Int J Approx Reason 55(7):1502–1518
D’Urso P (2003) Linear regression analysis for fuzzy/crisp input and fuzzy/crisp output data. Comput Stat Data Anal 42(1–2):47–72
D’Urso P (2005) Fuzzy clustering for data time array with inlier and outlier time trajectories. IEEE Trans Fuzzy Syst 13:583–604
D’Urso P (2007) Fuzzy clustering of fuzzy data. In: De Oliveira and Pedrycz (2007), pp 155–192
D’Urso P (2016) Fuzzy clustering. In: Hennig C, Meila M, Murtagh F, Rocci R (eds) Handbook of cluster analysis. Chapman & Hall, Boca Raton, pp 545–573
D’Urso P (2017a) Informational Paradigm, management of uncertainty and theoretical formalisms in the clustering framework: a review. Inform Sci 400–401:30–62
D’Urso P (2017b) Exploratory multivariate analysis for empirical information affected by uncertainty and modeled in a fuzzy manner: a review. Granul Comput 2:225–247
D’Urso P, De Giovanni L (2014) Robust clustering of imprecise data. Chemom Intel Lab Syst 136:58–80
D’Urso P, Gastaldi T (2000) A least-squares approach to fuzzy linear regression analysis. Comput Stat Data Anal 34:427–440
D’Urso P, Giordani P (2005) A possibilistic approach to latent component analysis for symmetric fuzzy data. Fuzzy Sets Syst 150:285–305
D’Urso P, Giordani P (2006) A robust fuzzy k-means clustering model for interval valued data. Comput Stat 21:251–269
D’Urso P, Leski J (2016) Fuzzy C-ordered medoids clustering of interval-valued data. Pattern Recogn 58:49–67
D’Urso P, Massari R (2013) Fuzzy clustering of human activity patterns. Fuzzy Sets Syst 215:29–54
D’Urso P, Santoro A (2006) Fuzzy clusterwise regression analysis with symmetrical fuzzy output variable. Comput Stat Data Anal 51:287–313
D’Urso P, Maharaj EA, Galagedera DUA (2010) Wavelets-based fuzzy clustering of time series. J Classif 27:231–275
D’Urso P, Massari R, Santoro A (2011) Robust fuzzy regression analysis. Inform Sci 181:4154–4174
D’Urso P, De Giovanni L, Massari R (2014) Self-organizing maps for imprecise data. Fuzzy Sets Syst 237:63–89
D’Urso P, De Giovanni L, Massari R (2015a) Trimmed fuzzy clustering for interval-valued data. Adv Data Anal Classif 8(1):21–40
D’Urso P, De Giovanni L, Massari R (2015b) Time series clustering by a robust autoregressive metric with application to air pollution. Chemom Intel Lab Syst 141(15):107–124
D’Urso P, De Giovanni L, Massari R (2016) GARCH-based robust fuzzy clustering of time series. Fuzzy Sets Syst 305:1–28
D’Urso P, De Giovanni L, Massari R, Cappelli C (2017a) Exponential distance-based fuzzy clustering for interval-valued data. Fuzzy Optim Decis Mak 16:51–70
D’Urso P, Maharaj EA, Alonso AM (2017b) Fuzzy clustering of time series using extremes. Fuzzy Sets Syst 318:56–79
Davé RN (1991) Characterization and detection of noise in clustering. Pattern Recogn Lett 12:657–664
De la Rosa de Sáa S, Gil MA, González-Rodríguez G, López MT, Lubiano MA (2016) Fuzzy rating scale-based questionnaires and their statistical analysis. IEEE Trans Fuzzy Syst 23(1):111–126
De Oliveira JV, Pedrycz W (2007) Advances in fuzzy clustering and its applications. Wiley, Chichester
Denoeux T (2011) Maximum likelihood estimation from fuzzy data using the EM algorithm. Fuzzy Sets Syst 183:72–91
Denoeux T, Masson MH (2000) Multidimensional scaling of interval-valued dissimilarity data. Pattern Recogn Lett 21:83–92
Denoeux T, Masson MH (2004a) EVCLUS: evidential clustering of proximity data. IEEE Trans Syst Man Cybern Part B-Cybern 34:95–109
Denoeux T, Masson MH (2004b) Principal component analysis of fuzzy data using autoassociative neural networks. IEEE Trans Fuzzy Syst 12:336–349
Diamond P (1988) Fuzzy least squares. Inform Sci 46:141–157
Disegna M, D’Urso P, Durante F (2017) Copula-based fuzzy clustering of spatial time series. Spat Stat 21:209–225
Dunn JC (1973) A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters. Cybern Syst/J Cybern 3(3):32–57
Dunn JC (1974) Well-separated clusters and optimal fuzzy partitions. Cybern Syst/J Cybern 4(1):95–104
Esogbue AO (1986) Optimal clustering of fuzzy data via fuzzy dynamic programming. Fuzzy Sets Syst 18(3):283–298
Féron R (1979) Sur les notions de distance et d’ecart dans une structure floue et leurs applications aux ensembles aléatoires flous. C R Acad Sci Paris A 289:35–38
Ferraro MB, Giordani P (2017) Possibilistic and fuzzy clustering methods for robust analysis of non-precise data. Int J Approx Reason 88:23–38
Ferraro MB, Vichi M (2015) Fuzzy double clustering: a robust proposal. In: Grzegorzewski P, Gagolewski M, Hryniewicz O, Gil MA (eds) Strengthening links between data analysis and soft computing. Springer, Cham, pp 225–232
Ferraro MB, Colubi A, González-Rodríguez G, Coppi R (2011) A determination coefficient for a linear regression model with imprecise response. Environmetrics 22(4):516–529
Fréchet M (1948) Les éléments aléatoires de nature quelconque dans un espace distancié. Ann L’Inst H Poincaré 10:215–310
Frigui H, Krishnapuram R (1996) A robust algorithm for automatic extraction of an unknown number of clusters from noisy data. Pattern Recogn Lett 17:1223–1232
Fritz H, García-Escudero LA, Mayo-Iscar A (2013) Robust constrained fuzzy clustering. Inform Sci 245:38–52
Gil MA (1992) Sufficiency and fuzziness in random experiments. Ann Inst Stat Math 44(3):451–462
Gil MA, Jain P (1992) Comparison of experiments in statistical decision problems with fuzzy utilities. IEEE Trans Syst Man Cybern 22(4):662–670
Gil MA, López-Díaz M (1996) Fundamentals and Bayesian analyses of decision problems with fuzzy-valued utilities. Int J Approx Reason 15(3):203–224
Gil MA, Corral N, Gil P (1988) The minimum inaccuracy estimates in \(\chi ^2\) tests for goodness of fit with fuzzy observations. J Stat Plan Inference 19(1):95–115
Gil MA, López-Díaz M, López-García H (1998) The fuzzy hyperbolic inequality index associated with fuzzy random variables. Eur J Oper Res 110(2):377–391
Gil MA, Montenegro M, González-Rodríguez G, Colubi A, Casals MR (2006) Bootstrap approach to the multi-sample test of means with imprecise data. Comput Stat Data Anal 51:148–162
Gil MA, Lubiano MA, de la Rosa de Sáa S, Sinova B (2015) Analyzing data from a fuzzy rating scale-based questionnaire. A case study. Psicothema 27(2):182–191
Giordani P (2010) Three-way analysis of imprecise data. J Multivar Anal 101:568–582
Giordani P, Kiers HAL (2006) A comparison of three methods for principal component analysis of fuzzy interval data. Comput Stat Data Anal 51:379–397
Gong MG, Su LZ, Jia M, Chen WS (2014) Fuzzy clustering with a modified MRF energy function for change detection in synthetic aperture radar images. IEEE Trans Fuzzy Syst 22(1):98–109
González-Rodríguez G, Blanco-Fernández A, Colubi A, Lubiano MA (2009) Estimation of a simple linear regression model for fuzzy random variables. Fuzzy Sets Syst 160(3):357–370
González-Rodríguez G, Colubi A, Gil MA (2012) Fuzzy data treated as functional data: a one-way ANOVA test approach. Comput Stat Data Anal 56(4):943–955
Grzegorzewski P (1998) Statistical inference about the median from vague data. Control Cybern 27(3):447–464
Grzegorzewski P, Hryniewicz O (2000) Soft methods in statistical quality control. Control Cybern 29(1):119–140
Grzegorzewski P, Szymanowski H (2014) Goodness-of-fit tests for fuzzy data. Inform Sci 288(1):374–386
Hathaway RJ, Bezdek JC (2001) Fuzzy c-means clustering of incomplete data. IEEE Trans Syst Man Cybern Part B-Cybern 31:735–744
Hathaway RJ, Bezdek JC, Pedrycz W (1996) A parametric model for fusing heterogeneous fuzzy data. IEEE Trans Fuzzy Syst 4(3):270–281
Havens TC, Bezdek JC, Leckie C, Hall LO, Palaniswami M (2012) Fuzzy c-means algorithms for very large data. IEEE Trans Fuzzy Syst 20:1130–1146
Hébert PA, Denoeux T, Masson MH (2006) Fuzzy multidimensional scaling. Comput Stat Data Anal 51:335–359
Hryniewicz O (2006) Possibilistic decisions and fuzzy statistical tests. Fuzzy Sets Syst 157(19):2665–2673
Huang Z, Ng MK (1999) A fuzzy k-modes algorithm for clustering categorical data. IEEE Trans Fuzzy Syst 7:446–452
Hung W-L, Lee J-S, Fuh C-D (2004) Fuzzy clustering based on intuitionistic fuzzy relations. Int J Uncertain Fuzz Know-Based Syst 12:513–529
Hwang C, Rhee F (2007) Uncertain fuzzy clustering: interval type-2 fuzzy approach to C-means. IEEE Trans Fuzzy Syst 15:107–120
Irpino A, Verde R, de Carvalho FAT (2017) Fuzzy clustering of distributional data with automatic weighting of variable components. Inform Sci 406–407:248–268
Jain A, Dubes R (1988) Algorithms for clustering data. Prentice-Hall, Upper Saddle River
Kesemen O, Tezel Ö, Özkul E (2016) Fuzzy c-means clustering algorithm for directional data (FCM4DD). Exp Syst Appl 58:76–82
Körner R (2000) An asymptotic \(\alpha \)-test for the expectation of random fuzzy variables. J Stat Plan Inference 83(2):331–346
Krishnapuram R, Keller J (1993) A possibilistic approach to clustering. IEEE Trans Fuzzy Syst 1:98–110
Kruse R (1984) Statistical estimation with linguistic data. Inform Sci 33(3):197–207
Kruse R (1987) On a software tool for statistics with linguistic data. Fuzzy Sets Syst 24(3):377–383
Kruse R, Meyer KD (1987) Statistics with vague data. Mathematical and statistical methods. Series theory and decision library B, vol 6. D. Reidel Pub Co., Dordrecht
Kruse R, Held P, Moewes C (2013) On fuzzy data analysis. In: Seising R, Trillas E, Moraga C, Termini S (eds) On fuzziness—a homage to Lotfi A. Zadeh, volume 1. Series studies in fuzziness and soft computing, vol 298. Springer, Heidelberg, pp 343–347
Kwakernaak H (1978) Fuzzy random variables, part I: definitions and theorems. Inform Sci 15:1–15
Kwakernaak H (1979) Fuzzy random variables, part II: algorithms and examples for the discrete case. Inform Sci 17:253–278
Laviolette M, Seaman JW, Barrett JD, Woodall WH (1995) A probabilistic and statistical view of fuzzy methods. Technometrics 37(3):249–261
Lee M, Pedrycz W (2009) The fuzzy C-means algorithm with fuzzy P-mode prototypes for clustering objects having mixed features. Fuzzy Sets Syst 160:3590–3600
Lertworaprachaya Y, Yang Y, John R (2014) Interval-valued fuzzy decision trees with optimal neighbourhood perimeter. Appl Soft Comput 24:851–866
Lingras P, West C (2004) Interval set clustering of web users with rough k-means. J Intel Inform Syst 23:5–16
Liu J (2010) Detecting the fuzzy clusters of complex networks. Pattern Recogn 43:1334–1345
Liu S, Matzavinos A, Sethuraman S (2013) Random walk distances in data clustering and applications. Adv Data Anal Classif 7(1):83–108
Lubiano MA, Gil MA (1999) Estimating the expected value of fuzzy random variables in random samplings from finite populations. Stat Pap 40(3):277–295
Lubiano MA, de la Rosa de Sáa S, Montenegro M, Sinova B, Gil MA (2016a) Descriptive analysis of responses to items in questionnaires. Why not using a fuzzy rating scale? Inform Sci 360:131–148
Lubiano MA, Montenegro M, Sinova B, de la Rosa de Sáa S, Gil MA (2016b) Hypothesis testing for means in connection with fuzzy rating scale-based data: algorithms and applications. Eur J Oper Res 251(3):918–929
Lubiano MA, Salas A, Carleos C, de la Rosa de Sáa S, Gil MA (2017) Hypothesis testing-based comparative analysis between rating scales for intrinsically imprecise data. Int J Approx Reason 88:128–147
Maharaj EA, D’Urso P (2011) Fuzzy clustering of time series in the frequency domain. Inform Sci 181:1187–1211
Miyamoto S, Ichihashi H, Honda K (2008) Algorithms for fuzzy clustering—methods in c-means clustering with applications. Springer, Berlin
Montenegro M, Casals MR, Lubiano MA, Gil MA (2001) Two-sample hypothesis tests of means of a fuzzy random variable. Inform Sci 133(1–2):89–100
Näther W (1997) Linear statistical inference for random fuzzy data. Statistics 29(3):221–240
Näther W (2006) Regression with random fuzzy data. Comput Stat Data Anal 51(1):235–252
Näther W, Albrecht M (1990) Linear regression with random fuzzy observations. Statistics 21(4):521–531
Nguyen-Trang T, Vo-Van T (2017) A new approach for determining the prior probabilities in the classification problem by Bayesian method. Adv Data Anal Classif 11(3):629–643
Okuda T, Tanaka H, Asai K (1978) A formulation of fuzzy decision problems with fuzzy information using probability measures of fuzzy events. Inform Control 38:135–147
Parchami A, Taheri SM, Mashinchi M (2009) Fuzzy \(p\)-value in testing fuzzy hypotheses with crisp data. Stat Pap 51(1):209–226
Pedrycz W (1998) Shadowed sets: representing and processing fuzzy sets. IEEE Trans Syst Man Cybern Part B-Cybern 28:103–109
Pedrycz W, Bezdek JC, Hathaway RJ, Rogers GW (1998) Two nonparametric models for fusing heterogeneous fuzzy data. IEEE Trans Fuzzy Syst 6(3):411–425
Pham DL (2001) Spatial models for fuzzy clustering. Comput Vis Image Underst 84:285–297
Puri ML, Ralescu DA (1985) The concept of normality for fuzzy random variables. Ann Probab 13:1373–1379
Puri ML, Ralescu DA (1986) Fuzzy random variables. J Math Anal Appl 114:409–422
Ramos-Guajardo AB, Lubiano MA (2012) K-sample tests for equality of variances of random fuzzy sets. Comput Stat Data Anal 56:956–966
Ramos-Guajardo AB, Colubi A, González-Rodríguez G, Gil MA (2010) One-sample tests for a generalized Fréchet variance of a fuzzy random variable. Metrika 71:185–202
Rocci R, Vichi M (2005) Three-mode component analysis with crisp or fuzzy partition of units. Psychometrika 70(4):715–736
Ross TJ, Booker JM, Parkinson WJ (eds) (2002) Fuzzy logic and probability applications: bridging the gap. ASA-SIAM series on statistics and applied probability. SIAM, Philadelphia
Ruan JH, Wang XP, Chan FTS, Shi Y (2016) Optimizing the intermodal transportation of emergency medical supplies using balanced fuzzy clustering. Int J Prod Res 54(14):4368–4386
Runkler TA, Bezdek JC (2003) Web mining with relational clustering. Int J Approx Reason 32:217–236
Ruspini EH (1969) A new approach to clustering. Inform Control 15:22–32
Ruspini EH (1970) Numerical methods for fuzzy clustering. Inform Sci 2:319–350
Shan J, Cheng HD, Wang Y (2012) A novel segmentation method for breast ultrasound images based on neutrosophic l-means clustering. Med Phys 39:5669–5682
Silva L, Moura E, Canuto AMP, Santiago RHN, Bedregal B (2015) An interval-based framework for fuzzy clustering applications. IEEE Trans Fuzzy Syst 23:2174–2186
Singpurwalla ND, Booker JM (2004) Membership functions and probability measures of fuzzy sets. J Am Stat Assoc 99(467):867–877
Sinova B, Gil MA, Van Aelst S (2016) M-estimates of location for the robust central tendency of fuzzy data. IEEE Trans Fuzzy Syst 24(4):945–956
Skala HJ (1975) Non-Archimedean utility theory. Series theory and decision library, vol 9. D. Reidel Pub Co., Dordrecht
Son LH (2015) DPFCM: a novel distributed picture fuzzy clustering method on picture fuzzy sets. Exp Syst Appl 42:51–66
Statistical science issue on artificial intelligence and expert systems (1987) Stat Sci 2(1):3–44
Tamura S, Higuchi S, Tanaka K (1971) Pattern classification based on fuzzy relations. IEEE Trans Syst Man Cybern 1:61–66
Tan T, Suk HW, Hwang H, Lim J (2013) Functional fuzzy clusterwise regression analysis. Adv Data Anal Classif 7(1):57–82
Tanaka H, Watada J (1988) Possibilistic linear systems and their application to the linear regression model. Fuzzy Sets Syst 27(3):275–289
Tanaka H, Uejima S, Asai K (1982) Linear regression analysis with fuzzy model. IEEE Trans Syst Man Cybern 12(6):903–907
Theodorou Y, Drossos C, Alevizos P (2007) Correspondence analysis with fuzzy data: the fuzzy eigenvalue problem. Fuzzy Sets Syst 158:704–721
Tokushige S, Yadohisa H, Inada K (2007) Crisp and fuzzy k-means clustering algorithms for multivariate functional data. Comput Stat 22:1–16
Viertl R (2006) Univariate statistical analysis with fuzzy data. Comput Stat Data Anal 51(1):133–147
Wang D (2004) A note on consistency and unbiasedness of point estimation with fuzzy data. Metrika 60:93–104
Watanabe N, Imaizumi T (1993) A fuzzy statistical test of fuzzy hypotheses. Fuzzy Sets Syst 53:167–178
Wu H-C (2005) Statistical hypotheses testing for fuzzy data. Inform Sci 279:446–459
Wu K-L, Yang M-S (2002) Alternative c-means clustering algorithms. Pattern Recogn 35(10):2267–2278
Yamashita N, Mayekawa S-I (2015) A new biplot procedure with joint classification of objects and variables by fuzzy c-means clustering. Adv Data Anal Classif 9(3):243–266
Yang M-S, Nataliani Y (2017) Robust-learning fuzzy c-means clustering algorithm with unknown number of clusters. Pattern Recogn 71:45–59
Yang M-S, Pan J-A (1997) On fuzzy clustering of directional data. Fuzzy Sets Syst 91:319–326
Yang M-S, Hwang P-Y, Chen D-H (2004) Fuzzy clustering algorithms for mixed feature variables. Fuzzy Sets Syst 141:301–317
Zadeh LA (1965) Fuzzy sets. Inform Control 8(3):338–353
Zadeh LA (1968) Probability measures of fuzzy events. J Math Anal Appl 23:421–427
Zadeh LA (1975a) The concept of a linguistic variable and its application to approximate reasoning. Part 1. Inform Sci 8:199–249
Zadeh LA (1975b) The concept of a linguistic variable and its application to approximate reasoning. Part 2. Inform Sci 8:301–353
Zadeh LA (1975c) The concept of a linguistic variable and its application to approximate reasoning. Part 3. Inform Sci 9:43–80
Zadeh LA (1995) Discussion: probability theory and fuzzy logic are complementary rather than competitive. Technometrics 37(3):271–276
Zadeh LA (2004) Comment: membership functions and probability measures of fuzzy sets. J Am Stat Assoc 99(467):880–881
Zadeh LA (2015) Fuzzy logic—a personal perspective. Fuzzy Sets Syst 281:4–20
Zhou J, Hung CC, Wang X, Chen S (2007) Fuzzy clustering based on credibility measure. In: Proceedings of the 6th international conference on management science, Lhasa, pp 404–411

Acknowledgements

We are deeply indebted to the ADAC Co-Editor, Professor Maurizio Vichi, for having kindly and patiently supported this initiative, which is dedicated to Professor Lotfi A. Zadeh, the father of fuzzy logic.

Author information

Authors and Affiliations

Department of Social Sciences and Economics, Sapienza Università di Roma, P.za Aldo Moro, 5, Rome, Italy
Pierpaolo D’Urso
Department of Statistics, OR and TM, Universidad de Oviedo, C/ Federico García Lorca, 18, Oviedo, Spain
María Ángeles Gil

Corresponding author

Correspondence to Pierpaolo D’Urso.

About this article

Cite this article

D’Urso, P., Gil, M.Á. Fuzzy data analysis and classification. Adv Data Anal Classif 11, 645–657 (2017). https://doi.org/10.1007/s11634-017-0304-z

Published: 16 November 2017
Issue Date: December 2017
DOI: https://doi.org/10.1007/s11634-017-0304-z
