Abstract
The fuzzy min–max neural network classifier is a supervised learning method that combines neural networks with fuzzy systems. All of its input variables must be continuously valued, which is a significant constraint in many real-world situations where the data are categorical as well as quantitative. The usual workaround is to replace the categories with numerical values and treat them as if they were continuously valued, but this implicitly defines a metric on the categories that may be unsuitable. A number of different procedures have been proposed to tackle the problem. In this article, we present a new method that extends the input of the fuzzy min–max neural network to categorical variables by introducing new fuzzy sets, a new operation, and a new architecture, giving greater flexibility and wider applicability. The proposed method is then applied to missing data imputation in voting intention polls. The micro data of this type of poll (the set of the respondents’ individual answers to the questions) are especially suited for evaluating the method, since they include a large number of numerical and categorical attributes.
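The point about an implicitly defined metric can be illustrated with a small sketch. The category names below are hypothetical and are not taken from the article's poll data. The one-hot comparison is included only as a contrast; it is not the method proposed in the article.

```python
# Hypothetical categories, chosen only for illustration.
categories = ["party_A", "party_B", "party_C"]

# Integer coding: each category is replaced by an arbitrary number.
integer_code = {c: i for i, c in enumerate(categories)}


def integer_distance(x, y):
    """Distance implied by treating the integer codes as continuous values."""
    return abs(integer_code[x] - integer_code[y])


def one_hot(c):
    """Indicator vector for a category (used here only as a contrast)."""
    return [1.0 if c == k else 0.0 for k in categories]


def one_hot_distance(x, y):
    """Euclidean distance between the indicator vectors of two categories."""
    return sum((a - b) ** 2 for a, b in zip(one_hot(x), one_hot(y))) ** 0.5


pairs = [
    ("party_A", "party_B"),
    ("party_A", "party_C"),
    ("party_B", "party_C"),
]
integer_distances = {p: integer_distance(*p) for p in pairs}
one_hot_distances = {p: one_hot_distance(*p) for p in pairs}

for p in pairs:
    print(p, integer_distances[p], one_hot_distances[p])
```

Under integer coding, party_A ends up twice as far from party_C as from party_B, purely because of the order in which the codes were assigned. The indicator coding keeps every pair of distinct categories equally distant.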
Cite this article
Rey-del-Castillo, P., Cardeñosa, J. Fuzzy min–max neural networks for categorical data: application to missing data imputation. Neural Comput & Applic 21, 1349–1362 (2012). https://doi.org/10.1007/s00521-011-0574-x
DOI: https://doi.org/10.1007/s00521-011-0574-x