
Fuzzy min–max neural networks for categorical data: application to missing data imputation

Abstract

The fuzzy min–max neural network classifier is a supervised learning method that combines neural networks with fuzzy systems. All input variables in the network must be continuously valued, which can be a significant constraint in many real-world situations where the data are categorical as well as quantitative. The usual way of dealing with this type of variable is to replace the categories with numerical values and treat them as if they were continuously valued. This method, however, implicitly defines a possibly unsuitable metric for the categories. A number of different procedures have been proposed to tackle the problem. In this article, we present a new method that extends the fuzzy min–max neural network input to categorical variables by introducing new fuzzy sets, a new operation, and a new architecture. This provides greater flexibility and wider application. The proposed method is then applied to missing data imputation in voting intention polls. The micro data of this type of poll (the set of the respondents' individual answers to the questions) are especially suited for evaluating the method, since they include a large number of numerical and categorical attributes.
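The point about an implicit metric can be illustrated with a minimal sketch. The category labels, the integer coding, and the helper functions below are hypothetical and chosen only for illustration. The indicator (one-hot) coding is shown as a common point of contrast; it is not the method proposed in the article.

```python
# Hypothetical categories for a voting-intention question (illustration only).
categories = ["party_A", "party_B", "party_C"]

# Arbitrary integer coding: party_A -> 0, party_B -> 1, party_C -> 2.
integer_code = {name: i for i, name in enumerate(categories)}


def integer_distance(a, b):
    """Distance implied by treating the integer codes as continuous values."""
    return abs(integer_code[a] - integer_code[b])


def one_hot(name):
    """Indicator vector with a single 1 at the position of the category."""
    return [1 if name == c else 0 for c in categories]


def one_hot_distance(a, b):
    """Hamming distance between the indicator vectors of two categories."""
    return sum(x != y for x, y in zip(one_hot(a), one_hot(b)))


# The integer coding makes party_A look closer to party_B than to party_C,
# although nothing in the data justifies that ordering.
print(integer_distance("party_A", "party_B"))  # 1
print(integer_distance("party_A", "party_C"))  # 2

# The indicator coding treats every pair of distinct categories alike.
print(one_hot_distance("party_A", "party_B"))  # 2
print(one_hot_distance("party_A", "party_C"))  # 2
```

Under the integer coding, the distances between categories depend entirely on the arbitrary order in which the labels were numbered, which is the kind of unsuitable metric the abstract refers to.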




Corresponding author

Correspondence to Pilar Rey-del-Castillo.


About this article

Cite this article

Rey-del-Castillo, P., Cardeñosa, J. Fuzzy min–max neural networks for categorical data: application to missing data imputation. Neural Comput & Applic 21, 1349–1362 (2012). https://doi.org/10.1007/s00521-011-0574-x


Keywords

  • Classification
  • Fuzzy systems
  • Fuzzy min–max neural networks
  • Imputation
  • Missing data