Neural Computing and Applications, Volume 18, Issue 7, pp 689–706

A joint investigation of misclassification treatments and imbalanced datasets on neural network performance

  • Jyh-shyan Lan
  • Victor L. Berardi
  • B. Eddy Patuwo
  • Michael Hu
Original Article

Abstract

Two important factors that affect a classification model’s performance are imbalanced data and unequal misclassification costs. Both are especially important for neural network models developed to estimate the posterior probabilities of group membership used in classification decisions. This paper explores the effects of asymmetric misclassification costs and unbalanced group sizes on neural network classification performance using an artificial data approach that can generate more complex datasets than those used in prior studies and that adds new insights into the problem and the results. A different performance measure, one that directly measures how consistent classification performance is with the Bayes decision rule, is used. The results show that asymmetric misclassification costs and imbalanced group sizes each have significant effects on neural network classification performance, both independently and through their interaction. These effects are not always intuitive; they supplement prior findings and raise issues for future research.
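As a rough illustration of the Bayes decision rule the abstract refers to, the following minimal sketch picks the group with the lower expected misclassification cost for a two-group problem. It is not taken from the paper: the function name, argument names, and example cost values are assumptions made here for illustration, and the posterior probability is assumed to come from some already-trained model.

```python
def bayes_decision(p1, cost_1_as_2, cost_2_as_1):
    """Return 1 or 2: the group with the lower expected misclassification cost.

    p1          -- estimated posterior probability of group 1 (hypothetical input,
                   e.g. a neural network output); group 2 has probability 1 - p1
    cost_1_as_2 -- cost of assigning a true group-1 case to group 2 (assumed value)
    cost_2_as_1 -- cost of assigning a true group-2 case to group 1 (assumed value)
    """
    # Assigning to group 1 is wrong only when the case truly belongs to group 2.
    expected_cost_assign_1 = (1.0 - p1) * cost_2_as_1
    # Assigning to group 2 is wrong only when the case truly belongs to group 1.
    expected_cost_assign_2 = p1 * cost_1_as_2
    # Ties are resolved in favour of group 1 (an arbitrary choice for this sketch).
    return 1 if expected_cost_assign_1 <= expected_cost_assign_2 else 2


if __name__ == "__main__":
    # With equal costs the rule reduces to choosing the more probable group.
    print(bayes_decision(0.3, 1.0, 1.0))
    # Making a missed group-1 case five times as costly shifts the decision.
    print(bayes_decision(0.3, 5.0, 1.0))
```

Under this rule a case is assigned to group 1 whenever p1 is at least cost_2_as_1 / (cost_1_as_2 + cost_2_as_1), so asymmetric costs move the decision threshold away from 0.5.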

Keywords

Artificial intelligence, Neural network, Decision analysis, Bayesian classification, Imbalanced datasets, Misclassification costs

Copyright information

© Springer-Verlag London Limited 2009

Authors and Affiliations

  • Jyh-shyan Lan (1)
  • Victor L. Berardi (2, 3)
  • B. Eddy Patuwo (2)
  • Michael Hu (2)

  1. Providence University, Taichung, Taiwan
  2. Graduate School of Management, Kent State University, Kent, USA
  3. Canton, USA
