Performance Evaluation of Data Mining Techniques

Conference paper
Part of the Lecture Notes in Networks and Systems book series (LNNS, volume 9)

Abstract

Data mining has gained immense popularity in fields such as medicine, education and industry. It is the process of extracting useful information from large datasets and using it to predict outcomes. In this paper, we survey various data mining techniques. We then evaluate the performance of eight of them, namely decision tree, random forest, naive Bayes, AdaBoost, multilayer perceptron neural network, radial basis function, sequential minimal optimization and decision stump, on the UCI Communities and Crime dataset for classifying crime in US states. Based on the results obtained, we found that the decision tree outperforms the other techniques, with 96.4% accuracy and a minimal false-positive rate.
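The two measures reported above, accuracy and false-positive rate, can be illustrated with a minimal sketch. The labels below are invented for illustration and are not taken from the Communities and Crime dataset; the function names and the "high"/"low" classes are assumptions, and nothing here reproduces the evaluation procedure used in the paper.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive):
    """Return (TP, FP, TN, FN), treating `positive` as the positive class."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts["tp"], counts["fp"], counts["tn"], counts["fn"]


def accuracy(y_true, y_pred):
    """Fraction of instances whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def false_positive_rate(y_true, y_pred, positive):
    """FP / (FP + TN): share of true negatives wrongly flagged as positive."""
    _, fp, tn, _ = confusion_counts(y_true, y_pred, positive)
    return fp / (fp + tn) if (fp + tn) else 0.0


# Hypothetical labels (assumption): "high" vs "low" crime classes.
y_true = ["high", "low", "low", "high", "low", "low", "high", "low"]
y_pred = ["high", "low", "high", "high", "low", "low", "low", "low"]

print("accuracy:", accuracy(y_true, y_pred))
print("false-positive rate:", false_positive_rate(y_true, y_pred, "high"))
```

A classifier can score high accuracy while still flagging many negatives as positive, which is why the two figures are usually read together.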

Keywords

Data mining, Classification techniques, Decision tree

Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  1. USICT, Guru Gobind Singh Indraprastha University, Dwarka, India
  2. Ambedkar Institute of Advanced Communication Technologies and Research, Geeta Colony, India
