Data Mining with Neural Networks and Support Vector Machines Using the R/rminer Tool

  • Paulo Cortez
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6171)

Abstract

We present rminer, our open source library for the R tool that facilitates the use of data mining (DM) algorithms, such as neural networks (NNs) and support vector machines (SVMs), in classification and regression tasks. Tutorial examples with real-world problems (satellite image analysis and prediction of car prices) are used to demonstrate the rminer capabilities and the advantages of NNs and SVMs. Additional experiments were also conducted to test the predictive capabilities of rminer, revealing competitive performance.

Keywords

Classification, Regression, Sensitivity Analysis, Neural Networks, Support Vector Machines

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Paulo Cortez (1)
  1. Department of Information Systems/R&D Centre Algoritmi, University of Minho, Guimarães, Portugal
