A Survey on Open Source Data Mining Tools for SMEs

  • Conference paper
New Advances in Information Systems and Technologies

Part of the book series: Advances in Intelligent Systems and Computing ((AISC,volume 444))

Abstract

Data mining tools are software dedicated to scanning data repositories, generating information, and discovering knowledge. Currently, traditional data processing tools and their applications are not capable of managing the massive amounts of data available inside SME environments. Therefore, it is critical to use effective and efficient data mining tools, which represent valuable support for SME decision-making. In this paper we describe and analyze seven popular open source data mining tools: KEEL, KNIME, Orange, RapidMiner, R Project, Tanagra and WEKA.

Author information

Corresponding author

Correspondence to Pedro Almeida.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Almeida, P., Bernardino, J. (2016). A Survey on Open Source Data Mining Tools for SMEs. In: Rocha, Á., Correia, A., Adeli, H., Reis, L., Mendonça Teixeira, M. (eds) New Advances in Information Systems and Technologies. Advances in Intelligent Systems and Computing, vol 444. Springer, Cham. https://doi.org/10.1007/978-3-319-31232-3_24

  • DOI: https://doi.org/10.1007/978-3-319-31232-3_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-31231-6

  • Online ISBN: 978-3-319-31232-3

  • eBook Packages: Engineering, Engineering (R0)
