An Engineering Approach to Data Mining Projects

  • Conference paper
Intelligent Data Engineering and Automated Learning - IDEAL 2007 (IDEAL 2007)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4881)

Abstract

Both the number and the complexity of Data Mining projects have increased in recent years. Unfortunately, there is currently no formal process model for this kind of project, or the existing approaches are not suitable or complete enough. In some sense, the present situation is comparable to the one in software development that led to the ‘software crisis’ of the late 1960s. Software Engineering matured on the basis of process models and methodologies, and the evolution of Data Mining is running parallel to that of Software Engineering. The research work described in this paper proposes a Process Model for Data Mining Projects, taking as basic references current Software Engineering Process Models (IEEE Std 1074 and ISO 12207) and CRISP-DM, the most widely used Data Mining methodology (considered a de facto standard).

References

  1. Naur, P., Randell, B.: Software engineering: Report on NATO conference (1969)

  2. Piatetsky-Shapiro, G., Frawley, W.: Knowledge Discovery in Databases. AAAI/MIT Press, MA (1991)

  3. Chapman, P., Clinton, J., Kerber, R., Khabaza, T., Reinartz, T., Shearer, C., Wirth, R.: CRISP-DM 1.0 step-by-step data mining guide. Technical report, CRISP-DM (2000)

  4. Eisenfeld, B., Kolsky, E., Topolinski, T.: 42 percent of CRM software goes unused (February 2003), http://www.gartner.com

  5. Eisenfeld, B., Kolsky, E., Topolinski, T., Hagemeyer, D., Grigg, J.: Unused CRM software increases TCO and decreases ROI (February 2003), http://www.gartner.com

  6. Zornes, A.: The top 5 global 3000 data mining trends for 2003/04. META Group Research-Delta Summary, 2061 (March 2003)

  7. Edelstein, H.A., Edelstein, H.C.: Building, Using, and Managing the Data Warehouse. In: Data Warehousing Institute, 1st edn., Prentice Hall PTR, Englewood Cliffs (1997)

  8. Strand, M.: The Business Value of Data Warehouses - Opportunities, Pitfalls and Future Directions. PhD thesis, University of Skövde (December 2000)

  9. Gondar, J.E.: Metodología Del Data Mining. Data Mining Institute, S.L (2005)

  10. Pressman, R.: Software Engineering: A Practitioner’s Approach. McGraw-Hill, New York (2005)

  11. Moore, J.: Software Engineering Standards: A User’s Road Map. IEEE, CA (1998)

  12. Fayyad, U., Piatetsky-Shapiro, G., Smyth, P., Uthurusamy, R.: Advances in Knowledge Discovery and Data Mining. AAAI/MIT Press, MA (1996)

  13. Two Crows Corp.: Introduction to Data Mining and Knowledge Discovery. 3rd edn. (1999)

  14. SAS Institute: SEMMA data mining methodology (2005), http://www.sas.com

  15. de Martínez Pisón, F.J.: Optimización Mediante Técnicas de Minería de Datos Del Ciclo de Recocido de Una Línea de Galvanizado. PhD thesis, Universidad de La Rioja (2003)

  16. Solarte, J.: A proposed data mining methodology and its application to industrial engineering. Master’s thesis, University of Tennessee, Knoxville (2002)

  17. IEEE: Standard for Developing Software Life Cycle Processes. IEEE Std. 1074-1997. IEEE Computer Society, New York (USA) (1991)

  18. ISO: ISO/IEC Standard 12207:1995. Software Life Cycle Processes. International Organization for Standardization, Geneva (Switzerland) (1995)

Editor information

Hujun Yin, Peter Tino, Emilio Corchado, Will Byrne, Xin Yao

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Marbán, Ó., Mariscal, G., Menasalvas, E., Segovia, J. (2007). An Engineering Approach to Data Mining Projects. In: Yin, H., Tino, P., Corchado, E., Byrne, W., Yao, X. (eds) Intelligent Data Engineering and Automated Learning - IDEAL 2007. IDEAL 2007. Lecture Notes in Computer Science, vol 4881. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77226-2_59

  • DOI: https://doi.org/10.1007/978-3-540-77226-2_59

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-77225-5

  • Online ISBN: 978-3-540-77226-2

  • eBook Packages: Computer Science, Computer Science (R0)
