
Finding Suitable Data Mining Techniques for Software Development Effort Estimation

  • Conference paper
Intelligent Computing (SAI 2023)

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 739)



Every new project in an organization goes through an analysis process. The data gathered during that analysis serve as the cornerstone for important decisions about complexity, resources, frameworks, schedules, costs, and so on. Numerous methods have been developed over time to simplify the project analysis phase, but most of them still fall short on the accuracy of their results. Without a precise analysis technique in place, even initiatives with clear goals may unravel in their later stages. Software project management still struggles to produce accurate and trustworthy estimates of software effort, particularly in the early stages of the software life cycle, when the available information is more categorical than numerical. Predicting the number of person-hours or person-months required for software development is regarded as a difficult task in Software Effort Estimation (SEE). Overestimating or underestimating the software effort results in project cancellation or project failure. Although useful, sizing tools and methods derived from function points do not take into account the unique project management culture of a business. Because of these shortcomings, data mining techniques have been investigated in recent years as an alternative estimation method. This research proposes a combined approach of functional size measurement and three data mining methods for effort estimation at the initial stage of projects: Generalized Linear Models (GLM), Deep Learning Neural Networks (DLNN), and Decision Trees - Gradient Boosting Machine (GBM). The estimation accuracies of these models were compared to assess their potential value for implementation within businesses. Additionally, a combined strategy that blends the outputs of the several algorithms is recommended to enhance prediction accuracy and reduce the risk of overfitting.
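The combined strategy mentioned at the end of the abstract can be illustrated with a minimal sketch. The abstract does not say how the model outputs are mixed, so the equal-weight average below is an assumption, and the function names `combine_predictions` and `mean_absolute_error` as well as all the effort figures are hypothetical and chosen only for illustration; they are not results from the paper.

```python
from statistics import mean


def combine_predictions(model_outputs, weights=None):
    """Blend per-project effort estimates from several models.

    model_outputs: dict mapping a model name to a list of estimates
    (one per project, all lists the same length).
    weights: optional dict of non-negative weights per model; equal
    weights are assumed when omitted (an assumption, not the paper's rule).
    """
    names = list(model_outputs)
    n_projects = len(model_outputs[names[0]])
    if any(len(model_outputs[name]) != n_projects for name in names):
        raise ValueError("every model must estimate the same projects")
    if weights is None:
        weights = {name: 1.0 for name in names}
    total = sum(weights[name] for name in names)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Weighted average of the model estimates, project by project.
    return [
        sum(weights[name] * model_outputs[name][i] for name in names) / total
        for i in range(n_projects)
    ]


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and estimated effort."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


# Made-up person-hour figures for illustration only.
actual_effort = [100.0, 200.0, 300.0]
estimates = {
    "glm": [90.0, 210.0, 330.0],
    "dlnn": [120.0, 190.0, 270.0],
    "gbm": [105.0, 215.0, 315.0],
}

blended = combine_predictions(estimates)
for name, predicted in estimates.items():
    print(name, round(mean_absolute_error(actual_effort, predicted), 2))
print("blended", round(mean_absolute_error(actual_effort, blended), 2))
```

In this toy data the individual models err in different directions, so averaging cancels part of the error. That is the usual intuition behind blending estimators, although real gains depend on how correlated the models' errors are.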





I, Julius Olufemi Ogunleye (the author), would like to express my gratitude to Ass. Prof. Zdenka Prokopova and Ass. Prof. Petr Silhavy for their support and guidance in making this research possible. This work was supported by the Faculty of Applied Informatics, Tomas Bata University in Zlín, under Project IGA/CebiaTech/2023/001.

Author information


Corresponding author

Correspondence to Julius Olufemi Ogunleye.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Ogunleye, J.O. (2023). Finding Suitable Data Mining Techniques for Software Development Effort Estimation. In: Arai, K. (eds) Intelligent Computing. SAI 2023. Lecture Notes in Networks and Systems, vol 739. Springer, Cham.
