Predicting Code Runtime Complexity Using ML Techniques

Conference paper
Advances in Computing and Information (ERCICA 2023)

Abstract

Every problem in computer science can be solved by several algorithms, and these approaches may use different techniques and reasoning to reach the same result. The difficulty is that some algorithms perform poorly as the number of inputs increases. Several metrics can be used to assess the quality of code, and runtime complexity is one of them. Determining runtime complexity requires substantial study and a thorough understanding of algorithms, which makes it a challenging manual task. In this study, the worst-case runtime complexity of code written in C, Java and Python is predicted as Big-O notation using code features such as Abstract Syntax Trees, machine learning approaches and static code analysis. The novelty of the research lies in a manually constructed, labelled runtime-complexity dataset, the application of deep learning models such as Bi-LSTM, and the prediction of worst-case Big-O complexity for code in three languages: C, Java and Python. Compared with traditional methods such as manually asserting runtime complexity or running code on inputs of different sizes, the proposed approach predicts the runtime complexity of a given code more effectively. The results show that the XGBoost classifier outperforms the other models with an accuracy of 96%. The study can be extended to other high-level programming languages, larger training sets and the use of graph neural networks.
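
As a rough illustration of the kind of pipeline the abstract describes, the sketch below parses Python source into an Abstract Syntax Tree, derives a few simple static features (loop count, call count, maximum loop nesting depth) and fits an XGBoost classifier to Big-O labels. This is a minimal sketch under stated assumptions, not the authors' implementation: the paper's actual feature set, labelled dataset and hyperparameters are not reproduced here, and the toy snippets and labels below are purely hypothetical.

```python
# Minimal illustrative sketch, NOT the authors' pipeline: parse Python source
# into an AST, derive a few static features and fit an XGBoost classifier to
# toy Big-O labels. Features, snippets and labels here are assumptions.
import ast

import numpy as np
from xgboost import XGBClassifier


def extract_features(source: str) -> list:
    """Return [loop count, call count, max loop nesting depth] for the code."""
    tree = ast.parse(source)
    n_loops = sum(isinstance(n, (ast.For, ast.While)) for n in ast.walk(tree))
    n_calls = sum(isinstance(n, ast.Call) for n in ast.walk(tree))

    def max_loop_depth(node, depth=0):
        depths = [
            max_loop_depth(child, depth + isinstance(child, (ast.For, ast.While)))
            for child in ast.iter_child_nodes(node)
        ]
        return max(depths, default=depth)

    return [n_loops, n_calls, max_loop_depth(tree)]


# Toy labelled snippets standing in for a real runtime-complexity dataset.
samples = [
    ("def f(a):\n    return a[0]", "O(1)"),
    ("def f(a):\n    s = 0\n    for x in a:\n        s += x\n    return s", "O(n)"),
    ("def f(a):\n    c = 0\n    for x in a:\n        for y in a:\n            c += 1\n    return c", "O(n^2)"),
] * 10  # repeated so every class has enough rows to fit the model

classes = sorted({label for _, label in samples})
X = np.array([extract_features(code) for code, _ in samples])
y = np.array([classes.index(label) for _, label in samples])

model = XGBClassifier(n_estimators=50, max_depth=3)
model.fit(X, y)

test_code = "def g(a):\n    for i in a:\n        print(i)"
predicted = model.predict(np.array([extract_features(test_code)]))[0]
print("predicted worst-case complexity:", classes[predicted])
```

The sketch only conveys the overall feature-extraction-plus-classification structure; the paper itself covers C, Java and Python and reports richer AST-derived features and additional models such as Bi-LSTM.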

Author information

Corresponding author

Correspondence to Nidhi Gupta.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Deepa Shree, C.V., Kotian, J.D., Gupta, N., Adyapak, N.M., Ananthanagu, U. (2024). Predicting Code Runtime Complexity Using ML Techniques. In: Shetty, N.R., Prasad, N.H., Nalini, N. (eds) Advances in Computing and Information. ERCICA 2023. Lecture Notes in Electrical Engineering, vol 1104. Springer, Singapore. https://doi.org/10.1007/978-981-99-7622-5_26

  • DOI: https://doi.org/10.1007/978-981-99-7622-5_26

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-7621-8

  • Online ISBN: 978-981-99-7622-5

  • eBook Packages: Engineering, Engineering (R0)
