
Automatically Identifying Calling-Prone Higher-Order Functions of Scala Programs to Assist Testers

  • Regular Paper
  • Journal of Computer Science and Technology

Abstract

To support the rapid development of internetware, functional programming languages such as Haskell and Scala can be used to implement complex domain-specific applications. In a functional programming language, a higher-order function is a function that takes functions as parameters or returns a function. Using higher-order functions can increase the generality and reduce the redundancy of source code. To test a higher-order function, a tester needs to check the requirements and write another function to serve as the test input. Because higher-order functions have a complex structure, testing them is time-consuming and labor-intensive, and testing all of them requires considerable manual effort. Such testing is infeasible when the time budget is limited, for example in the period just before a project release. In practice, not every higher-order function is actually called. We refer to higher-order functions that are about to be called as calling-prone; these should be tested first. In this paper, we propose an automatic approach, named Phof, which predicts whether a higher-order function in a Scala program will be called in the future, i.e., it identifies calling-prone higher-order functions. Our approach can help testers reduce the number of higher-order functions of Scala programs that need to be tested. In Phof, we extract 24 features from source code and logs to train a predictive model based on known higher-order function calls. We empirically evaluated our approach on 4,832 higher-order functions from 27 real-world Scala projects. Experimental results show that Phof, when based on the random forest algorithm and the Synthetic Minority Oversampling Technique (SMOTE), performs well in predicting calls of higher-order functions. Our work can be used to support the scheduling of limited test resources.
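As an illustration of the definition in the abstract, the following minimal Scala sketch shows the two kinds of higher-order function (one taking a function as a parameter, one returning a function) and why a tester has to write an extra function as the test input. The names `HigherOrderDemo`, `applyToAll`, and `makeAdder` are hypothetical and do not come from the paper or from the studied projects.

```scala
object HigherOrderDemo {
  // Higher-order by parameter: applies the function `f` to every element of `xs`.
  def applyToAll(xs: List[Int], f: Int => Int): List[Int] = xs.map(f)

  // Higher-order by result: returns a new function that adds a fixed offset.
  def makeAdder(offset: Int): Int => Int = (x: Int) => x + offset

  def main(args: Array[String]): Unit = {
    // To test applyToAll, the tester must first write a function as the test input.
    val double: Int => Int = _ * 2
    assert(applyToAll(List(1, 2, 3), double) == List(2, 4, 6))

    // To test makeAdder, the tester must also exercise the function it returns.
    val addTen = makeAdder(10)
    assert(addTen(5) == 15)

    println("all checks passed")
  }
}
```

Even in this small case, each test needs a hand-written input function or a second call on the returned function, which hints at why testing every higher-order function in a project is costly.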

Author information

Corresponding author

Correspondence to Ji-Feng Xuan.

Supplementary Information

ESM 1

(PDF 103 kb)

Cite this article

Xu, YS., Jia, XY., Wu, F. et al. Automatically Identifying Calling-Prone Higher-Order Functions of Scala Programs to Assist Testers. J. Comput. Sci. Technol. 35, 1278–1294 (2020). https://doi.org/10.1007/s11390-020-0526-y

  • DOI: https://doi.org/10.1007/s11390-020-0526-y
