Automatically Identifying Calling-Prone Higher-Order Functions of Scala Programs to Assist Testers

Xu, Yi-Sen; Jia, Xiang-Yang; Wu, Fan; Li, Lingbo; Xuan, Ji-Feng

doi:10.1007/s11390-020-0526-y

Regular Paper
Published: 30 November 2020

Volume 35, pages 1278–1294, (2020)
Journal of Computer Science and Technology

Yi-Sen Xu¹, Xiang-Yang Jia¹, Fan Wu², Lingbo Li² & Ji-Feng Xuan¹

Abstract

With the rapid development of internetware, functional programming languages such as Haskell and Scala can be used to implement complex domain-specific applications. In functional programming languages, a higher-order function is a function that takes functions as parameters or returns a function. Using higher-order functions in programs can increase the generality and reduce the redundancy of source code. To test a higher-order function, a tester needs to check the requirements and write another function as the test input. However, due to the complex structure of higher-order functions, testing them is a time-consuming and labor-intensive task. Testing all higher-order functions requires considerable manual effort, which is infeasible when the time budget is limited, such as in the period before a project release. In practice, not every higher-order function is actually called. We refer to higher-order functions that are about to be called as calling-prone ones. Calling-prone higher-order functions should be tested first. In this paper, we propose an automatic approach, namely Phof, which predicts whether a higher-order function of Scala programs will be called in the future, i.e., identifies calling-prone higher-order functions. Our approach can help testers reduce the number of higher-order functions of Scala programs under test. In Phof, we extract 24 features from source code and logs to train a predictive model based on known higher-order function calls. We empirically evaluated our approach on 4,832 higher-order functions from 27 real-world Scala projects. Experimental results show that Phof based on the random forest algorithm and the Synthetic Minority Oversampling Technique (SMOTE) processing strategy performs well in predicting calls of higher-order functions. Our work can be used to support the scheduling of limited test resources.
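To make the definition in the abstract concrete, the following minimal Scala sketch shows the two kinds of higher-order function it mentions, and why testing one requires writing another function as the test input. All names here (`HigherOrderDemo`, `mapWhere`, `adder`) are hypothetical illustrations and are not taken from the paper or from Phof.

```scala
// Illustrative sketch only: every name below is hypothetical, not from the paper.
object HigherOrderDemo {
  // Takes functions as parameters: applies `f` to each element accepted by `keep`,
  // and leaves the other elements unchanged.
  def mapWhere[A](xs: List[A])(keep: A => Boolean)(f: A => A): List[A] =
    xs.map(x => if (keep(x)) f(x) else x)

  // Returns a function: builds a function that adds `n` to its argument.
  def adder(n: Int): Int => Int = (x: Int) => x + n

  def main(args: Array[String]): Unit = {
    // To test mapWhere, the tester must first write functions to pass in as inputs.
    val isEven: Int => Boolean = _ % 2 == 0
    val result = mapWhere(List(1, 2, 3, 4))(isEven)(adder(10))
    assert(result == List(1, 12, 3, 14))
    println(result)
  }
}
```

Even for this small case, the test needs two extra function values (`isEven` and the result of `adder(10)`) before any assertion can be written, which hints at the manual effort the abstract describes.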

Author information

Authors and Affiliations

School of Computer Science, Wuhan University, Wuhan, 430072, China
Yi-Sen Xu, Xiang-Yang Jia & Ji-Feng Xuan
Turing Intelligence Technology Limited, London, EC2Y 9ST, U.K.
Fan Wu & Lingbo Li

Corresponding author

Correspondence to Ji-Feng Xuan.

Supplementary Information

ESM 1 (PDF 103 kb)

About this article

Cite this article

Xu, YS., Jia, XY., Wu, F. et al. Automatically Identifying Calling-Prone Higher-Order Functions of Scala Programs to Assist Testers. J. Comput. Sci. Technol. 35, 1278–1294 (2020). https://doi.org/10.1007/s11390-020-0526-y

Received: 11 April 2020
Revised: 31 October 2020
Published: 30 November 2020
Issue Date: November 2020
DOI: https://doi.org/10.1007/s11390-020-0526-y
