An empirical study on API usages from code search engine and local library

Empirical Software Engineering

Abstract

To help programmers find proper API methods and learn API usages, researchers have proposed various code search engines. Given an API of interest, a code search engine retrieves code samples of that API from online software repositories. Through such tools, Internet code has become a major resource for learning API usages. Besides Internet code, local library code also contains API usages, and researchers have found that many API usages in library code are more concise than those in client code. As samples from a code search engine are typically client code, it is interesting to explore the API usages inside library code. However, the samples inside library code contain internal method invocations, which introduce compilation errors when they are called directly from the client side. If an empirical study does not remove them from API usages, it can significantly overestimate the API usages from library code. Due to this challenge, no prior study has analyzed API usages inside libraries, and many research questions are still open. For example, how many API usages are there inside libraries? The answers are useful to motivate future research on APIs and code search engines. To support the exploration of such questions, in this paper we propose CodeEx, which extracts Internet code samples from a popular code search engine and extracts local code samples by removing internal usages from library code. With the support of CodeEx, we conduct the first empirical study on the API usages of five libraries, and summarize our results into six findings that answer five research questions. Our results can help researchers motivate their future work. For example, our results show that although the code samples from library code are only half as many as those from the code search engine, they cover 4.0 times more API classes, 4.7 times more API methods, and 3.0 times more call sequences. Meanwhile, in a controlled experiment, we compare the effectiveness of the two sources in assisting programming. We find that more API usages do not lead to more completed tasks, which highlights the importance of code recommendation approaches.
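
As a rough illustration of the filtering step described in the abstract, the following Python sketch removes internal invocations from call sequences before counting usages. It is not the implementation of CodeEx, which the abstract does not detail. The `Call` record, the `is_public_api` flag, and the function names are hypothetical, and a real tool would have to derive visibility from the library's source code or bytecode.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Call:
    """One method invocation found inside a library method body."""
    owner: str           # fully qualified name of the declaring class
    method: str          # name of the invoked method
    is_public_api: bool  # hypothetical flag: True if client code may call it


def strip_internal_calls(sequence: List[Call]) -> List[Call]:
    """Keep only the invocations that client code could compile against."""
    return [call for call in sequence if call.is_public_api]


def count_usages(sequences: List[List[Call]]) -> int:
    """Count call sequences that are still non-empty after filtering.

    Counting the raw sequences instead would overestimate the usages,
    because sequences made only of internal calls would be included.
    """
    cleaned = [strip_internal_calls(seq) for seq in sequences]
    return sum(1 for seq in cleaned if seq)
```

Under these assumptions, a sequence that mixes public and internal calls keeps only its public calls, and a sequence that consists solely of internal calls no longer counts as an API usage.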

Acknowledgments

We thank the reviewers for their insightful comments. Hao Zhong is sponsored by the National Natural Science Foundation of China under Grants No. 62232003 and 62272295. Xiaoyin Wang is supported in part by NSF Grant CCF-1846467.

Author information

Corresponding author

Correspondence to Hao Zhong.

Additional information

Communicated by: Ali Ouni

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhong, H., Wang, X. An empirical study on API usages from code search engine and local library. Empir Software Eng 28, 63 (2023). https://doi.org/10.1007/s10664-023-10304-z

  • DOI: https://doi.org/10.1007/s10664-023-10304-z