JARAD: An Approach for Java API Mention Recognition and Disambiguation in Stack Overflow

  • Conference paper
Collaborative Computing: Networking, Applications and Worksharing (CollaborateCom 2023)

Abstract

Invoking APIs is a common way to improve the efficiency of software development. Developers often discuss problems they encounter, or share their experience of using APIs, in communities such as Stack Overflow and GitHub. To avoid duplicate discussion of issues and to support downstream tasks such as API recommendation and API mining, it is necessary to recognize the APIs mentioned in these communities and link them to their fully qualified names. This is commonly referred to as the task of API mention recognition and disambiguation in informal text, and it is the main focus of our paper. We start from Java posts in Stack Overflow and analyze the proportion of posts that discuss APIs (API Posts for short), whether by short name or fully qualified name, as well as the characteristics of API Posts. We also extract the APIs associated with more than 30,000 posts in Stack Overflow and automatically establish \(\langle post, APIs \rangle\) pairs to construct the dataset JAPD. Finally, we propose a novel approach, JARAD, to infer the APIs associated with a post. In our approach, we first use BiLSTM and CRF to fuse context information from text and code snippets and obtain a set of associated API candidates. Each candidate API is then scored by the frequency with which its API type appears in the post, in order to infer the API's fully qualified name. Our evaluation experiments show that JARAD achieves 71.58%, 76.84% and 74.12% in Precision, Recall and F1, respectively.
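
The disambiguation step described in the abstract, scoring each candidate by how often its type appears in the post, might be sketched as follows. This is an illustrative sketch only, not the authors' implementation: the catalogue `CANDIDATES`, the helpers `type_name` and `disambiguate`, the tokenization, and the tie-breaking rule are all assumptions, and the BiLSTM-CRF recognition stage is assumed to have already produced the short API name.

```python
import re
from collections import Counter
from typing import Optional

# Hypothetical catalogue: short method name -> candidate fully qualified names.
# A real system would derive this from the Java API documentation.
CANDIDATES = {
    "add": ["java.util.List.add", "java.util.Set.add", "java.math.BigInteger.add"],
    "put": ["java.util.Map.put", "java.util.Hashtable.put"],
}


def type_name(fqn: str) -> str:
    """Return the simple type name, e.g. 'List' for 'java.util.List.add'."""
    return fqn.split(".")[-2]


def disambiguate(short_name: str, post_text: str) -> Optional[str]:
    """Pick the candidate whose type name occurs most often in the post.

    Assumption: identifiers in both prose and code snippets are counted
    alike; ties fall back to catalogue order.
    """
    candidates = CANDIDATES.get(short_name, [])
    if not candidates:
        return None
    # Count every identifier-like token in the post (text and code together).
    tokens = Counter(re.findall(r"[A-Za-z_]\w*", post_text))
    # max() keeps the first candidate on ties, i.e. catalogue order.
    return max(candidates, key=lambda fqn: tokens[type_name(fqn)])


post = "I create a List<String> and call add, but the List stays empty. Should I use a Set?"
print(disambiguate("add", post))
```

Here `List` occurs twice in the post and `Set` once, so the mention `add` resolves to the `List` candidate. Counting exact type-name tokens is a deliberately simple stand-in for the frequency score; it ignores subtypes such as `ArrayList`, which a fuller approach would need to handle.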

Notes

  1. https://anonymous.4open.science/r/JARAD-EDAE

  2. https://archive.org/details/stackexchange

  3. https://stackoverflow.com/tags

  4. https://docs.oracle.com/javase/8/docs/api/

Acknowledgements

This work was supported in part by the High Performance Computing Center of Central South University.

Author information

Corresponding author

Correspondence to Li Kuang.

Copyright information

© 2024 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Liang, Q., Jin, Y., Xie, Q., Kuang, L., Sheng, Y. (2024). JARAD: An Approach for Java API Mention Recognition and Disambiguation in Stack Overflow. In: Gao, H., Wang, X., Voros, N. (eds) Collaborative Computing: Networking, Applications and Worksharing. CollaborateCom 2023. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 561. Springer, Cham. https://doi.org/10.1007/978-3-031-54521-4_15

  • DOI: https://doi.org/10.1007/978-3-031-54521-4_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-54520-7

  • Online ISBN: 978-3-031-54521-4

  • eBook Packages: Computer Science, Computer Science (R0)
