Abstract
A web Application Programming Interface (API) allows third-party and subscribed users to access the data and functions of a software application over a network or the Internet. Web APIs expose data and functions to public users, authorized users, or enterprise users. Web API providers publish API documentation to help users understand how to interact with web-based API services and how to use the APIs in their integration systems. The exponential rise in the number of public web service APIs may make it challenging for software engineers to choose an efficient API, and the challenge may grow when API providers update their web APIs regularly. In this paper, we introduce a novel transformation-based approach that crawls the web to collect web API documentation (unstructured documents). It generates a web API language model from the API documentation, employs different machine learning algorithms to extract information, and produces a structured web API specification that is compliant with the OpenAPI Specification (OAS) format. The proposed approach improves information extraction patterns and learns the variety of structures and terminologies. In our experiment, we collect a large number of web API documents. Our evaluation shows that the proposed approach finds RESTful API documentation with 75% accuracy, constructs API endpoints with 84% accuracy, constructs endpoint attributes with 95% accuracy, and assigns endpoints to attributes with 98% accuracy. The proposed approach was able to produce more than 2,311 OAS web API specifications.
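To make the target output concrete, the following is a minimal sketch of how extracted endpoints and their attributes might be assembled into an OAS-style document. It is not the authors' implementation: the function `build_oas`, the record layout in `extracted`, and the example paths are all hypothetical, and the extraction step itself (crawling, language model, classifiers) is assumed to have already run. Only the top-level OpenAPI 3.0 fields (`openapi`, `info`, `paths`, `parameters`, `responses`) follow the published specification.

```python
import json


def build_oas(title, version, endpoints):
    """Assemble a minimal OpenAPI 3.0 document from extracted endpoint records.

    `endpoints` is a hypothetical intermediate format: one dict per endpoint
    with a method, a path, an optional description, and a list of attributes.
    """
    paths = {}
    for ep in endpoints:
        operation = {
            "summary": ep.get("description", ""),
            "parameters": [
                {
                    "name": attr["name"],
                    # Assumption: attributes default to query parameters.
                    "in": attr.get("in", "query"),
                    "required": attr.get("required", False),
                    "schema": {"type": attr.get("type", "string")},
                }
                for attr in ep.get("attributes", [])
            ],
            "responses": {"200": {"description": "Successful response"}},
        }
        # Group operations by path, keyed by lowercase HTTP method.
        paths.setdefault(ep["path"], {})[ep["method"].lower()] = operation
    return {
        "openapi": "3.0.0",
        "info": {"title": title, "version": version},
        "paths": paths,
    }


# Hypothetical extraction results, standing in for the learned pipeline output.
extracted = [
    {
        "method": "GET",
        "path": "/users/{id}",
        "description": "Fetch one user",
        "attributes": [
            {"name": "id", "in": "path", "required": True, "type": "string"}
        ],
    },
    {
        "method": "POST",
        "path": "/users",
        "description": "Create a user",
        "attributes": [{"name": "notify", "type": "boolean"}],
    },
]

spec = build_oas("Example API", "1.0.0", extracted)
print(json.dumps(spec, indent=2))
```

In this sketch, assigning an attribute to an endpoint amounts to placing it in that endpoint's `parameters` list, which is one plausible reading of the final pipeline stage described above.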
Keywords
- Web API service
- REST API
- Natural language processing
- Machine learning
Notes
- Available at: https://www.runmyprocess.com.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Bahrami, M., Chen, WP. (2020). Automated Web Service Specification Generation Through a Transformation-Based Learning. In: Wang, Q., Xia, Y., Seshadri, S., Zhang, LJ. (eds) Services Computing – SCC 2020. SCC 2020. Lecture Notes in Computer Science, vol 12409. Springer, Cham. https://doi.org/10.1007/978-3-030-59592-0_7
DOI: https://doi.org/10.1007/978-3-030-59592-0_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59591-3
Online ISBN: 978-3-030-59592-0
eBook Packages: Computer Science, Computer Science (R0)