API Prober – A Tool for Analyzing Web API Features and Clustering Web APIs

  • Shang-Pin Ma
  • Ming-Jen Hsu
  • Hsiao-Jung Chen
  • Yu-Sheng Su
Conference paper
Part of the Lecture Notes on Data Engineering and Communications Technologies book series (LNDECT, volume 41)


Web services are attracting more and more attention. Many companies expose their data or services by publishing Web APIs (Application Programming Interfaces) so that users can create innovative services or applications. To ease the use of varied and complex APIs, multiple API directory services and API search engines, such as Mashape, API Harmony, and ProgrammableWeb, have emerged in recent years. However, most of these systems help developers only with understanding Web APIs. They neither provide usage examples nor help users understand the “closeness” between APIs. Therefore, we propose a system, referred to as API Prober, that addresses these issues by constructing an API “dictionary”. API Prober has three main features. First, it transforms OAS (OpenAPI Specification 2.0) documents into a graph structure in a Neo4j database and annotates each graph node with semantic concepts by using LDA (Latent Dirichlet Allocation) and WordNet. Second, by parsing source code on GitHub, it retrieves code examples that use the APIs. Third, it classifies APIs by performing cluster analysis on the OAS documents. Finally, the experimental results show that API Prober can produce appropriate service clusters.
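The first feature, turning an OAS 2.0 document into a graph, can be illustrated with a minimal sketch. The abstract does not describe the actual graph schema, so the node labels (`API`, `Path`, `Operation`, `Parameter`, `Response`), the edge names, and the function `oas_to_graph` are all assumptions made for illustration. The sketch builds plain in-memory node and edge lists rather than writing to Neo4j, and it omits the semantic annotation step.

```python
from typing import Any

# HTTP methods that OAS 2.0 allows as operation keys under a path item.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch"}


def oas_to_graph(spec: dict[str, Any]) -> tuple[list[dict], list[tuple[int, str, int]]]:
    """Flatten an OAS 2.0 document into node and edge lists.

    Hypothetical schema (not taken from the paper):
    API -> Path -> Operation -> Parameter / Response.
    Each edge is (source node index, relation name, target node index).
    """
    nodes: list[dict] = []
    edges: list[tuple[int, str, int]] = []

    def add_node(label: str, **props: Any) -> int:
        nodes.append({"label": label, **props})
        return len(nodes) - 1

    info = spec.get("info", {})
    api = add_node("API", title=info.get("title", ""),
                   description=info.get("description", ""))

    for path, item in spec.get("paths", {}).items():
        p = add_node("Path", path=path)
        edges.append((api, "HAS_PATH", p))
        for method, op in item.items():
            if method.lower() not in HTTP_METHODS:
                continue  # skip path-level "parameters", "$ref", and extensions
            o = add_node("Operation", method=method.upper(),
                         summary=op.get("summary", ""))
            edges.append((p, "HAS_OPERATION", o))
            for param in op.get("parameters", []):
                q = add_node("Parameter", name=param.get("name", ""),
                             location=param.get("in", ""))
                edges.append((o, "HAS_PARAMETER", q))
            for status in op.get("responses", {}):
                r = add_node("Response", status=str(status))
                edges.append((o, "HAS_RESPONSE", r))
    return nodes, edges


# A tiny made-up OAS 2.0 document used only to exercise the function.
spec = {
    "swagger": "2.0",
    "info": {"title": "Pet Store", "description": "Sample API"},
    "paths": {
        "/pets": {
            "get": {
                "summary": "List pets",
                "parameters": [{"name": "limit", "in": "query", "type": "integer"}],
                "responses": {"200": {"description": "OK"}},
            }
        }
    },
}

nodes, edges = oas_to_graph(spec)
```

In a setup like the one the abstract describes, text fields such as `description` and `summary` would be the natural inputs for topic extraction and clustering, and the node and edge lists could be loaded into a graph database instead of being kept in memory.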


Keywords: Web API analysis, Semantic annotation, GitHub, Cluster analysis



This research was sponsored by the Ministry of Science and Technology in Taiwan under grant MOST 108-2221-E-019-026-MY3.



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Shang-Pin Ma (1, corresponding author)
  • Ming-Jen Hsu (1)
  • Hsiao-Jung Chen (1)
  • Yu-Sheng Su (1)
  1. National Taiwan Ocean University, Keelung, Taiwan
