Towards Seamless Analysis of Software Interoperability: Automatic Identification of Conceptual Constraints in API Documentation

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9839)


Building successful and meaningful interoperation with external software APIs requires satisfying their conceptual interoperability constraints. These constraints, which we call COINs, include structural, dynamic, and quality specifications that, if missed, lead to costly mismatches and delayed projects. However, for software architects and analysts, manually analyzing the unstructured text of API documents to identify conceptual interoperability constraints is a tedious and time-consuming task that requires knowledge of the constraint types. In this paper, we present our empirically based research addressing these issues with machine learning techniques. We started with a multiple-case study through which we contributed a ground-truth dataset. We then built a model for this dataset and tested its robustness in experiments with different machine-learning text-classification algorithms. The results show that our model achieves 70.4% precision and 70.2% recall in identifying seven classes of constraints (i.e., Syntax, Semantic, Structure, Dynamic, Context, Quality, and Not-COIN). This improves to 81.9% precision and 82.0% recall when identifying two classes (i.e., COIN and Not-COIN). Finally, we implemented a tool prototype to demonstrate the value of our findings for architects in a practical context.
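The precision and recall figures above are the standard per-class text-classification metrics. As a minimal illustrative sketch (using invented example labels, not the paper's dataset or classifier), per-class precision and recall over predicted constraint labels can be computed as follows:

```python
from collections import Counter

def per_class_precision_recall(y_true, y_pred):
    """Compute precision and recall for each label in a multi-class setting."""
    tp = Counter()  # true positives per class
    fp = Counter()  # false positives per class
    fn = Counter()  # false negatives per class
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1
    labels = set(y_true) | set(y_pred)
    return {
        label: {
            "precision": tp[label] / (tp[label] + fp[label]) if tp[label] + fp[label] else 0.0,
            "recall": tp[label] / (tp[label] + fn[label]) if tp[label] + fn[label] else 0.0,
        }
        for label in labels
    }

# Invented two-class example in the spirit of the paper's COIN / Not-COIN split.
truth = ["COIN", "COIN", "Not-COIN", "Not-COIN", "COIN"]
pred  = ["COIN", "Not-COIN", "Not-COIN", "COIN", "COIN"]
scores = per_class_precision_recall(truth, pred)
```

In the example above, the COIN class gets precision 2/3 (two of three COIN predictions were correct) and recall 2/3 (two of three true COINs were found); the paper's reported figures are averages of such per-class scores over seven classes or two classes, respectively.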


Interoperability analysis · Conceptual constraints · Black-box interoperation · API documentation · Empirical study · Machine learning



This work is supervised by Prof. Dieter Rombach and funded by the Ph.D. program of the CS Department of the University of Kaiserslautern. We thank Mohammed Abufouda and the anonymous reviewers for their valuable comments and feedback.



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  1. University of Kaiserslautern, Kaiserslautern, Germany
