Extending Enterprise Service Design Knowledge Using Clustering

  • Marcus Roy
  • Ingo Weber
  • Boualem Benatallah
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7636)


Automatically constructing or completing knowledge bases of SOA design knowledge puts traditional clustering approaches beyond their limits. We propose an approach to amend incomplete knowledge bases of Enterprise Service (ES) design knowledge, based on a set of ES signatures. The approach employs clustering, complemented with various filtering and ranking techniques to identify potentially new entities. We implemented and evaluated the approach, and show that it significantly improves the detection of entities compared to a state-of-the-art clustering technique. Ultimately, extending an existing knowledge base with entities is expected to further improve ES search result quality.


Keywords: Directed Acyclic Graph, Service Design, Hierarchical Agglomerative Cluster, Naming Convention, Name Entity Recognition



Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Marcus Roy (1, 2)
  • Ingo Weber (2, 3)
  • Boualem Benatallah (2)

  1. SAP Research, Sydney, Australia
  2. School of Computer Science & Engineering, University of New South Wales, Australia
  3. Software Systems Research Group, NICTA, Sydney, Australia
