Abstract
Probabilistic topic models were originally developed in information retrieval for topic extraction and document modeling. In this paper, we explore three probabilistic topic models, Probabilistic Latent Semantic Analysis (PLSA), Latent Dirichlet Allocation (LDA) and the Correlated Topic Model (CTM), to extract latent factors from web service descriptions. The extracted latent factors are then used to group the services into clusters. In our approach, topic models serve as efficient dimension reduction techniques that capture the semantic relationships between words and topics and between topics and services, expressed as probability distributions. To address the limitations of keyword-based queries, we represent web service descriptions in a vector space and introduce a new approach for discovering web services using latent factors. In our experiments, we compare the accuracy of the three probabilistic clustering algorithms (PLSA, LDA and CTM) with that of a classical clustering algorithm. We also evaluate our service discovery approach by computing the precision at n (P@n) and the normalized discounted cumulative gain (NDCGn). The results show that the approaches based on CTM and LDA perform better than the other search methods.
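The two ranking measures named in the abstract can be illustrated with a minimal sketch. It assumes one common formulation (graded relevance discounted by log2 of the rank plus one, with the ideal ordering taken from the returned list itself); the exact variant used in the paper is not stated in the abstract, and the function names below are hypothetical.

```python
import math
from typing import Sequence


def precision_at_n(relevances: Sequence[float], n: int) -> float:
    """Fraction of the top-n ranked services that are relevant (relevance > 0).

    Assumption: the denominator is always n, even if fewer than n
    services were returned.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return sum(1 for r in relevances[:n] if r > 0) / n


def dcg_at_n(relevances: Sequence[float], n: int) -> float:
    """Discounted cumulative gain over the top-n results.

    The result at 1-based rank k contributes rel_k / log2(k + 1);
    enumerate() is 0-based, hence log2(i + 2).
    """
    return sum(r / math.log2(i + 2) for i, r in enumerate(relevances[:n]))


def ndcg_at_n(relevances: Sequence[float], n: int) -> float:
    """DCG normalized by the DCG of the best possible ordering.

    Assumption: the ideal ordering is built only from the relevance
    grades of the returned list, not from the whole collection.
    """
    ideal = dcg_at_n(sorted(relevances, reverse=True), n)
    return dcg_at_n(relevances, n) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical relevance grades for a ranked list of four services.
    ranked = [3, 0, 2, 1]
    print(precision_at_n(ranked, 2))
    print(ndcg_at_n(ranked, 4))
```

A ranking that already lists services in decreasing order of relevance scores an NDCG of 1.0 under this formulation, while P@n ignores the order within the top n and only counts how many of them are relevant.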
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Aznag, M., Quafafou, M., Rochd, E.M., Jarir, Z. (2013). Probabilistic Topic Models for Web Services Clustering and Discovery. In: Lau, KK., Lamersdorf, W., Pimentel, E. (eds) Service-Oriented and Cloud Computing. ESOCC 2013. Lecture Notes in Computer Science, vol 8135. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40651-5_3
DOI: https://doi.org/10.1007/978-3-642-40651-5_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40650-8
Online ISBN: 978-3-642-40651-5
eBook Packages: Computer Science, Computer Science (R0)