Service Oriented Computing and Applications, Volume 12, Issue 2, pp 169–182

A Web service search engine for large-scale Web service discovery based on the probabilistic topic modeling and clustering

  • Afnan Bukhari
  • Xumin Liu
Original Research Paper


As the number of Web services keeps growing, discovering a Web service that matches a user's request has become a vital yet challenging task, and it calls for a scalable and efficient search engine that can handle a large volume of Web services. This work aims to provide such an engine, one that retrieves the most relevant Web services in a short time. The proposed Web service search engine (WSSE) integrates probabilistic topic modeling and clustering, which support each other in discovering the semantic meaning of Web services and reducing the search space. Latent Dirichlet allocation (LDA) is used to extract topics from Web service descriptions, and these topics are used to group similar Web services together. Each Web service description is represented as a topic vector, which reduces the dimensionality of the word vectors and exposes the semantic meaning hidden in the descriptions. Each description is also represented as a word vector to address the drawbacks of the keyword-based search system. The proposed WSSE is compared with a keyword-based search system in terms of accuracy, and precision and recall are used to evaluate the performance of both. The results show that the proposed WSSE based on LDA and clustering outperforms the keyword-based search system.
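The pipeline described above (topic extraction, grouping by topic, then searching a reduced space) can be illustrated with a minimal sketch. It is not the authors' implementation: the toy descriptions, the hyperparameters, the function names, the dominant-topic grouping (a stand-in for whatever clustering algorithm the paper uses), and the rough query-to-topic mapping are all assumptions made for illustration. It uses a small collapsed Gibbs sampler for LDA written against the Python standard library only.

```python
import math
import random
from collections import defaultdict

def tokenize(text):
    """Lowercase and keep alphabetic tokens only (no stemming or stop words)."""
    return [w for w in text.lower().split() if w.isalpha()]

def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA. Returns (theta, phi, word_id)."""
    rng = random.Random(seed)
    word_id = {w: i for i, w in enumerate(sorted({w for d in docs for w in d}))}
    n_words = len(word_id)
    doc_topic = [[0] * n_topics for _ in docs]             # topic counts per doc
    topic_word = [[0] * n_words for _ in range(n_topics)]  # word counts per topic
    topic_total = [0] * n_topics
    assign = [[rng.randrange(n_topics) for _ in doc] for doc in docs]
    for d, doc in enumerate(docs):
        for w, k in zip(doc, assign[d]):
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, k = word_id[w], assign[d][i]
                # Remove the current assignment, then resample the topic.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha) * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1
    theta = [[(c + alpha) / (len(doc) + n_topics * alpha) for c in counts]
             for counts, doc in zip(doc_topic, docs)]
    phi = [[(c + beta) / (topic_total[k] + n_words * beta) for c in topic_word[k]]
           for k in range(n_topics)]
    return theta, phi, word_id

def query_topics(query, phi, word_id):
    """Rough query topic vector: mean of p(topic | word), assuming a uniform topic prior."""
    n_topics = len(phi)
    known = [word_id[w] for w in tokenize(query) if w in word_id]
    vec = [0.0] * n_topics
    for v in known:
        total = sum(phi[k][v] for k in range(n_topics))
        for k in range(n_topics):
            vec[k] += phi[k][v] / total / len(known)
    return vec if known else [1.0 / n_topics] * n_topics

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def search(query, theta, phi, word_id, clusters, top_n=3):
    """Rank only the services in the query's dominant-topic cluster."""
    q = query_topics(query, phi, word_id)
    q_topic = max(range(len(q)), key=q.__getitem__)
    # Fall back to all services if no service has this dominant topic.
    candidates = clusters.get(q_topic) or list(range(len(theta)))
    ranked = sorted(((cosine(q, theta[d]), d) for d in candidates), reverse=True)
    return ranked[:top_n]

# Hypothetical toy corpus of Web service descriptions.
descriptions = [
    "send sms text message to mobile phone",
    "mobile sms gateway for bulk text message delivery",
    "weather forecast by city and temperature report",
    "current temperature and weather forecast for any city",
    "currency exchange rate conversion",
    "convert currency using daily exchange rate",
]
docs = [tokenize(d) for d in descriptions]
theta, phi, word_id = lda_gibbs(docs, n_topics=3)

# Group services by dominant topic (a simplification of the clustering step).
clusters = defaultdict(list)
for d, row in enumerate(theta):
    clusters[max(range(len(row)), key=row.__getitem__)].append(d)

results = search("weather forecast", theta, phi, word_id, clusters)
for score, d in results:
    print(f"{score:.3f}  {descriptions[d]}")
```

Because the query is compared only against services in one cluster, the number of similarity computations drops from the whole collection to the size of that cluster, which is the search-space reduction the abstract refers to. On a corpus this small the learned topics are noisy, so the ranking should be read as a demonstration of the data flow rather than of retrieval quality.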


Keywords: Web service; Discovery; Clustering; Topic model; Vector



Copyright information

© Springer-Verlag London Ltd., part of Springer Nature 2018

Authors and Affiliations

  1. Department of Information Technology, Taif University, Taif, Kingdom of Saudi Arabia
  2. Department of Computer Science, Rochester Institute of Technology, Rochester, USA
