Ontology-Learning-Based Focused Crawling for Online Service Advertising Information Discovery and Classification

  • Hai Dong
  • Farookh Khadeer Hussain
  • Elizabeth Chang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7636)


Online advertising has become increasingly popular among small and medium-sized enterprises (SMEs) in service industries, and thousands of service advertisements are published on the Internet every day. However, there is a large gap between service-provider-oriented information publishing and service-consumer-oriented information discovery, so service consumers can hardly retrieve published service advertising information from the Internet. This issue results partly from the ubiquitous, heterogeneous, and ambiguous nature of service advertising information and from the open and boundless Web environment. Existing research, nevertheless, rarely addresses this problem. In this paper, we propose an ontology-learning-based focused crawling approach that takes the characteristics of service advertising information into account and enables Web-crawler-based discovery and classification of online service advertising information. The approach integrates an ontology-based focused crawling framework, a vocabulary-based ontology learning framework, and a hybrid mathematical model for service advertising information similarity computation.
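As a rough illustration of the general idea of focused crawling with concept-based classification, the following minimal sketch crawls a tiny in-memory link graph, scores each page against per-concept vocabularies, and reports the harvest rate (taken here to mean the fraction of visited pages judged relevant). It is not the approach proposed in the paper: the pages, the concept vocabularies, the threshold, and the simple vocabulary-overlap score are all hypothetical stand-ins, and the paper's hybrid similarity model and ontology learning step are not reproduced.

```python
import heapq
import re

# Hypothetical toy "Web": URL -> (page text, outgoing links). Not from the paper.
PAGES = {
    "a": ("Airport taxi and shuttle transport service", ["b", "c"]),
    "b": ("Cheap taxi cab hire and airport transfer", ["d"]),
    "c": ("Celebrity gossip and movie news", ["d"]),
    "d": ("Hotel room accommodation booking service", []),
}

# Hypothetical service concepts, each described by a small vocabulary.
CONCEPTS = {
    "Taxi": {"taxi", "cab", "airport", "transfer", "shuttle"},
    "Accommodation": {"hotel", "room", "accommodation", "booking"},
}


def tokens(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def similarity(page_terms, concept_terms):
    """Stand-in score: fraction of the concept vocabulary found in the page."""
    return len(page_terms & concept_terms) / len(concept_terms)


def classify(text):
    """Return the best-matching concept name and its score for a page."""
    terms = tokens(text)
    scores = {name: similarity(terms, vocab) for name, vocab in CONCEPTS.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


def crawl(seed, threshold=0.4):
    """Best-first crawl: links found on higher-scoring pages are visited first."""
    frontier = [(-1.0, seed)]  # min-heap, so priorities are negated scores
    seen = {seed}
    labels = {}
    visited = 0
    while frontier:
        _, url = heapq.heappop(frontier)
        text, links = PAGES[url]
        visited += 1
        concept, score = classify(text)
        if score >= threshold:
            labels[url] = concept  # page is judged relevant and classified
        for link in links:
            if link not in seen:
                seen.add(link)
                heapq.heappush(frontier, (-score, link))
    harvest_rate = len(labels) / visited
    return labels, harvest_rate


labels, harvest_rate = crawl("a")
print(labels, harvest_rate)
```

In this toy setup the off-topic page is still fetched, but last, because its parent's score only determines visiting order; a real crawler would more likely bound the crawl by a page budget or a priority cutoff.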


Support Vector Machine; Service Description; Service Consumer; Harvest Rate; Service Concept
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Hai Dong (1)
  • Farookh Khadeer Hussain (2)
  • Elizabeth Chang (1)
  1. School of Information Systems, Curtin University of Technology, Australia
  2. School of Software, University of Technology, Sydney, Australia
