Using evolution strategy for cooperative focused crawling on semantic web

  • Original Article
Neural Computing and Applications

Abstract

Conventional focused crawling systems have difficulty retrieving contextual information in a semantic web environment. To address this problem, we propose a cooperative crawler platform based on evolution strategy that builds the semantic structure (i.e., local ontologies) of web spaces. Multiple crawlers discover semantic instances (i.e., ontology fragments) from annotated resources in a web space, and a centralized meta-crawler incrementally aggregates the semantic instances sent by these crawlers. To do this, we use a similarity-based ontology matching algorithm to compute the semantic fitness of a population, i.e., the sum of all possible semantic similarities between the semantic instances. As a result, we can efficiently obtain the best mapping condition (i.e., the one that maximizes the semantic fitness) of the estimated semantic structures. This paper makes two significant contributions: (1) reconciling semantic conflicts between multiple crawlers, and (2) adapting to the evolving semantic structures of web spaces over time.
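The semantic fitness described above (a sum of similarities over all pairs of semantic instances) can be illustrated with a minimal Python sketch. It is not the paper's implementation: the representation of a semantic instance as a set of terms, the Jaccard measure standing in for the similarity-based ontology matching algorithm, and the names `jaccard`, `semantic_fitness`, and `best_population` are all assumptions made for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Stand-in similarity between two term sets.

    Assumption: the paper uses a similarity-based ontology matching
    algorithm; Jaccard overlap is only a simple placeholder for it.
    """
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def semantic_fitness(instances, sim=jaccard):
    """Sum the similarity over every unordered pair of semantic instances."""
    return sum(sim(x, y) for x, y in combinations(instances, 2))


def best_population(candidates, sim=jaccard):
    """Return the candidate population with the highest semantic fitness.

    Hypothetical helper: it only shows the 'maximize the fitness'
    selection idea, not a full evolution strategy with mutation.
    """
    return max(candidates, key=lambda pop: semantic_fitness(pop, sim))
```

A meta-crawler in this sketch would score each candidate aggregation of the fragments it received and keep the one that `best_population` returns; any other pairwise similarity function can be passed in through the `sim` parameter.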

Notes

  1. Web Ontology Language, http://www.w3.org/TR/owl-features/

  2. JGAP, http://www.jgap.sourceforge.net/

  3. Alignment API, http://www.alignapi.gforge.inria.fr/

  4. This test bed was obtained from [16].

References

  1. Aggarwal CC, Al-Garawi F, Yu PS (2001) On the design of a learning crawler for topical resource discovery. ACM Trans Inf Syst 19(3):286–309

  2. Beyer H-G, Schwefel H-P (2002) Evolution strategies—a comprehensive introduction. Natural Comput 1(1):3–52

  3. Buitelaar P, Cimiano P, Magnini B (2005) Ontology learning from text: methods, evaluation and applications, vol 123 of frontiers in artificial intelligence and applications. IOS Press, Amsterdam

  4. Cantú-Paz E (2002) Order statistics and selection methods of evolutionary algorithms. Inf Process Lett 82(1):15–22

  5. Chakrabarti S, van den Berg M, Dom B (1999) Focused crawling: a new approach to topic-specific web resource discovery. Comput Netw 31(11–16):1623–1640

  6. Claypool M, Brown D, Le P, Waseda M (2001) Inferring user interest. IEEE Internet Comput 5(6):32–39

  7. De Bra PME, Post RDJ (1994) Information retrieval in the World-Wide Web: making client-based searching feasible. Comput Netw ISDN Syst 27(2):183–192

  8. Euzenat J (1995) Building consensual knowledge bases: context and architecture. In: Proceedings of the 2nd international conference on building and sharing very large-scale knowledge bases (KBKS), pp 143–155. IOS Press, Amsterdam

  9. Euzenat J, Valtchev P (2004) Similarity-based ontology alignment in OWL-Lite. In: Proceedings of the 16th European conference on artificial intelligence, pp 333–337

  10. Flouris G (2006) On belief change in ontology evolution. AI Commun 19(4):395–397

  11. Haase P, Hotho A, Schmidt-Thieme L, Sure Y (2005) Collaborative and usage-driven evolution of personal ontologies. In: Gómez-Pérez A, Euzenat J (eds) Proceedings of the second European semantic web conference (ESWC 2005), Heraklion, Crete, Greece, May 29–June 1, 2005. Lecture Notes in Computer Science, vol 3532. Springer, Heidelberg, pp 486–499

  12. Haase P, Stojanovic L (2005) Consistent evolution of OWL ontologies. In: Gómez-Pérez A, Euzenat J (eds) Proceedings of the second European semantic web conference (ESWC 2005), Heraklion, Crete, Greece, May 29–June 1, 2005. Lecture Notes in Computer Science, vol 3532. Springer, Heidelberg, pp 182–197

  13. Heflin J, Hendler JA (2001) A portrait of the semantic web in action. IEEE Intell Syst 16(2):54–59

  14. Hotho A, Jäschke R, Schmitz C, Stumme G (2006) Information retrieval in folksonomies: search and ranking. In: Sure Y, Domingue J (eds) ESWC. Lecture Notes in Computer Science, vol 4011. Springer, Heidelberg, pp 411–426

  15. Jung JJ (2005) Collaborative web browsing based on semantic extraction of user interests with bookmarks. J Univers Comput Sci 11(2):213–228

  16. Jung JJ (2007) Exploiting semantic annotation to supporting user browsing on the web. Knowl Based Syst 20(4):373–381

  17. Jung JJ (2007) Ontological framework based on contextual mediation for collaborative information retrieval. Inf Retrieval 10(1):85–109

  18. Kelly D, Teevan J (2003) Implicit feedback for inferring user preference: a bibliography. SIGIR Forum 37(2):18–28

  19. Kim J (2005) Meta-level patterns for interactive knowledge capture. In: Proceedings of the 3rd international conference on knowledge capture (K-CAP ’05). ACM Press, New York, pp 207–208

  20. Kleinberg JM (1999) Authoritative sources in a hyperlinked environment. J ACM 46(5):604–632

  21. Liu H, Milios E, Janssen J (2004) Probabilistic models for focused web crawling. In: Proceedings of the 6th annual ACM international workshop on Web information and data management (WIDM 2004). ACM Press, New York, pp 16–22

  22. Menczer F, Pant G, Srinivasan P (2004) Topical web crawlers: evaluating adaptive algorithms. ACM Trans Internet Technol 4(4):378–419

  23. Nguyen NT (2006) Conflicts of ontologies—classification and consensus-based methods for resolving. In: Gabrys B, Howlett RJ, Jain LC (eds) Proceedings of the 10th international conference on knowledge-based intelligent information and engineering systems (KES 2006). Lecture Notes in Computer Science, vol 4252. Springer, Heidelberg, pp 267–274

  24. Noy NF, Chugh A, Liu W, Musen MA (2006) A framework for ontology evolution in collaborative environments. In: Cruz IF, Decker S, Allemang D, Preist C, Schwabe D, Mika P, Uschold M, Aroyo L (eds) Proceedings of the 5th international semantic Web conference (ISWC 2006). Lecture Notes in Computer Science, vol 4273. Springer, Heidelberg, pp 544–558

  25. Noy NF, Klein MCA (2004) Ontology evolution: not the same as schema evolution. Knowl Inf Syst 6(4):428–440

  26. Noy NF, Musen MA (2000) Prompt: algorithm and tool for automated ontology merging and alignment. In: Proceedings of the 17th national conference on artificial intelligence and twelfth conference on innovative applications of artificial intelligence, July 30–August 3. AAAI Press/The MIT Press, Austin, pp 450–455

  27. Plessers P, De Troyer O, Casteleyn S (2007) Understanding ontology evolution: a change detection approach. J Web Semant 5(1):39–49

  28. Staab S (2002) Emergent semantics. IEEE Intell Syst 17(1):78–86

  29. Uren VS, Cimiano P, Iria J, Handschuh S, Vargas-Vera M, Motta E, Ciravegna F (2006) Semantic annotation for knowledge management: requirements and a survey of the state of the art. J Web Semant 4(1):14–28

  30. White RW, Jose JM, Ruthven I (2006) An implicit feedback approach for interactive information retrieval. Inf Process Manage 42(1):166–190

Author information

Corresponding author

Correspondence to Jason J. Jung.

About this article

Cite this article

Jung, J.J. Using evolution strategy for cooperative focused crawling on semantic web. Neural Comput & Applic 18, 213–221 (2009). https://doi.org/10.1007/s00521-008-0173-7

  • DOI: https://doi.org/10.1007/s00521-008-0173-7
