Webcrawling for a Biological Strategy Corpus to Support Biologically-Inspired Design

  • D. Vandevenne
  • J. Caicedo
  • P.-A. Verhaegen
  • S. Dewulf
  • J. R. Duflou
Conference paper


As part of a larger effort to develop a tool that supports ideation in the early stage of Biologically-Inspired Design (BID), this paper addresses the first key research question: how to obtain the large corpus of biological strategies that any scalable approach to such a tool requires. This corpus should contain as much as possible of the world's knowledge about how organisms tackle problems, and it should be updated automatically. However, no such resource or system currently exists. This paper presents a scalable webcrawling approach that continuously searches the Internet for biological strategies and keeps its knowledge base up to date without manual intervention. The webcrawler solves this needle-in-a-haystack task by combining different classifiers to score the relevance of web documents to the envisaged corpus. It uses these scores to focus future crawling and thereby gain efficiency. In this way, new biological strategy documents can be harvested continuously and at scale. Finally, the possible applications of this contribution are positioned with respect to the existing approaches for systematic BID.
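
The score-driven crawl described above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy in-memory web, the two stand-in classifiers (`text_score`, `url_score`), the term list, the weights, and the threshold are all assumptions chosen for illustration. It only shows the general idea of combining classifier scores and using them to order the crawl frontier.

```python
import heapq

# Hypothetical toy "web": URL -> (page text, outgoing links). No network access.
TOY_WEB = {
    "seed": ("index of pages", ["bio/gecko", "shop/cart", "news/sports"]),
    "bio/gecko": ("gecko species adhesion organism setae strategy",
                  ["bio/lotus", "shop/cart"]),
    "bio/lotus": ("lotus leaf organism repels water strategy", []),
    "shop/cart": ("buy now discount checkout", ["shop/deals"]),
    "shop/deals": ("discount deals checkout", []),
    "news/sports": ("football match score", []),
}

# Assumed vocabulary standing in for a trained text classifier.
BIO_TERMS = {"organism", "species", "strategy", "adhesion", "leaf"}


def text_score(text):
    """Stand-in content classifier: fraction of words that are biology terms."""
    words = text.lower().split()
    if not words:
        return 0.0
    return sum(w in BIO_TERMS for w in words) / len(words)


def url_score(url):
    """Stand-in URL classifier: 1.0 if the URL hints at biology, else 0.0."""
    return 1.0 if "bio" in url else 0.0


def relevance(url, text, w_text=0.7, w_url=0.3):
    """Combine the classifier scores with assumed weights."""
    return w_text * text_score(text) + w_url * url_score(url)


def focused_crawl(seeds, budget, threshold=0.3):
    """Visit at most `budget` pages, most promising first."""
    # heapq is a min-heap, so priorities are stored negated.
    frontier = [(-1.0, s) for s in seeds]
    heapq.heapify(frontier)
    seen = set(seeds)
    visited, corpus = [], []
    while frontier and len(visited) < budget:
        _, url = heapq.heappop(frontier)
        text, links = TOY_WEB.get(url, ("", []))
        visited.append(url)
        score = relevance(url, text)
        if score >= threshold:
            corpus.append(url)
        for link in links:
            if link not in seen:
                seen.add(link)
                # Unfetched links inherit half of the parent's score and get
                # half from the URL-only classifier (an assumed heuristic).
                priority = 0.5 * score + 0.5 * url_score(link)
                heapq.heappush(frontier, (-priority, link))
    return visited, corpus


if __name__ == "__main__":
    visited, corpus = focused_crawl(["seed"], budget=4)
    print("visited:", visited)
    print("corpus:", corpus)
```

With a budget of four pages, the sketch fetches both `bio/` pages before any shop or news page and never reaches `shop/deals`, which is the efficiency gain that focusing is meant to deliver.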



Copyright information

© Springer-Verlag London 2013

Authors and Affiliations

  • D. Vandevenne (1)
  • J. Caicedo (1)
  • P.-A. Verhaegen (1)
  • S. Dewulf (1)
  • J. R. Duflou (1)

  1. Department of Mechanical Engineering, Centre for Industrial Management, K.U.Leuven, Belgium
