Chapter

Innovations in Hybrid Intelligent Systems

Volume 44 of the series Advances in Soft Computing pp 433-438

Focused Crawling for Retrieving Chemical Information

  • Zhaojie Xia, State Key Laboratory of Multiphase Reactions, Institute of Process Engineering, Chinese Academy of Sciences
  • Li Guo, State Key Laboratory of Multiphase Reactions, Institute of Process Engineering, Chinese Academy of Sciences
  • Chunyang Liang, State Key Laboratory of Multiphase Reactions, Institute of Process Engineering, Chinese Academy of Sciences
  • Xiaoxia Li, State Key Laboratory of Multiphase Reactions, Institute of Process Engineering, Chinese Academy of Sciences
  • Zhangyuan Yang, State Key Laboratory of Multiphase Reactions, Institute of Process Engineering, Chinese Academy of Sciences


Abstract

The exponential growth of resources available on the Web has made it important to develop tools for efficient search. This paper proposes an approach to chemical information discovery based on focused crawling. Focused crawlers built from different combinations of feature representations and classification algorithms were compared. Latent Semantic Indexing (LSI) and Mutual Information (MI) were used to extract features from documents, while Naive Bayes (NB) and Support Vector Machines (SVM) were the algorithms selected to compute the content relevance score. The combination of LSI and SVM was found to provide the best solution.
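To make the idea of a content relevance score concrete, the following is a minimal sketch of one of the combinations named in the abstract: Mutual Information for feature selection and Naive Bayes for scoring. It is not the authors' implementation. The toy corpus `TRAIN`, the whitespace tokenizer, the feature count `k=6`, and all function names are assumptions made for illustration only. The LSI and SVM variant that the abstract reports as best is not shown here.

```python
import math
from collections import Counter

# Hypothetical toy training set: (page text, 1 = chemistry-relevant, 0 = not).
TRAIN = [
    ("benzene reaction catalyst synthesis", 1),
    ("polymer synthesis reaction yield", 1),
    ("molecule catalyst solvent reaction", 1),
    ("football match score league", 0),
    ("movie review actor score", 0),
    ("league actor travel holiday", 0),
]

def tokenize(text):
    # Simplistic whitespace tokenizer (an assumption, not from the paper).
    return text.lower().split()

def mutual_information(token_sets, labels, term):
    """MI (in bits) between presence of `term` and the binary label."""
    n = len(token_sets)
    mi = 0.0
    for present in (True, False):
        p_t = sum(1 for d in token_sets if (term in d) == present) / n
        for label in (0, 1):
            p_y = sum(1 for y in labels if y == label) / n
            joint = sum(1 for d, y in zip(token_sets, labels)
                        if (term in d) == present and y == label) / n
            if joint > 0:
                mi += joint * math.log2(joint / (p_t * p_y))
    return mi

def select_features(token_sets, labels, k):
    # Keep the k terms with the highest MI; ties resolve alphabetically.
    vocab = sorted(set().union(*token_sets))
    ranked = sorted(vocab, reverse=True,
                    key=lambda t: mutual_information(token_sets, labels, t))
    return set(ranked[:k])

def train_nb(train, k=6):
    texts, labels = zip(*train)
    token_lists = [tokenize(t) for t in texts]
    features = select_features([set(t) for t in token_lists], labels, k)
    counts = {0: Counter(), 1: Counter()}
    for tokens, y in zip(token_lists, labels):
        counts[y].update(t for t in tokens if t in features)
    priors = {y: labels.count(y) / len(labels) for y in (0, 1)}
    return counts, priors, features

def relevance_score(model, text):
    """Posterior probability that `text` is relevant (multinomial NB)."""
    counts, priors, features = model
    log_post = {}
    for y in (0, 1):
        total = sum(counts[y].values())
        lp = math.log(priors[y])
        for t in tokenize(text):
            if t in features:
                # Laplace smoothing over the selected feature set.
                lp += math.log((counts[y][t] + 1) / (total + len(features)))
        log_post[y] = lp
    m = max(log_post.values())
    z = sum(math.exp(v - m) for v in log_post.values())
    return math.exp(log_post[1] - m) / z

model = train_nb(TRAIN)
print(relevance_score(model, "catalyst reaction in solvent"))
print(relevance_score(model, "league football score"))
```

In a focused crawler, a score of this kind would typically be used to decide which fetched pages to keep and which outgoing links to follow first. A page containing none of the selected terms falls back to the class prior, which is 0.5 for this balanced toy set.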