Part of the book series: Advances in Intelligent Systems and Computing ((AISC,volume 1292))

Abstract

Because the sources of chemical literature are vast, researchers and readers must spend considerable time finding specific information. In addition, ambiguity among the names of chemical entities often produces results that are irrelevant to the query. It is therefore useful to give researchers and readers a facility for chemistry-oriented search. Domain-specific search engines have become popular because they offer increased accuracy and additional functionality for their domain. In this paper, a domain-specific search application is developed to retrieve relevant chemical documents from a large collection. The application provides advanced features such as a specialized index and re-ranking of documents, with chemical entities, functional groups, and phrases used as keywords. At present, 85,680 patents and 66,425 chemical abstracts are indexed with 36,984 entities of 23 different types and 16 functional groups.
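
The idea of a specialized index combined with entity-based re-ranking can be illustrated with a minimal sketch. This is not the paper's implementation: the toy corpus, the field names (`text`, `entity`, `group`), the function names, and the field weights are all assumptions chosen for illustration, and the documents are assumed to be annotated with chemical entities and functional groups beforehand.

```python
from collections import defaultdict

# Hypothetical toy corpus: each document carries free text plus
# pre-annotated chemical entities and functional groups.
DOCS = {
    "d1": {"text": "synthesis of aspirin from salicylic acid",
           "entities": {"aspirin", "salicylic acid"},
           "groups": {"ester", "carboxylic acid"}},
    "d2": {"text": "aspirin market report",
           "entities": {"aspirin"},
           "groups": set()},
    "d3": {"text": "ester hydrolysis kinetics",
           "entities": set(),
           "groups": {"ester"}},
}

def build_index(docs):
    """Build an inverted index keyed by (field, term) -> set of doc ids."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for token in doc["text"].lower().split():
            index[("text", token)].add(doc_id)
        for entity in doc["entities"]:
            index[("entity", entity.lower())].add(doc_id)
        for group in doc["groups"]:
            index[("group", group.lower())].add(doc_id)
    return index

def search(index, keywords=(), entities=(), groups=()):
    """Rank documents so that entity and functional-group matches
    outweigh plain keyword matches (weights are illustrative only)."""
    scores = defaultdict(float)
    fields = (("text", keywords, 1.0),
              ("entity", entities, 3.0),
              ("group", groups, 2.0))
    for field, terms, weight in fields:
        for term in terms:
            for doc_id in index.get((field, term.lower()), ()):
                scores[doc_id] += weight
    # Highest score first; ties are broken by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))

index = build_index(DOCS)
ranked = search(index, keywords=["aspirin"], entities=["aspirin"], groups=["ester"])
```

In this sketch, a document that merely mentions a name in its text scores lower than one in which the same name is annotated as a chemical entity or that also contains a requested functional group, which is one simple way the ambiguity described above could be reduced.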

Corresponding author

Correspondence to R. Hema.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

Cite this paper

Hema, R. (2021). Extraction and Search of Relevant Chemical Documents from the Web. In: Peng, SL., Hao, RX., Pal, S. (eds) Proceedings of First International Conference on Mathematical Modeling and Computational Science. Advances in Intelligent Systems and Computing, vol 1292. Springer, Singapore. https://doi.org/10.1007/978-981-33-4389-4_24
