A Thesaurus Based Semantic Relation Extraction for Agricultural Corpora

  • Conference paper
Computational Intelligence in Data Science (ICCIDS 2020)

Abstract

Semantic relations hold between two concepts present in a text. Semantic relation extraction is an essential part of building efficient Natural Language Processing (NLP) applications such as Question Answering (QA) and Information Retrieval (IR) systems. Automatic semantic relation extraction from text increases the efficiency of these systems by helping them retrieve more accurate information in response to a user query. In this work, we propose a framework that extracts agricultural entities and identifies the semantic relations that exist between them. Entity extraction uses a Part-of-Speech (POS) tagger, word suffixes, and a thesaurus, without relying on external domain-specific knowledge bases such as ontologies or WordNet. The semantic relation between entities is identified using a Multinomial Naïve Bayes (MNB) classifier. The framework extracts two entity types, disease and treatment, and focuses on two semantic relations, “Cure” and “Prevent”. The “Cure” relation expresses a remedial measure for a disease prevailing in a crop, and the “Prevent” relation expresses a precautionary measure that could keep the crop from being affected. The proposed approach was trained on 2281 sentences, tested on 553 sentences, and evaluated using standard metrics.
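To make the relation classification step concrete, the following is a minimal sketch of a multinomial Naïve Bayes classifier with Laplace smoothing that labels a sentence as “Cure” or “Prevent”. It is not the authors' implementation: the whitespace tokenizer, the bag-of-words features, the smoothing constant `alpha`, and the four training sentences in `TRAIN` are all assumptions made up for illustration, and the paper's actual feature set is not described in the abstract.

```python
import math
from collections import Counter, defaultdict


def tokenize(sentence):
    # Assumption: lowercase whitespace tokens stand in for the real features.
    return sentence.lower().split()


class MultinomialNB:
    """Bag-of-words multinomial Naive Bayes with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing constant (assumed value)

    def fit(self, sentences, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for sentence, label in zip(sentences, labels):
            tokens = tokenize(sentence)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        self.total = sum(self.class_counts.values())
        return self

    def log_scores(self, sentence):
        scores = {}
        vocab_size = len(self.vocab)
        for label, n_docs in self.class_counts.items():
            # log P(class) + sum over tokens of log P(token | class)
            score = math.log(n_docs / self.total)
            denom = sum(self.word_counts[label].values()) + self.alpha * vocab_size
            for tok in tokenize(sentence):
                if tok in self.vocab:  # words never seen in training are ignored
                    score += math.log((self.word_counts[label][tok] + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, sentence):
        scores = self.log_scores(sentence)
        return max(scores, key=scores.get)


# Hypothetical toy sentences, not taken from the paper's dataset.
TRAIN = [
    ("spraying mancozeb controls leaf blight in infected plants", "Cure"),
    ("apply copper fungicide to treat bacterial wilt", "Cure"),
    ("use resistant varieties to avoid rust", "Prevent"),
    ("crop rotation prevents root rot before planting", "Prevent"),
]

model = MultinomialNB().fit([s for s, _ in TRAIN], [y for _, y in TRAIN])
print(model.predict("crop rotation prevents wilt"))
print(model.predict("apply fungicide to treat blight"))
```

Scores are accumulated in log space so that products of many small probabilities do not underflow. With a corpus of realistic size, a library implementation would normally replace this hand-written class.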

Supported by SRM Institute of Science and Technology.

References

  1. Alston, J., Pardey, P.: Agriculture in the global economy. J. Econ. Perspect. 28, 121–46 (2002)

  2. Janssen, S.J.C., et al.: Towards a new generation of agricultural system data, models and knowledge products: information and communication technology. Agric. Syst. 155, 200–212 (2017)

  3. National Agricultural Library - Thesaurus. https://agclass.nal.usda.gov/download.shtml. Accessed 4 Feb 2019

  4. Ben Abacha, A., Zweigenbaum, P.: Automatic extraction of semantic relations between medical entities: a rule based approach. J. Biomed. Semant. 2(5), S4 (2011)

  5. Cheng, X., Miao, D., Wang, C.: A link-based approach to semantic relation analysis. Neurocomputing 154, 127–138 (2015)

  6. Wang, D., Liu, X., Luo, H., Fan, J.: A novel framework for semantic entity identification and relationship integration in large scale text data. Future Gener. Comput. Syst. 64, 198–210 (2016)

  7. Zhang, M.L., Peña, J.M., Robles, V.: Feature selection for multi-label naive Bayes classification. Inf. Sci. 179(19), 3218–3229 (2009)

  8. Altheneyan, A.S., Menai, M.E.B.: Naïve Bayes classifiers for authorship attribution of Arabic texts. J. King Saud. Univ. Comput. Inf. Sci. 26(4), 473–484 (2014)

  9. Takeuchi, K., Collier, N.: Bio-medical entity extraction using support vector machines. Artif. Intell. Med. 33(2), 125–137 (2005)

  10. Bhasuran, B., Murugesan, G., Abdulkadhar, S., Natarajan, J.: Stacked ensemble combined with fuzzy matching for biomedical named entity recognition of diseases. J. Biomed. Inf. 64, 1–9 (2016)

  11. Frunza, O., Inkpen, D., Tran, T.: A machine learning approach for identifying disease-treatment relations in short texts. IEEE Trans. Knowl. Data Eng. 23(6), 801–814 (2011)

  12. Chaudhary, A., Kolhe, S., Kamal, R.: A hybrid ensemble for classification in multiclass datasets: an application to oilseed disease dataset. Comput. Electron. Agric. 124, 65–72 (2016)

  13. Agricultural dataset. https://drive.google.com/file/d/1b1TfA25dqXFxdH6U9eW2MP2S12ae8IaI/view?usp=sharing. Accessed 4 Feb 2019

  14. Zhang, W., Gao, F.: An improvement to naive Bayes for text classification. Procedia Eng. 15, 2160–2164 (2011)

  15. John, G.H., Langley, P.: Estimating continuous distributions in Bayesian classifiers. In: Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence, UAI 1995, pp. 338–345 (1995)

  16. Bermejo, P., Gámez, J., Puerta, J.: Improving the performance of Naive Bayes multinomial in e-mail foldering by introducing distribution-based balance of datasets. Expert Syst. Appl. 38, 2072–2080 (2011)

  17. Rosario, B., Hearst, M.: Classifying semantic relations in bioscience texts. In: Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-2004), Barcelona, Spain, pp. 430–437 (2004)

  18. Powers, D.: Evaluation: from precision, recall and F measure to ROC, informedness, markedness and correlation. J. Mach. Learn. Tech. 2, 37–63 (2007)

  19. Chinchor, N.A.: Overview of MUC-7. In: Seventh Message Understanding Conference (MUC-7): Proceedings of a Conference Held in Fairfax, Virginia (1998)

  20. Marrero, M., Urbano, J., Sánchez-Cuadrado, S., Morato, J., Gómez-Berbís, J.M.: Named entity recognition: fallacies, challenges and opportunities. Comput. Stand. Interfaces 35(5), 482–489 (2013)

  21. Gridach, M.: Character-level neural network for biomedical named entity recognition. J. Biomed. Inf. 70, 85–91 (2017)

Corresponding author

Correspondence to R. Srinivasan.

A Appendix

The term Named Entity Recognition (NER) was introduced in 1996 at the Message Understanding Conference to refer to entities [19]. NER is defined as the task of identifying and classifying information elements called Named Entities [20]. Biomedical Named Entity Recognition (BNER) is used to identify biological entities such as protein names, genes, and disease names in biomedical texts [9, 21].
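As an illustration of identifying and classifying entities without a trained model, the sketch below tags disease and treatment mentions using a thesaurus lookup with a suffix fallback, loosely following the thesaurus and word-suffix cues named in the abstract. Everything specific here is an assumption for illustration: the term sets `DISEASE_TERMS` and `TREATMENT_TERMS`, the suffix lists, and the longest-match strategy are hypothetical, and the POS-tagging step mentioned in the abstract is omitted.

```python
import re

# Hypothetical stand-ins for thesaurus entries; the real thesaurus is far larger.
DISEASE_TERMS = {"leaf blight", "bacterial wilt", "root rot", "powdery mildew"}
TREATMENT_TERMS = {"mancozeb", "crop rotation", "neem oil"}

# Hypothetical suffix cues, used only when no thesaurus phrase matches.
DISEASE_SUFFIXES = ("osis",)
TREATMENT_SUFFIXES = ("cide",)


def extract_entities(sentence, max_len=3):
    """Return (mention, type) pairs in order of appearance."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    entities = []
    i = 0
    while i < len(tokens):
        matched = False
        # Prefer the longest thesaurus phrase starting at position i.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in DISEASE_TERMS:
                entities.append((phrase, "disease"))
            elif phrase in TREATMENT_TERMS:
                entities.append((phrase, "treatment"))
            else:
                continue
            i += n
            matched = True
            break
        if matched:
            continue
        # Fall back to suffix cues on the single token.
        tok = tokens[i]
        if tok.endswith(DISEASE_SUFFIXES):
            entities.append((tok, "disease"))
        elif tok.endswith(TREATMENT_SUFFIXES):
            entities.append((tok, "treatment"))
        i += 1
    return entities


found = extract_entities(
    "Spraying mancozeb cures leaf blight; a fungicide also limits chlorosis."
)
print(found)
```

Here "mancozeb" and "leaf blight" are found through the term sets, while "fungicide" and "chlorosis" are picked up only by the suffix fallback, which shows how suffix cues can extend coverage beyond the listed terms.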

Copyright information

© 2020 IFIP International Federation for Information Processing

About this paper

Cite this paper

Srinivasan, R., Subalalitha, C.N. (2020). A Thesaurus Based Semantic Relation Extraction for Agricultural Corpora. In: Chandrabose, A., Furbach, U., Ghosh, A., Kumar M., A. (eds) Computational Intelligence in Data Science. ICCIDS 2020. IFIP Advances in Information and Communication Technology, vol 578. Springer, Cham. https://doi.org/10.1007/978-3-030-63467-4_8

  • DOI: https://doi.org/10.1007/978-3-030-63467-4_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-63466-7

  • Online ISBN: 978-3-030-63467-4

  • eBook Packages: Computer Science, Computer Science (R0)
