Abstract
Semantic relations exist between two concepts present in a text. Semantic relation extraction is an essential part of building efficient Natural Language Processing (NLP) applications such as Question Answering (QA) and Information Retrieval (IR) systems. Automatic semantic relation extraction from text increases the efficiency of these systems by helping them retrieve more accurate information for a user query. In this research work, we propose a framework that extracts agricultural entities and finds the semantic relations that exist between them. Entity extraction is done using a Part-Of-Speech (POS) tagger, word suffixes, and a thesaurus, without using any external domain-specific knowledge bases such as an ontology or WordNet. The semantic relations between entities are identified using a Multinomial Naïve Bayes (MNB) classifier. This paper extracts two entity types, namely disease and treatment, and focuses on two semantic relations, namely “Cure” and “Prevent”. The “Cure” relation expresses the remedial measure for a disease that prevails in a crop, and the “Prevent” relation expresses the precautionary measures that could prevent the crop from being affected. The proposed approach was trained on 2281 sentences, tested on 553 sentences, and evaluated using standard metrics.
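The relation classification step can be illustrated with a minimal sketch of a Multinomial Naïve Bayes classifier with add-one (Laplace) smoothing. This is not the authors' implementation: the function names `train_mnb` and `predict_mnb`, the whitespace tokenizer, the smoothing parameter, and the four toy sentences are all assumptions made for illustration, and the features used in the paper may differ.

```python
import math
from collections import Counter, defaultdict


def train_mnb(sentences, labels, alpha=1.0):
    """Fit a multinomial Naive Bayes model with add-alpha smoothing.

    Hypothetical helper: tokens are lowercased, whitespace-split words.
    """
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(sentences, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)

    total_docs = sum(class_counts.values())
    model = {"vocab": vocab, "log_prior": {}, "log_lik": {}}
    for label, n_docs in class_counts.items():
        # Class prior: fraction of training sentences with this label.
        model["log_prior"][label] = math.log(n_docs / total_docs)
        # Smoothed per-class word likelihoods over the shared vocabulary.
        denom = sum(word_counts[label].values()) + alpha * len(vocab)
        model["log_lik"][label] = {
            w: math.log((word_counts[label][w] + alpha) / denom) for w in vocab
        }
    return model


def predict_mnb(model, text):
    """Return the label with the highest posterior log-score."""
    # Words never seen in training are ignored.
    tokens = [t for t in text.lower().split() if t in model["vocab"]]
    scores = {
        label: prior + sum(model["log_lik"][label][t] for t in tokens)
        for label, prior in model["log_prior"].items()
    }
    return max(scores, key=scores.get)


# Toy training data (invented for illustration, not from the paper's corpus).
train_sentences = [
    "spraying copper fungicide cures leaf blight in tomato",
    "neem oil application cures powdery mildew on cucumber",
    "crop rotation prevents root rot in wheat",
    "using resistant varieties prevents rust infection in wheat",
]
train_labels = ["Cure", "Cure", "Prevent", "Prevent"]

model = train_mnb(train_sentences, train_labels)
cure_prediction = predict_mnb(model, "fungicide cures mildew")
prevent_prediction = predict_mnb(model, "rotation prevents rust")
```

Scores are accumulated in log space to avoid numerical underflow when many word probabilities are multiplied together.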
Supported by SRM Institute of Science and Technology.
References
Alston, J., Pardey, P.: Agriculture in the global economy. J. Econ. Perspect. 28, 121–146 (2002)
Janssen, S.J.C., et al.: Towards a new generation of agricultural system data, models and knowledge products: information and communication technology. Agric. Syst. 155, 200–212 (2017)
National Agricultural Library - Thesaurus. https://agclass.nal.usda.gov/download.shtml. Accessed 4 Feb 2019
Ben Abacha, A., Zweigenbaum, P.: Automatic extraction of semantic relations between medical entities: a rule based approach. J. Biomed. Semant. 2(5), S4 (2011)
Cheng, X., Miao, D., Wang, C.: A link-based approach to semantic relation analysis. Neurocomputing 154, 127–138 (2015)
Wang, D., Liu, X., Luo, H., Fan, J.: A novel framework for semantic entity identification and relationship integration in large scale text data. Future Gener. Comput. Syst. 64, 198–210 (2016)
Zhang, M.L., Peña, J.M., Robles, V.: Feature selection for multi-label naive Bayes classification. Inf. Sci. 179(19), 3218–3229 (2009)
Altheneyan, A.S., Menai, M.E.B.: Naïve Bayes classifiers for authorship attribution of Arabic texts. J. King Saud. Univ. Comput. Inf. Sci. 26(4), 473–484 (2014)
Takeuchi, K., Collier, N.: Bio-medical entity extraction using support vector machines. Artif. Intell. Med. 33(2), 125–137 (2005)
Bhasuran, B., Murugesan, G., Abdulkadhar, S., Natarajan, J.: Stacked ensemble combined with fuzzy matching for biomedical named entity recognition of diseases. J. Biomed. Inf. 64, 1–9 (2016)
Frunza, O., Inkpen, D., Tran, T.: A machine learning approach for identifying disease-treatment relations in short texts. IEEE Trans. Knowl. Data Eng. 23(6), 801–814 (2011)
Chaudhary, A., Kolhe, S., Kamal, R.: A hybrid ensemble for classification in multiclass datasets: an application to oilseed disease dataset. Comput. Electron. Agric. 124, 65–72 (2016)
Agricultural dataset. https://drive.google.com/file/d/1b1TfA25dqXFxdH6U9eW2MP2S12ae8IaI/view?usp=sharing. Accessed 4 Feb 2019
Zhang, W., Gao, F.: An improvement to naive Bayes for text classification. Procedia Eng. 15, 2160–2164 (2011)
John, G.H., Langley, P.: Estimating continuous distributions in Bayesian classifiers. In: Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence, UAI 1995, pp. 338–345 (1995)
Bermejo, P., Gámez, J., Puerta, J.: Improving the performance of Naive Bayes multinomial in e-mail foldering by introducing distribution-based balance of datasets. Expert Syst. Appl. 38, 2072–2080 (2011)
Rosario, B., Hearst, M.: Classifying semantic relations in bioscience texts. In: Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-2004), Barcelona, Spain, pp. 430–437 (2004)
Powers, D.: Evaluation: from precision, recall and F measure to ROC, informedness, markedness and correlation. J. Mach. Learn. Tech. 2, 37–63 (2007)
Chinchor, N.A.: Overview of MUC-7. In: Seventh Message Understanding Conference (MUC-7): Proceedings of a Conference Held in Fairfax, Virginia (1998)
Marrero, M., Urbano, J., Sánchez-Cuadrado, S., Morato, J., Gómez-Berbís, J.M.: Named entity recognition: fallacies, challenges and opportunities. Comput. Stand. Interfaces 35(5), 482–489 (2013)
Gridach, M.: Character-level neural network for biomedical named entity recognition. J. Biomed. Inf. 70, 85–91 (2017)
A Appendix
The term Named Entity Recognition (NER) was introduced in 1996 at the Message Understanding Conference to refer to such entities [19]. NER is defined as the task of identifying and classifying information elements called Named Entities [20]. Biomedical Named Entity Recognition (BNER) is used to identify biological entities such as protein names, genes, and disease names in biomedical texts [9, 21].
Copyright information
© 2020 IFIP International Federation for Information Processing
About this paper
Cite this paper
Srinivasan, R., Subalalitha, C.N. (2020). A Thesaurus Based Semantic Relation Extraction for Agricultural Corpora. In: Chandrabose, A., Furbach, U., Ghosh, A., Kumar M., A. (eds) Computational Intelligence in Data Science. ICCIDS 2020. IFIP Advances in Information and Communication Technology, vol 578. Springer, Cham. https://doi.org/10.1007/978-3-030-63467-4_8
DOI: https://doi.org/10.1007/978-3-030-63467-4_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-63466-7
Online ISBN: 978-3-030-63467-4
eBook Packages: Computer Science, Computer Science (R0)