Named Entity Recognition from Gujarati Text Using Rule-Based Approach

Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 736)

Abstract

Named Entity Recognition (NER) is an application of Natural Language Processing (NLP) and a subtask of Information Extraction. It supports automated text processing across industries and is a key concept in academia, artificial intelligence, robotics, bioinformatics, and other fields. NER is an essential step in major NLP tasks such as machine translation, question answering, and document summarization. Most NER work to date has targeted European languages, and work exists for only a few Indian constitutional languages. Progress has been limited by challenges such as a lack of resources, ambiguity in the language, and rich morphology. In this paper, rules are defined using a rule-based approach to identify various named entities in a Gujarati text document. Based on the defined rules, three different test cases were run on the training dataset, achieving 70% accuracy.
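As a rough illustration of what a rule-based tagger of this kind can look like, the following minimal Python sketch combines gazetteer lookups with a digit pattern and one contextual rule. It is not the rule set defined in the paper: the tag labels, the tiny word lists, the honorific rule, and the function name `tag_entities` are all assumptions made only for this example.

```python
import re

# Hypothetical gazetteers; a real system would load far larger lists.
LOCATIONS = {"સુરત", "નવસારી", "અમદાવાદ"}
MONTHS = {"જાન્યુઆરી", "ફેબ્રુઆરી", "માર્ચ"}
HONORIFICS = {"શ્રી", "શ્રીમતી"}

# ASCII digits or Gujarati digits (U+0AE6 to U+0AEF).
NUMBER_RE = re.compile(r"[0-9\u0AE6-\u0AEF]+")

# Punctuation stripped from token edges, including the danda (U+0964).
PUNCTUATION = ".,;:!?\u0964"


def tag_entities(text):
    """Return (token, tag) pairs for whitespace-separated tokens.

    Rules are tried in a fixed order (an assumption of this sketch):
    number pattern, month list, location list, then "token follows
    an honorific" as a person cue. Everything else is tagged "O".
    """
    tagged = []
    previous = None
    for raw in text.split():
        token = raw.strip(PUNCTUATION)
        if not token:
            continue
        if NUMBER_RE.fullmatch(token):
            tag = "NUMBER"
        elif token in MONTHS:
            tag = "MONTH"
        elif token in LOCATIONS:
            tag = "LOCATION"
        elif previous in HONORIFICS:
            tag = "PERSON"
        else:
            tag = "O"
        tagged.append((token, tag))
        previous = token
    return tagged


if __name__ == "__main__":
    for token, tag in tag_entities("શ્રી રમેશ ૧૫ જાન્યુઆરી સુરત ગયા."):
        print(token, tag)
```

Whitespace tokenization with set lookups is used here instead of `\w`-based regular expressions because Gujarati vowel signs are combining marks, which Python's `\w` does not match.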

Keywords

NER, Rule-based approach, Constitutional languages, Tagset, Tithi

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Faculty of Computer Applications, S S Agrawal Institute of Computer Science, Navsari, India
  2. Faculty of Computer Science, C U Shah University, Wadhwan, India
