
Comparison of Different Part-of-Speech Tagging Techniques for Mongolian

  • Conference paper
Advances in Intelligent Information Hiding and Multimedia Signal Processing (IIHMSP 2022)

Abstract

In this paper, we present two POS taggers for Mongolian: a neural network tagger (a multilayer perceptron) and a hidden Markov model (HMM) tagger with Viterbi decoding. The former achieves an accuracy of 95.6%, whereas the latter achieves 85.6%. We also compare the performance of our taggers with previous work. The comparison shows that the neural network tagger performs better for Mongolian POS tagging than the other approaches. Our dataset consists of about 5,000 sentences, totaling almost 100,000 words, used for training and testing.
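To make the HMM-with-Viterbi approach named in the abstract concrete, the following is a minimal sketch of Viterbi decoding for a bigram HMM tagger. It is not the authors' implementation: the function name `viterbi`, the two-tag tagset, the English placeholder words, and all probabilities are hypothetical values chosen only for illustration, and the small `floor` probability for unseen events is an assumed smoothing choice.

```python
import math


def viterbi(words, tags, start_p, trans_p, emit_p, floor=1e-6):
    """Return the most probable tag sequence for `words` under a bigram HMM.

    start_p[tag], trans_p[(prev_tag, tag)] and emit_p[(tag, word)] are
    probabilities; missing entries fall back to `floor` (an assumed,
    very simple form of smoothing).
    """
    if not words:
        return []

    def lp(table, key):
        # Log-probabilities avoid numeric underflow on long sentences.
        return math.log(table.get(key, floor))

    # best[i][t]: log-probability of the best path ending in tag t at word i.
    best = [{t: lp(start_p, t) + lp(emit_p, (t, words[0])) for t in tags}]
    # back[i][t]: previous tag on that best path.
    back = [{}]

    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for t in tags:
            prev, score = max(
                ((p, best[i - 1][p] + lp(trans_p, (p, t))) for p in tags),
                key=lambda pair: pair[1],
            )
            best[i][t] = score + lp(emit_p, (t, words[i]))
            back[i][t] = prev

    # Follow the back-pointers from the best final tag.
    path = [max(best[-1], key=best[-1].get)]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return path[::-1]


# Hypothetical toy model: N = noun, V = verb.
TAGS = ["N", "V"]
START_P = {"N": 0.8, "V": 0.2}
TRANS_P = {("N", "N"): 0.3, ("N", "V"): 0.7, ("V", "N"): 0.6, ("V", "V"): 0.4}
EMIT_P = {("N", "dogs"): 0.9, ("V", "dogs"): 0.1,
          ("N", "bark"): 0.2, ("V", "bark"): 0.8}

print(viterbi(["dogs", "bark"], TAGS, START_P, TRANS_P, EMIT_P))  # ['N', 'V']
```

In a real tagger the three probability tables would be estimated from a tagged training corpus rather than written by hand, and the tagset would be far larger than the two tags assumed here.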


Notes

  1. https://milab.num.edu.mn/

  2. https://github.com/ganchimegl/POS-tagger-for-Mongolian


Author information


Corresponding author

Correspondence to Oyun-Erdene Namsrai.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Lkhagvasuren, G., Rentsendorj, J., Bukhsuren, E., Namsrai, OE. (2023). Comparison of Different Part-of-Speech Tagging Techniques for Mongolian. In: Weng, S., Shieh, CS., Tsihrintzis, G.A. (eds) Advances in Intelligent Information Hiding and Multimedia Signal Processing. IIHMSP 2022. Smart Innovation, Systems and Technologies, vol 341. Springer, Singapore. https://doi.org/10.1007/978-981-99-0605-5_9
