Abstract
We explore applications of representation learning in Nepali, an under-resourced language. Using distributional similarity on a large amount of unlabeled Nepali text, we induce clusters of different sizes. Using these clusters as features significantly improves performance over the baseline on two standard NLP tasks. In a part-of-speech (PoS) tagging experiment where the training and test domains are the same, accuracy on unknown words increased by up to 5% compared to the baseline. In a named-entity recognition (NER) experiment in a domain-adaptation setting with a small amount of training data, the F1 score improved by up to 41% compared to the baseline. In a setting where the training and test domains are the same, the F1 score improved by 13% compared to the baseline.
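As a rough illustration of what "clusters as features" can mean for a sequence labeler, the sketch below attaches a cluster ID to each token and its neighbours. It is a minimal sketch under stated assumptions, not the authors' implementation: the mapping `WORD_TO_CLUSTER`, the romanized example words, the cluster IDs, the `UNKNOWN_CLUSTER` placeholder, and the feature names are all hypothetical, and the abstract does not specify which clustering algorithm or tagger was used.

```python
from typing import Dict, List

# Hypothetical word-to-cluster mapping. In practice it would be induced from
# unlabeled text; the words and cluster IDs here are invented for illustration.
WORD_TO_CLUSTER: Dict[str, int] = {
    "kathmandu": 7,
    "pokhara": 7,
    "ramro": 3,
    "sundar": 3,
}

# Assumed placeholder for words (or positions) with no known cluster.
UNKNOWN_CLUSTER = -1


def token_features(sentence: List[str], i: int) -> Dict[str, object]:
    """Return features for token i: its lowercased form plus the cluster IDs
    of the token itself and of its immediate left and right neighbours."""

    def cluster(j: int) -> int:
        # Positions outside the sentence get the placeholder cluster.
        if 0 <= j < len(sentence):
            return WORD_TO_CLUSTER.get(sentence[j].lower(), UNKNOWN_CLUSTER)
        return UNKNOWN_CLUSTER

    return {
        "word": sentence[i].lower(),
        "cluster": cluster(i),
        "prev_cluster": cluster(i - 1),
        "next_cluster": cluster(i + 1),
    }


def sentence_features(sentence: List[str]) -> List[Dict[str, object]]:
    """Build one feature dict per token, ready to pass to a tagger."""
    return [token_features(sentence, i) for i in range(len(sentence))]
```

The intuition is that a word absent from the labeled training data can still share a `cluster` value with words that were seen, which gives the tagger some evidence about it. Feature dicts of this shape could be fed to any feature-based sequence labeler.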
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Nepal, A., Yates, A. (2014). Exploring Applications of Representation Learning in Nepali. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2014. Lecture Notes in Computer Science, vol 8403. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-54906-9_12
DOI: https://doi.org/10.1007/978-3-642-54906-9_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-54905-2
Online ISBN: 978-3-642-54906-9
eBook Packages: Computer Science, Computer Science (R0)