Deep Belief Network Based Part-of-Speech Tagger for Telugu Language

Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 381)

Abstract

Indian languages have few linguistic resources despite their large speaker base. They are morphologically rich, which makes sequential tagging, or any other type of language analysis, difficult. In natural language processing, part-of-speech (POS) tagging is a basic tool that makes it possible to extract terminology using linguistic patterns. The main aim of this research is to perform sequential tagging for Indian languages based on unsupervised features and the distributional information of a word with its neighboring words. The results of machine learning algorithms depend on the data representation. Not all of the data contribute to building the model, so some of it goes unused, and how much depends on the descriptive factors of data disparity. Data representations are usually designed using domain-specific knowledge, but an aim of Artificial Intelligence is to reduce these domain-dependent representations so that methods can be applied to new, unfamiliar domains. Recently, deep learning algorithms have attracted substantial interest for reducing the dimensionality of features and extracting latent features. Recent applications of deep learning algorithms have given impressive results in several areas, mostly in image and text applications.
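
As a rough illustration of what "distributional information of a word with its neighboring words" could look like in practice, the sketch below counts, for each word, which words occur immediately to its left and right. The function name `context_features`, the window size, and the placeholder corpus are assumptions made for this example and are not taken from the paper; the paper's actual feature set and the deep belief network itself are not shown here.

```python
from collections import Counter, defaultdict


def context_features(sentences, window=1):
    """Count, for each word, the neighbors seen within `window` positions.

    Hypothetical helper for illustration only; not the paper's feature extractor.
    """
    features = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    # The key keeps the relative offset so that left and
                    # right neighbors are counted separately.
                    features[word][(j - i, tokens[j])] += 1
    return features


# Placeholder tokens standing in for an unlabeled, tokenized corpus (assumed).
corpus = [
    ["w1", "w2", "w3"],
    ["w1", "w2", "w4"],
]
feats = context_features(corpus)
```

Count vectors of this kind need no tagged data, which is why they suit low-resource settings; under the assumptions above they could serve as raw input to an unsupervised feature learner.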

Keywords

Natural language processing; Deep belief networks; Restricted Boltzmann machine; Neural networks; POS tagging

Copyright information

© Springer India 2016

Authors and Affiliations

  1. Atlas Healthcare Software India Pvt, Coimbatore, India
  2. Centre for Excellence in Computational Engineering and Networking, Amrita Vishwa Vidyapeetham, Coimbatore, India
