Machine Learning and Natural Language Processing: Review of Models and Optimization Problems

  • Conference paper
ICT Innovations 2020. Machine Learning and Applications (ICT Innovations 2020)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1316)

Abstract

The purpose of this paper is to consider one of the most important modern technologies, Natural Language Processing (NLP), and the machine learning algorithms related to it. The authors aim to present the machine learning models used in this interdisciplinary field, which lies at the intersection of computer science, artificial intelligence, and linguistics. The machine learning techniques are classified and the corresponding models are briefly discussed. Different optimization approaches and problems for machine learning are considered. On this basis, some conclusions are drawn about development trends in the area and directions for future research.

Corresponding author

Correspondence to Emiliano Mankolli.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Mankolli, E., Guliashki, V. (2020). Machine Learning and Natural Language Processing: Review of Models and Optimization Problems. In: Dimitrova, V., Dimitrovski, I. (eds) ICT Innovations 2020. Machine Learning and Applications. ICT Innovations 2020. Communications in Computer and Information Science, vol 1316. Springer, Cham. https://doi.org/10.1007/978-3-030-62098-1_7

  • DOI: https://doi.org/10.1007/978-3-030-62098-1_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-62097-4

  • Online ISBN: 978-3-030-62098-1

  • eBook Packages: Computer Science (R0)
