Abstract
This paper considers Natural Language Processing (NLP), one of the most important modern technologies, and the machine learning algorithms related to it. The authors aim to present the machine learning models used in this interdisciplinary field, which lies at the intersection of computer science, artificial intelligence, and linguistics. The machine learning techniques are classified, and the corresponding models are briefly discussed. Different optimization approaches and problems for machine learning are considered. On this basis, some conclusions are drawn about development trends in the area and directions for future research.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Mankolli, E., Guliashki, V. (2020). Machine Learning and Natural Language Processing: Review of Models and Optimization Problems. In: Dimitrova, V., Dimitrovski, I. (eds) ICT Innovations 2020. Machine Learning and Applications. ICT Innovations 2020. Communications in Computer and Information Science, vol 1316. Springer, Cham. https://doi.org/10.1007/978-3-030-62098-1_7
DOI: https://doi.org/10.1007/978-3-030-62098-1_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-62097-4
Online ISBN: 978-3-030-62098-1
eBook Packages: Computer Science, Computer Science (R0)