Abstract
Text classifiers use Natural Language Processing (NLP) techniques to analyse text automatically and assign categories based on its content. Applying machine learning techniques in the field of NLP has achieved appreciable results. This work presents a system that analyses and classifies news videos based on their audio content using machine learning techniques. It helps the user identify the genre of a news video without watching it. In the proposed work, NLP techniques are used to identify the most correlated unigrams and bigrams, weighted by TF-IDF; these features are used to train models with machine learning techniques such as the Multinomial Naïve Bayes classifier, Logistic Regression and Support Vector Machines. The performance of the various classifiers in classifying the news videos is analysed and presented here. For this purpose, a dataset of 25 news videos from the CNN news channel, covering approximately five categories, has been collected. The classifier models, however, are trained on text news data obtained from BBC news articles. The accuracy of the classifiers is tested both on BBC text news and on the text extracted from the news videos. The experimental results indicate that the Multinomial Naïve Bayes classifier outperforms the other classifier models for both the noisy and noiseless text input.
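The feature pipeline named in the abstract (unigrams and bigrams weighted by TF-IDF, fed to a Multinomial Naïve Bayes classifier) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the smoothed IDF formula, the Laplace smoothing constant and the four toy training sentences are all assumptions made for illustration, and a real system would more likely rely on an established machine learning library and on the BBC article corpus.

```python
import math
from collections import Counter, defaultdict


def ngrams(text):
    """Return the lowercase unigrams and bigrams of one document."""
    tokens = text.lower().split()
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return tokens + bigrams


def fit_idf(docs):
    """Smoothed inverse document frequency for every n-gram seen in training."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(ngrams(doc)))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def tfidf(doc, idf):
    """TF-IDF weights for one document; n-grams unseen in training are dropped."""
    counts = Counter(t for t in ngrams(doc) if t in idf)
    return {t: c * idf[t] for t, c in counts.items()}


def train_nb(docs, labels, idf, alpha=1.0):
    """Multinomial Naive Bayes over TF-IDF weights with Laplace smoothing (alpha)."""
    class_weights = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        class_weights[label].update(tfidf(doc, idf))
    priors = Counter(labels)
    model = {}
    for label, weights in class_weights.items():
        total = sum(weights.values()) + alpha * len(idf)
        log_like = {t: math.log((weights[t] + alpha) / total) for t in idf}
        model[label] = (math.log(priors[label] / len(labels)), log_like)
    return model


def predict(doc, model, idf):
    """Return the label with the highest log-posterior score for one document."""
    feats = tfidf(doc, idf)

    def score(label):
        log_prior, log_like = model[label]
        return log_prior + sum(w * log_like[t] for t, w in feats.items())

    return max(model, key=score)


# Toy training data (invented for this sketch, not taken from the paper).
TRAIN_DOCS = [
    "the team won the football match",
    "the striker scored a late goal",
    "the market fell as shares dropped",
    "the bank raised interest rates",
]
TRAIN_LABELS = ["sport", "sport", "business", "business"]

idf = fit_idf(TRAIN_DOCS)
model = train_nb(TRAIN_DOCS, TRAIN_LABELS, idf)

if __name__ == "__main__":
    print(predict("shares dropped after the bank raised rates", model, idf))
```

In the setting the abstract describes, the training documents would be the BBC articles and the input to `predict` would be the transcript extracted from a news video's audio track, so any transcription errors reach the classifier as missing or unseen n-grams.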
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Divya, S., Raghavi, R., Sripriya, N., Mohanavalli, S., Poornima, S. (2021). Analysis of Audio-Based News Classification Using Machine Learning Techniques. In: Mohapatra, R.N., Yugesh, S., Kalpana, G., Kalaivani, C. (eds) Mathematical Analysis and Computing. ICMAC 2019. Springer Proceedings in Mathematics & Statistics, vol 344. Springer, Singapore. https://doi.org/10.1007/978-981-33-4646-8_35
DOI: https://doi.org/10.1007/978-981-33-4646-8_35
Publisher Name: Springer, Singapore
Print ISBN: 978-981-33-4645-1
Online ISBN: 978-981-33-4646-8
eBook Packages: Mathematics and Statistics (R0)