Abstract
Newspaper articles are divided into sections such as culture, politics, and sports to help readers find information easily. Newspaper editors read the articles and decide which ones to publish and which sections they belong to. This paper presents supervised machine learning methods to automatically classify news articles into newspaper sections. To perform this task, 4,027 news articles were collected along with their corresponding sections from three Mexican newspapers during a six-month period. Different features were extracted and several machine learning methods were tested. The results show an accuracy of over 80% when classifying articles into the particular sections of the three selected newspapers.
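To make the task concrete, the following is a minimal sketch of supervised section classification. It is not the method evaluated in the paper: the abstract does not name the features or classifiers used, so bag-of-words counts and a multinomial Naive Bayes classifier with add-one smoothing are assumed here purely for illustration. The function names (`tokenize`, `train`, `predict`) and the six toy articles are hypothetical and written in English for readability.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep runs of letters (accented letters included)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def train(labeled_articles):
    """Count words per section from (text, section) pairs."""
    word_counts = defaultdict(Counter)  # section -> word -> count
    section_counts = Counter()          # section -> number of articles
    vocabulary = set()
    for text, section in labeled_articles:
        tokens = tokenize(text)
        word_counts[section].update(tokens)
        section_counts[section] += 1
        vocabulary.update(tokens)
    return word_counts, section_counts, vocabulary


def predict(model, text):
    """Return the section with the highest Naive Bayes log-probability."""
    word_counts, section_counts, vocabulary = model
    total_articles = sum(section_counts.values())
    # Words never seen in training are ignored.
    tokens = [t for t in tokenize(text) if t in vocabulary]
    best_section, best_score = None, -math.inf
    for section, n_articles in section_counts.items():
        score = math.log(n_articles / total_articles)  # log prior
        total_words = sum(word_counts[section].values())
        for token in tokens:
            # Add-one (Laplace) smoothing avoids zero probabilities.
            likelihood = (word_counts[section][token] + 1) / (
                total_words + len(vocabulary)
            )
            score += math.log(likelihood)
        if score > best_score:
            best_section, best_score = section, score
    return best_section


# Invented toy data, not taken from the paper's corpus.
toy_articles = [
    ("the team won the match with a late goal", "sports"),
    ("the striker scored twice in the final", "sports"),
    ("congress approved the new budget law", "politics"),
    ("the president met with opposition senators", "politics"),
    ("the museum opened a new painting exhibition", "culture"),
    ("the orchestra performed a classical concert", "culture"),
]

model = train(toy_articles)
print(predict(model, "the senators debated the budget"))
```

A real experiment would hold out part of the labeled articles for evaluation and report accuracy on that held-out set, rather than inspecting single predictions as this sketch does.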
Acknowledgments
We acknowledge the support of Instituto Politécnico Nacional (IPN), ESCOM-IPN, SIP-IPN project numbers 20181102 and 20180859, COFAA-IPN, and EDI-IPN.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
García-Mendoza, CV., Gambino Juárez, O. (2018). News Article Classification of Mexican Newspapers. In: Mata-Rivera, M., Zagal-Flores, R. (eds) Telematics and Computing. WITCOM 2018. Communications in Computer and Information Science, vol 944. Springer, Cham. https://doi.org/10.1007/978-3-030-03763-5_9
DOI: https://doi.org/10.1007/978-3-030-03763-5_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-03762-8
Online ISBN: 978-3-030-03763-5
eBook Packages: Computer Science, Computer Science (R0)