News Article Classification of Mexican Newspapers

  • Conference paper
Telematics and Computing (WITCOM 2018)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 944)

Abstract

Newspaper articles are divided into sections such as culture, politics, and sports to help readers find information easily. Newspaper editors read the articles and decide which ones to publish and which sections they belong to. This paper presents supervised machine learning methods to automatically classify news articles into newspaper sections. For this task, 4,027 news articles were collected, along with their corresponding sections, from three Mexican newspapers over a six-month period. Different features were extracted and several machine learning methods were tested. The results show an accuracy above 80% when classifying articles into the particular sections of the three selected newspapers.
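
The abstract does not state which features or classifiers were used, so the following is only a minimal sketch of the general task (assigning an article to a section from its text), not the authors' method. It assumes bag-of-words counts as features and a multinomial Naive Bayes classifier with add-one smoothing; both choices, the names `tokenize`, `NaiveBayesSectionClassifier`, and `accuracy`, and the toy Spanish sentences are assumptions made for illustration and do not come from the paper's corpus.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split into alphabetic tokens (accented letters are kept)."""
    return re.findall(r"[^\W\d_]+", text.lower())


class NaiveBayesSectionClassifier:
    """Hypothetical multinomial Naive Bayes over word counts (add-one smoothing)."""

    def fit(self, texts, sections):
        self.word_counts = defaultdict(Counter)  # section -> word -> count
        self.doc_counts = Counter(sections)      # section -> number of articles
        self.vocabulary = set()
        for text, section in zip(texts, sections):
            tokens = tokenize(text)
            self.word_counts[section].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, text):
        # Words never seen in training carry no information and are skipped.
        tokens = [t for t in tokenize(text) if t in self.vocabulary]
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_section, best_score = None, -math.inf
        for section, n_docs in self.doc_counts.items():
            counts = self.word_counts[section]
            total_words = sum(counts.values())
            # Log prior plus sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            for token in tokens:
                score += math.log((counts[token] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_section, best_score = section, score
        return best_section


def accuracy(model, texts, sections):
    """Fraction of texts whose predicted section matches the labeled one."""
    hits = sum(model.predict(t) == s for t, s in zip(texts, sections))
    return hits / len(sections)


# Toy training data, invented for illustration (not from the paper's corpus).
train_texts = [
    "El equipo ganó el partido con un gol en el último minuto",
    "La selección jugará la final del torneo de futbol",
    "El senado aprobó la reforma propuesta por el presidente",
    "Los diputados votaron la nueva ley electoral",
    "El museo inaugura una exposición de pintura mexicana",
    "La orquesta presentó un concierto en el teatro",
]
train_sections = ["deportes", "deportes", "politica", "politica", "cultura", "cultura"]

model = NaiveBayesSectionClassifier().fit(train_texts, train_sections)
prediction = model.predict("El presidente envió una reforma al senado")
print(prediction)
```

On this toy data the unseen sentence is assigned to `politica`. A realistic evaluation would measure accuracy on held-out articles rather than on the training set, which is where a figure such as the reported 80% would be computed.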

Notes

  1. http://www.eluniversal.com.mx/

  2. http://www.excelsior.com.mx/

  3. http://www.jornada.com.mx

Acknowledgments

We thank Instituto Politécnico Nacional (IPN), ESCOM-IPN, SIP-IPN (project numbers 20181102 and 20180859), COFAA-IPN, and EDI-IPN for their support.

Author information

Corresponding author

Correspondence to Omar Gambino Juárez.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

García-Mendoza, C.V., Gambino Juárez, O. (2018). News Article Classification of Mexican Newspapers. In: Mata-Rivera, M., Zagal-Flores, R. (eds) Telematics and Computing. WITCOM 2018. Communications in Computer and Information Science, vol 944. Springer, Cham. https://doi.org/10.1007/978-3-030-03763-5_9

  • DOI: https://doi.org/10.1007/978-3-030-03763-5_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-03762-8

  • Online ISBN: 978-3-030-03763-5

  • eBook Packages: Computer Science, Computer Science (R0)
