Content In-context: Automatic News Contextualization

  • Camilo Restrepo-Arango
  • Claudia Jiménez-Guarín
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 735)

Abstract

News content usually refers to specific facts, people or situations. Understanding the full set of relationships and the context around a given text may require the reader to manually search, filter and connect other sources, at considerable effort. This work uses deep learning techniques to analyze news content and automatically build its context, with the ultimate goal of providing a valuable solution for news readers and news editors, using a real dataset from the most important online newspaper. Using a news article as a seed, we relate and add valuable information to it, providing the understanding and comprehensiveness needed to put information into the user's perspective, based on semantic, non-obvious and time-changing relationships. Context is constructed by modeling news and ontological information using deep learning; the ontological information is extracted from knowledge base sources. Content In-context is a complete solution that applies this approach to a real Colombian online news dataset written in Spanish. Tests are performed on new articles, using data not seen before. Results prove to be interesting compared to classical machine learning methods.

Keywords

Deep learning, Natural language processing, Neural networks

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Computing and Systems Engineering Department, School of Engineering, Universidad de los Andes, Bogotá, Colombia
