A topic recommender for journalists

  • Alessandro Cucchiarelli
  • Christian Morbidoni
  • Giovanni Stilo
  • Paola Velardi
Social Media for Personalization and Search


The way in which people gather information about events and form their opinions on them has changed dramatically with the advent of social media. For many readers, news gathered from online sources has become an opportunity to share points of view and information on micro-blogging platforms such as Twitter, mainly to satisfy their communication needs. Furthermore, the wish to explore aspects of a news story in more depth creates a demand for additional information, which is often met through online encyclopedias such as Wikipedia. This behaviour has also influenced the way in which journalists write their articles, requiring a careful assessment of what actually interests readers. The goal of this paper is to present a recommender system, What to Write and Why, capable of suggesting to a journalist, for a given event, the aspects on which readers focus their interest but which news articles have not yet covered. The basic idea is to characterize an event according to the echo it receives in online news sources and to associate it with the corresponding communicative and informative patterns of readers, detected through the analysis of Twitter and Wikipedia, respectively. Our methodology temporally aligns the results of this analysis and recommends the concepts that emerge as topics of interest from Twitter and Wikipedia and that are either not covered or poorly covered in the published news articles.
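The recommendation idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `recommend_uncovered`, the `coverage_threshold` parameter, the normalization, and the scoring rule (taking the larger of the Twitter and Wikipedia shares) are all assumptions made for illustration, and the inputs are assumed to be concept weights for one event that have already been aligned to the same time window.

```python
def recommend_uncovered(news, twitter, wikipedia,
                        coverage_threshold=0.1, top_k=5):
    """Rank concepts that readers attend to but news covers poorly.

    Each argument maps a concept to a non-negative weight for one event.
    All names and the scoring rule here are hypothetical.
    """
    def normalize(weights):
        # Turn raw weights into shares so the three sources are comparable.
        total = sum(weights.values())
        return {c: w / total for c, w in weights.items()} if total else {}

    news_n = normalize(news)
    tw_n = normalize(twitter)
    wiki_n = normalize(wikipedia)

    scores = {}
    for concept in set(tw_n) | set(wiki_n):
        # Skip concepts that the news already covers well.
        if news_n.get(concept, 0.0) > coverage_threshold:
            continue
        # Assumed score: strongest reader interest across the two sources.
        scores[concept] = max(tw_n.get(concept, 0.0),
                              wiki_n.get(concept, 0.0))

    # Highest interest first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda c: (-scores[c], c))[:top_k]
```

Called with per-source concept weights for a single event, the function returns the concepts of reader interest whose share of news coverage is at or below the threshold. A real system would also need the entity annotation and temporal alignment steps, which this sketch leaves out.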


Recommender systems, Wikipedia, Twitter, Online News, Event detection, Temporal mining



This work has been partially supported by the MIUR under grant “Dipartimenti di eccellenza 2018–2022” of the Department of Computer Science of Sapienza University and by the IBM Faculty Award #2305895190.

Finally, we would like to thank SpazioDati and TextRazor for supporting this research by granting extensive access to their APIs.



Copyright information

© Springer Nature B.V. 2018

Authors and Affiliations

  1. Università Politecnica delle Marche, Ancona, Italy
  2. Sapienza University of Rome, Rome, Italy
