Discovering Authoritative News Sources and Top News Stories

  • Yang Hu
  • Mingjing Li
  • Zhiwei Li
  • Wei-ying Ma
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4182)

Abstract

With the popularity of reading news online, the idea of assembling news articles from multiple news sources and digging out the most important stories has become very appealing. In this paper we present a novel algorithm to rank assembled news articles and news sources according to their importance and authority, respectively. We employ the visual layout information of news homepages and exploit the mutual reinforcement relationship between news articles and news sources. Specifically, we propose to use a label propagation based semi-supervised learning algorithm to improve the structure of the relation graph between sources and news articles. Integrating the label propagation algorithm with a HITS-like mutual reinforcement algorithm produces an effective ranking algorithm. We implement a system, TOPSTORY, which can automatically generate homepages for users to browse important news. The results of ranking a set of news articles collected from multiple sources over a period of half a month illustrate the effectiveness of our algorithm.
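
The mutual reinforcement idea in the abstract can be illustrated with a generic HITS-style iteration on a bipartite source-article graph. This is only a minimal sketch under stated assumptions, not the authors' algorithm: the abstract does not give the update rule, the way visual layout is turned into edge weights, or the label propagation step, so the function name `rank_sources_and_articles`, the `weights` matrix, and the example values are all hypothetical.

```python
from math import sqrt


def _normalize(values):
    """Scale a vector to unit Euclidean length (left unchanged if all zeros)."""
    norm = sqrt(sum(x * x for x in values))
    return [x / norm for x in values] if norm > 0 else list(values)


def rank_sources_and_articles(weights, iterations=50):
    """Generic HITS-style mutual reinforcement (illustrative assumption).

    weights[s][a] is a hypothetical non-negative strength with which
    source s features article a, for example a score that could be derived
    from how prominently the article is placed on the source's homepage.
    Returns (authority per source, importance per article).
    """
    n_sources = len(weights)
    n_articles = len(weights[0])
    authority = [1.0] * n_sources
    importance = [1.0] * n_articles
    for _ in range(iterations):
        # An article is important if authoritative sources feature it strongly.
        importance = _normalize([
            sum(weights[s][a] * authority[s] for s in range(n_sources))
            for a in range(n_articles)
        ])
        # A source is authoritative if it strongly features important articles.
        authority = _normalize([
            sum(weights[s][a] * importance[a] for a in range(n_articles))
            for s in range(n_sources)
        ])
    return authority, importance


# Made-up example: 3 sources (rows) and 3 articles (columns).
weights = [
    [1.0, 0.6, 0.0],
    [0.9, 0.0, 0.1],
    [0.0, 0.0, 0.3],
]
authority, importance = rank_sources_and_articles(weights)
print("source authority:", authority)
print("article importance:", importance)
```

In this toy input, the first article is featured strongly by two sources, so the iteration ranks it highest, and the sources that feature it gain authority in turn. A fuller pipeline along the lines of the abstract would first refine the `weights` graph with label propagation before running the iteration; that step is omitted here because the abstract does not specify it.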

Keywords

News Article, Ranking Algorithm, Label Propagation, News Event, Authoritative Source
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Yang Hu (1)
  • Mingjing Li (2)
  • Zhiwei Li (2)
  • Wei-ying Ma (2)

  1. University of Science and Technology of China, Hefei, China
  2. Microsoft Research Asia, Beijing, China
