Topic and Viewpoint Extraction for Diversity and Bias Analysis of News Contents
News content is a popular and valuable kind of information on the Web. Because news agencies hold different viewpoints and collect different news materials, their perspectives on the same news content may be diverse, that is, biased. In such cases, it is important to indicate this bias and diversity to newsreaders. In this paper, we propose a system called TVBanc (Topic and Viewpoint based Bias Analysis of News Content) that analyzes diversity and bias in Web news content by comparing topics and viewpoints. The topic and viewpoint of a news item are represented by a novel notion called a content structure, which consists of a tuple of subject, aspect, and state terms. Given a news item, TVBanc facilitates bias analysis in three steps. First, TVBanc extracts the topic and viewpoint of that news item based on its content structure. Second, TVBanc searches for related news items from multiple sources such as TV news programs, video news clips, and articles on the Web. Finally, TVBanc groups the related news items into clusters and analyzes their distribution to estimate the diversity and bias of the news contents. The details of the clustering results are also presented to help users understand the different viewpoints of the news contents. This paper also presents experimental results that we obtained to validate the proposed methods.
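The third step above (grouping related items and reading diversity off the cluster distribution) can be illustrated with a minimal sketch. It is not TVBanc's actual method, which the abstract does not specify: the `ContentStructure` class, the Jaccard-based `similarity`, the greedy `cluster` routine with its `threshold`, and the entropy-based `diversity` score are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from math import log


@dataclass(frozen=True)
class ContentStructure:
    """Hypothetical container for the subject, aspect, and state terms of one news item."""
    subject: frozenset
    aspect: frozenset
    state: frozenset


def jaccard(a, b):
    """Jaccard overlap of two term sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity(x, y):
    """Unweighted mean of the three term-set overlaps (an assumed measure)."""
    return (jaccard(x.subject, y.subject)
            + jaccard(x.aspect, y.aspect)
            + jaccard(x.state, y.state)) / 3


def cluster(items, threshold=0.5):
    """Greedy clustering: join the first cluster whose first member is similar enough."""
    clusters = []
    for item in items:
        for members in clusters:
            if similarity(item, members[0]) >= threshold:
                members.append(item)
                break
        else:
            clusters.append([item])
    return clusters


def diversity(clusters):
    """Normalized entropy of cluster sizes: 0 = one viewpoint, 1 = evenly spread."""
    total = sum(len(c) for c in clusters)
    if total == 0 or len(clusters) < 2:
        return 0.0
    entropy = -sum((len(c) / total) * log(len(c) / total) for c in clusters)
    return entropy / log(len(clusters))


# Toy data (invented): two items share a viewpoint, one differs in aspect and state.
a = ContentStructure(frozenset({"election"}), frozenset({"turnout"}), frozenset({"rise"}))
b = ContentStructure(frozenset({"election"}), frozenset({"turnout"}), frozenset({"rise"}))
c = ContentStructure(frozenset({"election"}), frozenset({"fraud"}), frozenset({"alleged"}))

groups = cluster([a, b, c])
score = diversity(groups)
```

With this toy input, `groups` holds two clusters of sizes 2 and 1, and `score` lies strictly between 0 and 1. A score near 0 would suggest that coverage concentrates on one viewpoint, while a score near 1 would suggest viewpoints are spread evenly across clusters.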
Keywords: News Article, State Term, Content Structure, News Item, News Agency