Towards Monitoring of Novel Statements in the News

  • Michael Färber
  • Achim Rettinger
  • Andreas Harth
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9678)


In media monitoring, users have a clearly defined information need: finding previously unknown statements about certain entities or relations mentioned in natural-language text. However, commonly used keyword-based search technologies focus on finding relevant documents and cannot judge the novelty of the statements contained in the text. In this work, we propose a new semantic novelty measure that makes it possible to retrieve statements that are both novel and relevant from natural-language sentences in news articles. Relevance is defined by a semantic query of the user, while novelty is ensured by checking whether the extracted statements are related to, but not yet contained in, a knowledge base holding the currently known facts. Our evaluation, performed on English news texts with CrunchBase as the knowledge base, demonstrates the effectiveness, unique capabilities, and future challenges of this novel approach to novelty.
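The filtering idea described in the abstract can be illustrated with a minimal sketch. It is not the measure proposed in the paper: it assumes that statements are plain subject-predicate-object triples, that a semantic query is a triple pattern in which `None` acts as a wildcard, and that a statement counts as "related" to the knowledge base when at least one of its entities already occurs there. All names (`Statement`, `Query`, `is_relevant`, `is_novel`, `novel_and_relevant`) and the toy data are hypothetical.

```python
from typing import Iterable, List, NamedTuple, Optional, Set


class Statement(NamedTuple):
    """A subject-predicate-object triple (assumed representation)."""
    subject: str
    predicate: str
    obj: str


class Query(NamedTuple):
    """A triple pattern; None means 'match anything' (assumed query form)."""
    subject: Optional[str] = None
    predicate: Optional[str] = None
    obj: Optional[str] = None


def is_relevant(stmt: Statement, query: Query) -> bool:
    # A statement is relevant if every constrained slot of the query matches.
    return all(q is None or q == s for s, q in zip(stmt, query))


def is_novel(stmt: Statement, kb: Set[Statement]) -> bool:
    # Known statements are never novel.
    if stmt in kb:
        return False
    # Assumption: "related" means at least one entity already occurs in the KB.
    known_entities = {e for s in kb for e in (s.subject, s.obj)}
    return stmt.subject in known_entities or stmt.obj in known_entities


def novel_and_relevant(extracted: Iterable[Statement], query: Query,
                       kb: Set[Statement]) -> List[Statement]:
    # Keep only statements that satisfy the query and are new to the KB.
    return [s for s in extracted if is_relevant(s, query) and is_novel(s, kb)]


# Toy data, invented for illustration only.
kb = {Statement("AcmeCorp", "acquired", "WidgetCo")}
extracted = [
    Statement("AcmeCorp", "acquired", "WidgetCo"),   # already known
    Statement("AcmeCorp", "acquired", "GadgetInc"),  # related and new
    Statement("OtherCorp", "acquired", "ThingLtd"),  # does not match the query
]
query = Query(subject="AcmeCorp", predicate="acquired")
results = novel_and_relevant(extracted, query, kb)
```

In this sketch, only the second statement survives both checks. A real system would additionally need statement extraction from text and a mapping of entities and relations to knowledge-base identifiers, which the sketch leaves out.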


Keywords: Semantic novelty measures, Novelty detection, Statement extraction



This work was carried out with the support of the German Federal Ministry of Education and Research (BMBF) within the Software Campus project SUITE (Grant 01IS12051).



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Michael Färber (1)
  • Achim Rettinger (1)
  • Andreas Harth (1)
  1. Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
