Real-Time News Event Extraction for Global Crisis Monitoring

  • Hristo Tanev
  • Jakub Piskorski
  • Martin Atkinson
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5039)

Abstract

This paper presents a real-time news event extraction system developed by the Joint Research Centre of the European Commission. It extracts violent and disaster events from online news accurately and efficiently, without relying on much linguistic sophistication. In particular, our linguistically lightweight approach to event extraction heavily exploits clustered news at various stages of processing. The paper describes the system’s architecture, news geo-tagging, automatic pattern learning, the pattern specification language, information aggregation, the issues of integrating event information into a global crisis monitoring system, and a new experimental evaluation.

Keywords

information extraction, event extraction, processing massive datasets, machine learning, finite-state technology

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Hristo Tanev (1)
  • Jakub Piskorski (1)
  • Martin Atkinson (1)
  1. Joint Research Center of the European Commission, Web and Language Technology Group of IPSC, Ispra (VA), Italy
