JustEvents: A Crowdsourced Corpus for Event Validation with Strict Temporal Constraints

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10193)


Inspecting text to affirm the occurrence of an event is a non-trivial task. Because events are tied to temporal attributes, the task is more complex than merely identifying evidence in a document of entities acting together and thus defining the event. Manual inspection is the typical solution, but it is onerous and becomes infeasible as the number of documents grows. Therefore, the task of automatically determining whether an event occurs in a document or corpus, known as event validation, has recently been investigated. In this paper, we present a dataset for benchmarking event validation methods. Events and documents are coupled in pairs, and the validity of each pair has been judged by human evaluators based on whether the document contains evidence of the given event. In contrast to the notion of relevance considered in available datasets for event detection, validity judgments in this work strictly consider whether a document reports an event within its timespan, as well as the number of event participants reported in the document. These requirements make the generation of manual validity judgments an onerous procedure. The ground truth, made of multiple judgments for each pair, has been acquired through crowdsourcing.
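To make the structure of the ground truth concrete, the following is a minimal sketch of how an event-document pair with multiple crowd judgments might be represented and reduced to a single label. The class names, field names, and the majority-vote rule are assumptions made for illustration; the abstract does not state the dataset's actual schema or how judgments are aggregated.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Event:
    # Hypothetical fields: an event is described by its participants and timespan.
    participants: List[str]
    start: date
    end: date


@dataclass
class EventDocumentPair:
    # Hypothetical record: one event coupled with one document,
    # plus the boolean validity judgments collected from several workers.
    event: Event
    document_id: str
    judgments: List[bool] = field(default_factory=list)


def majority_label(pair: EventDocumentPair) -> Optional[bool]:
    """Aggregate judgments by simple majority (an assumed rule).

    Returns None when there are no judgments or the vote is tied.
    """
    counts = Counter(pair.judgments)
    if counts[True] == counts[False]:
        return None
    return counts[True] > counts[False]
```

With three judgments of valid, valid, and invalid, `majority_label` returns `True`. A tied or empty list returns `None`, so such pairs can be handled separately rather than being forced into a label.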


Keywords: Event validation, Evaluation, Event detection, Crowdsourcing, Human computation



This work was partially funded by the European Commission in the context of the FP7 ICT project QualiMaster (grant number: 619525) and the H2020 ICT project AFEL (grant number: 687916).



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. L3S Research Center, Hannover, Germany
  2. Risk Ident GmbH, Hamburg, Germany
