Abstract
Event-based search systems have attracted increasing interest. This paper provides an overview of recent advances in event-based text mining, with an emphasis on biomedical text. We focus particularly on enriching events with information about how they should be interpreted in light of their surrounding textual and discourse context. We describe the annotation scheme we use to capture this information at the event level, report on the corpora that have so far been enriched according to this scheme, and provide details of our experiments to recognise this information automatically.
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ananiadou, S., Thompson, P., Nawaz, R. (2013). Enhancing Search: Events and Their Discourse Context. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2013. Lecture Notes in Computer Science, vol 7817. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37256-8_27
DOI: https://doi.org/10.1007/978-3-642-37256-8_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37255-1
Online ISBN: 978-3-642-37256-8
eBook Packages: Computer Science, Computer Science (R0)