BioExcom: Detection and Categorization of Speculative Sentences in Biomedical Literature

  • Julien Desclés
  • Motasem Alrahabi
  • Jean-Pierre Desclés
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6562)


Biological research papers are replete with speculative sentences. We present BioExcom, a rule-based system that detects speculations in biomedical literature. It also automatically distinguishes between prior speculations and new speculations in the analyzed paper. BioExcom is based on Contextual Exploration processing (a hierarchical search for linguistic surface markers, run on the EXCOM computational platform). To accomplish this task, BioExcom also uses specific linguistic resources established through a concise semantic analysis performed by a biologist and a linguist. Our work shows that it is possible to detect and categorize speculative sentences without computationally deep linguistic analysis. This work could be useful for biologists who are interested in finding new hypotheses in the literature.
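As a rough illustration of the idea summarized above, the following minimal Python sketch first looks for a speculation indicator in a sentence and only then searches the same sentence for complementary clues that suggest whether the speculation is prior or new. It is not BioExcom or the EXCOM platform: the marker lists, the function name `categorize`, and the category labels are all hypothetical, chosen only to show the two-step search over surface markers.

```python
import re
from typing import Optional

# Hypothetical marker lists for illustration only; these are NOT the
# linguistic resources used by BioExcom.
INDICATORS = re.compile(
    r"\b(may|might|could|suggest(?:s|ed)?|hypothesi[sz]e(?:s|d)?"
    r"|propose(?:s|d)?|possibly|likely)\b",
    re.IGNORECASE,
)
# Clues hinting that the speculation is attributed to earlier work.
PRIOR_CLUES = re.compile(
    r"\b(previously|earlier studies|(?:has|have) been (?:proposed|suggested)"
    r"|et al\.)|\[\d+\]",
    re.IGNORECASE,
)
# Clues hinting that the speculation is made by the authors themselves.
NEW_CLUES = re.compile(
    r"\b(we|our|here|in this (?:study|paper|work))\b",
    re.IGNORECASE,
)


def categorize(sentence: str) -> Optional[str]:
    """Return a speculation label for the sentence, or None if no indicator is found."""
    # Step 1: look for a primary indicator of speculation.
    if not INDICATORS.search(sentence):
        return None
    # Step 2: explore the sentence context for complementary clues.
    if PRIOR_CLUES.search(sentence):
        return "prior speculation"
    if NEW_CLUES.search(sentence):
        return "new speculation"
    return "speculation"


if __name__ == "__main__":
    for s in [
        "Our results suggest that protein X may regulate Y.",
        "It has been proposed that X could activate Y [12].",
        "Protein X binds Y.",
    ]:
        print(categorize(s), "|", s)
```

The sketch relies only on pattern matching over the surface text, in line with the claim that no deep linguistic analysis is needed. A real system would need far richer, expert-built marker sets and rules.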


Keywords: speculation, hypothesis, biology, contextual exploration, categorization, text mining





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Julien Desclés
    • 1
  • Motasem Alrahabi
    • 1
  • Jean-Pierre Desclés
    • 1
  1. Maison de la Recherche, LaLIC, Université Paris-Sorbonne, Paris, France
