Biomedical Text Mining Applied to Document Retrieval and Semantic Indexing

  • Anália Lourenço
  • Sónia Carneiro
  • Eugénio C. Ferreira
  • Rafael Carreira
  • Luis M. Rocha
  • Daniel Glez-Peña
  • José R. Méndez
  • Florentino Fdez-Riverola
  • Fernando Diaz
  • Isabel Rocha
  • Miguel Rocha
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5518)

Abstract

In biomedical research, the ability to retrieve relevant information from the ever-growing literature is an extremely important asset. This work provides an enhanced, general-purpose approach to document retrieval that enables the filtering of PubMed query results. The system is based on semantic indexing: for each set of retrieved documents, it builds a network that links documents to relevant terms obtained by annotating biological entities (e.g. genes or proteins). This network provides distinct user perspectives, allows navigation over documents with similar terms, and is also used to assess document relevance. A network learning procedure, based on previous work on e-mail spam filtering, is proposed; it receives as input a training set of manually classified documents.
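
The idea of a network linking documents and annotated terms can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`build_network`, `related_documents`), the document identifiers, and the annotations are all hypothetical, and ranking by shared-term count is an assumed stand-in for whatever similarity the system actually uses.

```python
from collections import defaultdict


def build_network(annotations):
    """Build a bipartite index: document -> terms and term -> documents."""
    doc_terms = {doc: set(terms) for doc, terms in annotations.items()}
    term_docs = defaultdict(set)
    for doc, terms in doc_terms.items():
        for term in terms:
            term_docs[term].add(doc)
    return doc_terms, dict(term_docs)


def related_documents(doc, doc_terms, term_docs):
    """Rank other documents by the number of terms they share with `doc`."""
    shared = defaultdict(int)
    for term in doc_terms[doc]:
        for other in term_docs[term]:
            if other != doc:
                shared[other] += 1
    # Sort by shared-term count (descending), then by identifier for stability.
    return sorted(shared.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical annotations: document identifiers mapped to recognised entities.
annotations = {
    "doc_a": ["recA", "lexA", "SOS response"],
    "doc_b": ["recA", "lexA"],
    "doc_c": ["lexA", "rpoS"],
    "doc_d": ["ompF"],
}

doc_terms, term_docs = build_network(annotations)
print(related_documents("doc_a", doc_terms, term_docs))
```

Here navigating from `doc_a` reaches `doc_b` (two shared terms) before `doc_c` (one shared term), while `doc_d` shares no terms and is not reachable.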

Keywords

Biomedical Document Retrieval; Document Relevance; Enhanced Instance Retrieval Network; Named Entity Recognition; Semantic Indexing; Document Network

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Anália Lourenço (1)
  • Sónia Carneiro (1)
  • Eugénio C. Ferreira (1)
  • Rafael Carreira (1, 2)
  • Luis M. Rocha (3)
  • Daniel Glez-Peña (4)
  • José R. Méndez (4)
  • Florentino Fdez-Riverola (4)
  • Fernando Diaz (5)
  • Isabel Rocha (1)
  • Miguel Rocha (2)

  1. IBB/CEB, University of Minho, Campus Gualtar, Braga, Portugal
  2. CCTC, University of Minho, Braga, Portugal
  3. School of Informatics, Indiana University, Bloomington, USA
  4. Computer Science Dept., Univ. Vigo, Ourense, Spain
  5. Computer Science Department, University of Valladolid, Segóvia, Spain
