Biomedical Literature Mining for Text Classification and Construction of Gene Networks

  • Despoina Antonakaki
  • Alexandros Kanterakis
  • George Potamias
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3955)


A multi-layered biomedical literature mining approach is presented, aimed at discovering gene-gene correlations and constructing the corresponding gene networks. Use of the Trie-memory data structure enables efficient handling of different gene nomenclatures. The approach is coupled with a method for classifying texts (biomedical abstracts). Experimental validation and evaluation results indicate that the approach is sound, efficient and reliable.
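To illustrate how a trie can reconcile different gene nomenclatures, the following is a minimal sketch, not the authors' implementation. It assumes a small hand-made alias table (the gene names and aliases are illustrative), case-insensitive matching, and plain per-abstract co-occurrence counting as a stand-in for whatever correlation measure the paper actually uses. The names `GeneTrie`, `find_genes` and `co_occurrence_edges` are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Iterable, List, Optional, Tuple


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        # Canonical gene symbol if some alias ends at this node.
        self.symbol: Optional[str] = None


def _is_word_char(text: str, pos: int) -> bool:
    """True if pos is inside the text and holds a letter or digit."""
    return 0 <= pos < len(text) and text[pos].isalnum()


class GeneTrie:
    """Maps every alias of a gene to one canonical symbol (assumed scheme)."""

    def __init__(self) -> None:
        self.root = TrieNode()

    def insert(self, alias: str, symbol: str) -> None:
        node = self.root
        for ch in alias.lower():
            node = node.children.setdefault(ch, TrieNode())
        node.symbol = symbol

    def _longest_match(self, text: str, start: int) -> Optional[Tuple[int, str]]:
        """Longest alias starting at `start` that ends on a word boundary."""
        node = self.root
        best = None
        for i in range(start, len(text)):
            node = node.children.get(text[i])
            if node is None:
                break
            if node.symbol is not None and not _is_word_char(text, i + 1):
                best = (i + 1, node.symbol)
        return best

    def find_genes(self, text: str) -> List[str]:
        """Canonical symbols of all aliases mentioned in `text`, in order."""
        lowered = text.lower()
        found: List[str] = []
        i = 0
        while i < len(lowered):
            match = None
            if not _is_word_char(lowered, i - 1):  # only start at word starts
                match = self._longest_match(lowered, i)
            if match:
                found.append(match[1])
                i = match[0]
            else:
                i += 1
        return found


def co_occurrence_edges(trie: GeneTrie, abstracts: Iterable[str]) -> Counter:
    """Count, per gene pair, the abstracts in which both genes appear."""
    edges: Counter = Counter()
    for abstract in abstracts:
        genes = sorted(set(trie.find_genes(abstract)))
        edges.update(combinations(genes, 2))
    return edges


def build_example_trie() -> GeneTrie:
    # Illustrative alias table; a real system would load a nomenclature source.
    trie = GeneTrie()
    for alias, symbol in [
        ("TP53", "TP53"), ("p53", "TP53"), ("tumor protein p53", "TP53"),
        ("BRCA1", "BRCA1"), ("breast cancer 1", "BRCA1"),
        ("MDM2", "MDM2"),
    ]:
        trie.insert(alias, symbol)
    return trie
```

Because every alias ends at a node that stores the same canonical symbol, lookups cost time proportional to the length of the matched name rather than to the size of the alias table, and mentions written under different nomenclatures collapse onto a single network node.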


Frequent Itemset Mining; Bell System Technical Journal; Supervise Learning Technique; Mutual Information Measure; Efficient Manipulation
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Despoina Antonakaki (1)
  • Alexandros Kanterakis (1)
  • George Potamias (1)
  1. Institute of Computer Science, Foundation for Research and Technology (FORTH), Heraklion, Crete, Greece
