Biological Information Extraction and Co-occurrence Analysis

Biomedical Literature Mining

Part of the book series: Methods in Molecular Biology ((MIMB,volume 1159))

Abstract

It is now possible to identify terms corresponding to biological entities within passages of biomedical text corpora; the critical next step is to detect their potential relationships. These relationships are typically detected by co-occurrence analysis, which reveals associations between bioentities through their coexistence in single sentences and/or entire abstracts. These associations implicitly define networks whose nodes represent terms, bioentities, or concepts and whose edges represent the relationships connecting them; edge weights can represent the confidence in these semantic connections.
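
As a rough illustration of the idea, the sketch below counts sentence-level co-occurrences of known terms and returns them as weighted edges. It is a minimal example under stated assumptions, not the method used by any tool discussed in this chapter: the function name `cooccurrence_edges`, the placeholder terms `geneA`, `geneB`, and `geneC`, and the toy sentences are all invented for illustration, and the raw co-occurrence count stands in for an edge confidence score.

```python
import re
from collections import Counter
from itertools import combinations


def cooccurrence_edges(sentences, terms):
    """Count how often each pair of terms appears in the same sentence.

    Returns a Counter mapping (term_a, term_b) pairs, sorted
    alphabetically within the pair, to a raw co-occurrence count.
    """
    vocabulary = set(terms)
    counts = Counter()
    for sentence in sentences:
        # Naive tokenization on alphanumeric runs (an assumption; real
        # systems use dedicated named-entity recognition instead).
        tokens = set(re.findall(r"[A-Za-z0-9]+", sentence))
        present = sorted(vocabulary & tokens)
        # Every unordered pair of terms found in the sentence is one edge hit.
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts


# Toy input: invented sentences with placeholder entity names.
sentences = [
    "geneA binds geneB and geneC.",
    "geneA and geneB are co-expressed.",
    "geneC is unrelated to the others.",
]
terms = ["geneA", "geneB", "geneC"]

edges = cooccurrence_edges(sentences, terms)
for (a, b), weight in sorted(edges.items()):
    print(a, b, weight)
```

Here each key of `edges` is a network edge between two term nodes and each value is its weight. Passing whole abstracts instead of sentences gives abstract-level co-occurrence with the same code.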

This chapter reviews current methods for co-occurrence analysis, focusing on data storage, analysis, and representation. We highlight scenarios in which these approaches are implemented by tools for information extraction and knowledge inference in systems biology. We illustrate the practical utility of two online resources providing services of this type, namely STRING and BioTextQuest, and conclude with a discussion of current challenges and future perspectives in the field.



Acknowledgments

The work was supported in part by the European Commission FP7 programme “Translational Potential” (TransPOT; EC contract number 285948).

Author information

Corresponding author

Correspondence to Ioannis Iliopoulos.

Copyright information

© 2014 Springer Science+Business Media New York

About this protocol

Cite this protocol

Pavlopoulos, G.A., Promponas, V.J., Ouzounis, C.A., Iliopoulos, I. (2014). Biological Information Extraction and Co-occurrence Analysis. In: Kumar, V., Tipney, H. (eds) Biomedical Literature Mining. Methods in Molecular Biology, vol 1159. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-0709-0_5

  • DOI: https://doi.org/10.1007/978-1-4939-0709-0_5

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-0708-3

  • Online ISBN: 978-1-4939-0709-0

  • eBook Packages: Springer Protocols
