Abstract
One of the most exciting goals of literature-based discovery is the inference of new, previously undocumented relationships from an analysis of known relationships. The human ability to read and assimilate scientific information has long lagged behind the rate at which new information is produced, and the rapid accumulation of published literature has exacerbated the problem. The idea that a computer could begin to take over part of the hypothesis-formation process, long solely within the domain of human reason, has been met with both skepticism and excitement, and both are fully merited. Several studies have already demonstrated in principle that a computational approach to literature analysis can generate novel and fruitful hypotheses. The biggest barriers to progress in this field are technical in nature and stem mostly from the shortcomings that computers have relative to humans in understanding the nature, importance and implications of relationships found in the literature. This chapter discusses how far current efforts have brought us toward solving the open-discovery problem and what barriers are limiting further progress.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Wren, J.D. (2008). The ‘Open Discovery’ Challenge. In: Bruza, P., Weeber, M. (eds) Literature-based Discovery. Information Science and Knowledge Management, vol 15. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68690-3_4
DOI: https://doi.org/10.1007/978-3-540-68690-3_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68685-9
Online ISBN: 978-3-540-68690-3
eBook Packages: Computer Science (R0)