Abstract
Literature discovery can be characterized as a goal-directed search for previously unknown, implicit knowledge captured within a collection of scientific articles. Swanson’s serendipitous discovery, made while browsing Medline (an online collection of biomedical literature), that dietary fish oil could treat Raynaud’s disease exemplifies such a discovery. Through a series of experiments, the impact of stop words, weighting schemes, discovery mechanisms, and contextual reduction is studied in relation to replicating the Raynaud/fish-oil and migraine-magnesium discoveries by operational means. Two aspects of discovery are brought into focus: (i) the discovery of intermediate terms, or B-terms, and (ii) the discovery of indirect A–C connections via the B-terms. A semantic space representation of the underlying corpus is computed, and discoveries are automated by computing associations between words in both higher-dimensional and contextually reduced spaces. It was found that the discovery of B-terms and A–C connections can be achieved to an encouraging degree with a standard stop word list. In addition, no single weighting scheme seems to suffice. Log-likelihood appears to be potentially effective for discovering B-terms, whereas odds ratio and simple co-occurrence frequencies both facilitate the discovery of A–C connections. With regard to discovery mechanism, both semantic similarity (via cosine) and information flow computation seem promising for computing A–C connections, but more research is needed to understand their relative strengths and weaknesses. Discovery in a contextually reduced semantic space yielded mixed results.
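The idea of finding an indirect A–C connection through shared B-terms can be illustrated with a minimal sketch. It is not the authors' implementation: the toy titles, the window size, the stop word set, and the function names `cooccurrence_space`, `cosine`, and `shared_terms` are all assumptions made for illustration, and raw co-occurrence counts stand in for the weighting schemes the abstract compares.

```python
import math
from collections import defaultdict


def cooccurrence_space(docs, window=5, stop_words=frozenset()):
    """Map each word to a sparse vector of co-occurrence counts.

    Two words co-occur when they appear within `window` tokens of each
    other in the same document (stop words are dropped first).
    """
    space = defaultdict(lambda: defaultdict(float))
    for doc in docs:
        tokens = [t for t in doc.lower().split() if t not in stop_words]
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    space[word][tokens[j]] += 1.0
    return space


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def shared_terms(u, v):
    """Candidate B-terms: dimensions that are non-zero in both vectors."""
    return sorted(set(u) & set(v))


# Hypothetical toy "titles"; the A-term and C-term never share a title.
docs = [
    "raynaud patients show high blood viscosity",
    "fish oil reduces blood viscosity",
    "raynaud involves platelet aggregation",
    "fish oil inhibits platelet aggregation",
    "magnesium deficiency in soil",
]
stop_words = frozenset({"in", "show", "the"})

space = cooccurrence_space(docs, window=5, stop_words=stop_words)

a_term = "raynaud"
for c_term in ("fish", "magnesium"):
    score = cosine(space[a_term], space[c_term])
    print(c_term, round(score, 3), shared_terms(space[a_term], space[c_term]))
```

In this toy corpus "raynaud" and "fish" never appear in the same title, yet their vectors overlap on the shared context words, so the cosine is non-zero. Replacing the raw counts with log-likelihood or odds-ratio weights would change only how the vectors in `space` are filled, not the comparison step.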
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cole, R.J., Bruza, P.D. (2005). A Bare Bones Approach to Literature-Based Discovery: An Analysis of the Raynaud’s/Fish-Oil and Migraine-Magnesium Discoveries in Semantic Space. In: Hoffmann, A., Motoda, H., Scheffer, T. (eds) Discovery Science. DS 2005. Lecture Notes in Computer Science, vol 3735. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11563983_9
DOI: https://doi.org/10.1007/11563983_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29230-2
Online ISBN: 978-3-540-31698-5
eBook Packages: Computer Science, Computer Science (R0)