Bisociative Knowledge Discovery pp 33-50

Part of the Lecture Notes in Computer Science book series (LNCS, volume 7250)

From Information Networks to Bisociative Information Networks

  • Tobias Kötter
  • Michael R. Berthold

Abstract

The integration of heterogeneous data from various domains without the need for prefiltering prepares the ground for bisociative knowledge discoveries, in which attempts are made to find unexpected relations across seemingly unrelated domains. Information networks, due to their flexible data structure, lend themselves well to the integration of these heterogeneous data sources. This chapter provides an overview of different types of information networks and categorizes them by identifying several key properties of information units and relations. These properties reflect the expressiveness of an information network and thus its ability to model heterogeneous data from diverse domains. The chapter then describes a new type of information network known as bisociative information networks. This kind of network combines the key properties of existing networks in order to provide the foundation for bisociative knowledge discoveries. Finally, based on this data structure, three different patterns are described that fulfill the requirements of a bisociation by connecting concepts from seemingly unrelated domains.

Copyright information

© The Author(s) 2012

Authors and Affiliations

  • Tobias Kötter (1)
  • Michael R. Berthold (1)

  1. Nycomed-Chair for Bioinformatics and Information Mining, University of Konstanz, Konstanz, Germany
