Toward a Semantic Framework for the Querying, Mining and Visualization of Cancer Microenvironment Data

  • Michelangelo Ceci
  • Fabio Fumarola
  • Pietro Hiram Guzzi
  • Federica Mandreoli
  • Riccardo Martoglia
  • Elio Masciari
  • Massimo Mecella
  • Wilma Penzo
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7451)

Abstract

Over the last decade, advances in high-throughput omic technologies have made it possible to profile tumor cells at different levels, fostering the discovery of new biological data and the proliferation of a large number of bio-technological databases. In this paper we describe a framework that enables interoperability among different biological data sources and ultimately supports expert users in the complex process of extracting, navigating and visualizing the knowledge hidden in such a large quantity of data. The system will be used in a pilot study on Multiple Myeloma (MM).

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Michelangelo Ceci
    • 4
  • Fabio Fumarola
    • 4
  • Pietro Hiram Guzzi
    • 3
  • Federica Mandreoli
    • 6
  • Riccardo Martoglia
    • 6
  • Elio Masciari
    • 1
  • Massimo Mecella
    • 2
  • Wilma Penzo
    • 5
  1. ICAR-CNR, Italy
  2. La Sapienza University, Italy
  3. Magna Graecia University, Italy
  4. University of Bari, Italy
  5. DEIS, University of Bologna, Italy
  6. DII, University of Modena and Reggio Emilia, Italy