Cancer Data Integration and Querying with GeneTegra

  • E. Patrick Shironoshita
  • Yves R. Jean-Mary
  • Ray M. Bradley
  • Patricia Buendia
  • Mansur R. Kabuka
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7348)

Abstract

We present GeneTegra, an ontology-based information integration environment. We demonstrate its ability to query multiple data sources and evaluate the relative performance of different data repositories. GeneTegra uses Semantic Web standards to resolve the semantic and syntactic diversity of the large and increasingly complex body of publicly available data. It provides mechanisms to create ontology models of data sources using the OWL 2 Web Ontology Language, and to define, plan, and execute queries against these models using the SPARQL query language. Supported data source formats include relational databases, XML, and RDF. Experimental results show that GeneTegra obtains equivalent results from different data repositories containing the same data, illustrating that the proposed methods can query heterogeneous sources using the same modeling paradigm.
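The idea of obtaining equivalent results from different repositories holding the same data can be illustrated with a minimal sketch. This is not GeneTegra's implementation: it uses only the Python standard library, replaces OWL 2 models with a tiny hypothetical vocabulary (`ex:hasGene`, `ex:hasTissue`), and replaces SPARQL with a single triple-pattern match. The `sample` table, the XML layout, and all function names are assumptions made for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical shared vocabulary (a stand-in for an ontology model of the sources).
HAS_GENE = "ex:hasGene"
HAS_TISSUE = "ex:hasTissue"


def triples_from_sql(conn):
    """Map rows of a hypothetical 'sample' table to (subject, predicate, object) triples."""
    triples = set()
    for sample_id, gene, tissue in conn.execute("SELECT id, gene, tissue FROM sample"):
        subject = f"ex:sample/{sample_id}"
        triples.add((subject, HAS_GENE, gene))
        triples.add((subject, HAS_TISSUE, tissue))
    return triples


def triples_from_xml(xml_text):
    """Map <sample> elements of a hypothetical XML document to the same triple shape."""
    triples = set()
    for node in ET.fromstring(xml_text).iter("sample"):
        subject = f"ex:sample/{node.get('id')}"
        triples.add((subject, HAS_GENE, node.findtext("gene")))
        triples.add((subject, HAS_TISSUE, node.findtext("tissue")))
    return triples


def samples_with(triples, predicate, value):
    """Answer the pattern '?s <predicate> <value>' over a set of triples."""
    return sorted(s for s, p, o in triples if p == predicate and o == value)


# The same illustrative records are stored in two different repositories.
ROWS = [(1, "TP53", "breast"), (2, "BRCA1", "breast"), (3, "TP53", "lung")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample (id INTEGER, gene TEXT, tissue TEXT)")
conn.executemany("INSERT INTO sample VALUES (?, ?, ?)", ROWS)

XML_TEXT = "<samples>" + "".join(
    f'<sample id="{i}"><gene>{g}</gene><tissue>{t}</tissue></sample>'
    for i, g, t in ROWS
) + "</samples>"

# One query, expressed against the shared model, runs over both sources.
sql_answer = samples_with(triples_from_sql(conn), HAS_GENE, "TP53")
xml_answer = samples_with(triples_from_xml(XML_TEXT), HAS_GENE, "TP53")
print(sql_answer == xml_answer, sql_answer)
```

Because both mappings target the same vocabulary, the query itself never needs to know which repository format it is reading from.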

Keywords

Data integration, ontology, Semantic Web, SPARQL, OWL



Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • E. Patrick Shironoshita (1)
  • Yves R. Jean-Mary (1)
  • Ray M. Bradley (1)
  • Patricia Buendia (1)
  • Mansur R. Kabuka (1, 2)
  1. INFOTECH Soft, Inc., Miami, USA
  2. University of Miami, Coral Gables, USA
