Conceptual Modeling of Human Genome: Integration Challenges

  • Oscar Pastor
  • Juan Carlos Casamayor
  • Matilde Celma
  • Laura Mota
  • M. Ángeles Pastor
  • Ana M. Levin
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7260)


While Information Systems (IS) principles have been successfully applied to the design, implementation and management of a diverse set of domains, the Bioinformatics domain in general, and the Genomic one in particular, often lacks a rigorous IS background based on elaborating a precise Conceptual Model in which the relevant concepts of the domain are properly defined. Instead, current genomic data repositories focus on the solution space in the form of diverse, ad hoc databases that tend to be hard to manage, evolve and interconnect. Conceptual Modeling as a central strategy is thus far removed from the current biological data source ontologies, which are heterogeneous, imprecise and too often inconsistent with one another. To solve this problem, the latest version of a concrete Conceptual Schema for the Human Genome (CSHG) is introduced in this chapter. With a holistic perspective, the CSHG focuses on the different genomic views that must be integrated and emphasizes the value of the approach for dealing appropriately with the challenge of correctly interpreting the human genome.


Conceptual Modeling, Information Systems in Bioinformatics, Semantic Integration, Human Genome





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Oscar Pastor (1)
  • Juan Carlos Casamayor (1)
  • Matilde Celma (1)
  • Laura Mota (1)
  • M. Ángeles Pastor (1)
  • Ana M. Levin (1)
  1. Centro de Investigación en Métodos de Producción de Software (PROS), Departamento de Sistemas Informáticos y Computación, Universitat Politècnica de València, Valencia, Spain
