SNP-Converter: An Ontology-Based Solution to Reconcile Heterogeneous SNP Descriptions for Pharmacogenomic Studies

  • Adrien Coulet
  • Malika Smaïl-Tabbone
  • Pascale Benlian
  • Amedeo Napoli
  • Marie-Dominique Devignes
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4075)


Pharmacogenomics explores the impact of individual genomic variations on health problems such as adverse drug reactions. Records of millions of genomic variations, mostly known as Single Nucleotide Polymorphisms (SNPs), are available today in various overlapping and heterogeneous databases. Selecting and extracting a proper set of polymorphisms from these databases or from private sources is the first step of a KDD (Knowledge Discovery in Databases) process in pharmacogenomics. It is, however, a tedious task hampered by the heterogeneity of SNP nomenclatures and annotations. Standards for representing genomic variants have been proposed by the Human Genome Variation Society (HGVS). The SNP-Converter application aims to convert any SNP description into an HGVS-compliant pivot description and vice versa. Used within a knowledge system, the SNP-Converter application acts as a wrapper that contributes to semantic data integration and enrichment.
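To make the idea of a pivot description concrete, the following is a minimal sketch, not the SNP-Converter implementation, which the abstract does not detail. It assumes the simplest HGVS form only, a genomic substitution written as `reference:g.positionRef>Alt`. The names `PivotSnp`, `parse_hgvs_substitution` and `format_hgvs_substitution` are hypothetical, and the position in the example string is made up for illustration.

```python
import re
from dataclasses import dataclass

# Assumption: only simple genomic substitutions are handled,
# e.g. "NC_000001.10:g.12345A>G" (illustrative, made-up position).
_HGVS_SUBSTITUTION = re.compile(
    r"^(?P<reference>[A-Za-z0-9_.]+):g\.(?P<position>\d+)"
    r"(?P<ref>[ACGT])>(?P<alt>[ACGT])$"
)


@dataclass(frozen=True)
class PivotSnp:
    """Hypothetical pivot record: one SNP as structured fields."""
    reference: str   # reference sequence identifier
    position: int    # 1-based position on the reference sequence
    ref_allele: str  # nucleotide on the reference sequence
    alt_allele: str  # observed variant nucleotide


def parse_hgvs_substitution(text: str) -> PivotSnp:
    """Turn an HGVS-style substitution string into a pivot record."""
    match = _HGVS_SUBSTITUTION.match(text.strip())
    if match is None:
        raise ValueError(f"not a simple genomic substitution: {text!r}")
    return PivotSnp(
        reference=match["reference"],
        position=int(match["position"]),
        ref_allele=match["ref"],
        alt_allele=match["alt"],
    )


def format_hgvs_substitution(snp: PivotSnp) -> str:
    """Turn a pivot record back into an HGVS-style substitution string."""
    return f"{snp.reference}:g.{snp.position}{snp.ref_allele}>{snp.alt_allele}"


if __name__ == "__main__":
    snp = parse_hgvs_substitution("NC_000001.10:g.12345A>G")
    print(snp)
    print(format_hgvs_substitution(snp))
```

Keeping the pivot as structured fields rather than a string is one possible design: a description from any source can be mapped to the same record, and the "vice versa" direction becomes a formatting step.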


Keywords (added by machine, not by the authors): Single Nucleotide Polymorphism; Genomic Variation; UCSC Genome Browser; Pharmacogenomic Study; Private Database





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Adrien Coulet (1, 2)
  • Malika Smaïl-Tabbone (2)
  • Pascale Benlian (3)
  • Amedeo Napoli (2)
  • Marie-Dominique Devignes (2)

  1. KIKA Medical, Paris, France
  2. LORIA (UMR 7503 CNRS-INPL-INRIA-Nancy2-UHP, Campus scientifique), Vandoeuvre-lès-Nancy, France
  3. INSERM UMRS 538, Biochimie – Biologie Moléculaire, Université Pierre et Marie Curie – Paris6, Paris, France
