IMGT-KG: A Knowledge Graph for Immunogenetics

  • Conference paper
The Semantic Web – ISWC 2022 (ISWC 2022)

Abstract

Knowledge graphs are emerging as one of the most popular means for data federation, transformation, integration and sharing, promising to improve data visibility and reusability. Immunogenetics is the branch of life sciences that studies the genetics of the immune system. The complexity and connected nature of immunogenetics data make knowledge graphs a natural choice for representing and describing immunogenetics entities and relations, which in turn enables a wide range of applications. Yet little effort has so far been directed towards building and using such knowledge graphs. In this work, we present the IMGT Knowledge Graph (IMGT-KG), the first FAIR knowledge graph of its kind in immunogenetics. IMGT-KG acquires and integrates data from different immunogenetics databases, thereby creating links between them. As a result, IMGT-KG provides access to 79 670 110 triples with 10 430 268 entities, 673 concepts and 173 properties. IMGT-KG reuses many existing terms from domain ontologies and vocabularies and provides external links to other resources of the same domain, as well as a set of rules that guide inference on nucleotide sequence positions by applying Allen Interval Algebra. Such inference allows, for example, reasoning about genomic sequence positions. IMGT-KG bridges the gap between genomic and protein sequences and opens the way to effective queries and integrative immuno-omics analyses. We make IMGT-KG openly and freely available, with detailed documentation and a Web interface for access and exploration.
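
To make the idea of interval-based reasoning over sequence positions concrete, the following Python sketch classifies the Allen relation that holds between two sequence features. It is an illustration only and not IMGT-KG's rule set: the `Feature` class, the `allen_relation` function, and the labels and coordinates are all hypothetical. It also assumes endpoint semantics with `start < end`, so that two features "meet" when the end of one equals the start of the other; how IMGT-KG itself maps nucleotide coordinates onto these relations is not stated here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """A labelled region of a sequence (hypothetical helper type)."""
    label: str
    start: int
    end: int

    def __post_init__(self):
        # Assumption: proper intervals only, as in Allen's algebra.
        if self.start >= self.end:
            raise ValueError("expected start < end")


def allen_relation(a: Feature, b: Feature) -> str:
    """Return the one Allen relation (out of 13) that holds from a to b."""
    # Disjoint or touching intervals.
    if a.end < b.start:
        return "before"
    if b.end < a.start:
        return "after"
    if a.end == b.start:
        return "meets"
    if b.end == a.start:
        return "met-by"
    # From here on the intervals share more than a single endpoint.
    if a.start == b.start and a.end == b.end:
        return "equals"
    if a.start == b.start:
        return "starts" if a.end < b.end else "started-by"
    if a.end == b.end:
        return "finishes" if a.start > b.start else "finished-by"
    if b.start < a.start and a.end < b.end:
        return "during"
    if a.start < b.start and b.end < a.end:
        return "contains"
    # Remaining case: partial overlap with distinct starts and ends.
    return "overlaps" if a.start < b.start else "overlapped-by"


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    gene = Feature("gene", 100, 900)
    exon = Feature("exon", 100, 300)
    intron = Feature("intron", 300, 500)
    print(allen_relation(exon, gene))    # starts
    print(allen_relation(exon, intron))  # meets
    print(allen_relation(intron, gene))  # during
```

Because exactly one of the 13 relations holds for any pair of proper intervals, a rule engine can derive statements such as "feature X lies within feature Y" from coordinates alone, which is the kind of positional inference the abstract describes.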

Notes

  1. https://www.imgt.org/.

  2. A feature is a region in a sequence (a succession of nucleotides or amino acids) with coordinates (start and end values) and a label.

  3. https://www.imgt.org/imgt-kg/, gives access to the entire IMGT® database.

  4. Successions of amino acids.

  5. An insertion, a deletion, or a substitution of a nucleotide.

  6. https://www.imgt.org/ligmdb/.

  7. http://www.imgt.org/genedb/.

  8. https://www.imgt.org/3Dstructure-DB/.

  9. https://www.imgt.org/mAb-DB/.

  10. RDF, RDFS, and OWL.

  11. Uniform Resource Identifier.

  12. https://jena.apache.org/.

  13. http://ontofox.hegroup.org.

  14. https://github.com/stardog-union/pellet.

  15. http://www.sequenceontology.org/.

  16. https://obofoundry.org/.

  17. https://github.com/monarch-initiative/GENO-ontology.

  18. https://ncit.nci.nih.gov/ncitbrowser/.

  19. https://www.ncbi.nlm.nih.gov/taxonomy.

  20. Java Persistence Query Language.

  21. A nucleotide sequence consists of many features, each with a position and an IMGT label.

  22. https://jena.apache.org/documentation/tdb/.

  23. https://jena.apache.org/documentation/fuseki2/index.html.

  24. We plan to communicate our results and resources to the biological community.

  25. https://www.w3.org/TR/void/.

  26. https://yasgui.triply.cc/.

  27. http://lambdamusic.github.io/Ontospy/.

  28. https://lod.proconsortium.org/.

  29. http://geneontology.org/.

Author information

Corresponding author

Correspondence to Gaoussou Sanou.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Sanou, G., Giudicelli, V., Abdollahi, N., Kossida, S., Todorov, K., Duroux, P. (2022). IMGT-KG: A Knowledge Graph for Immunogenetics. In: Sattler, U., et al. The Semantic Web – ISWC 2022. ISWC 2022. Lecture Notes in Computer Science, vol 13489. Springer, Cham. https://doi.org/10.1007/978-3-031-19433-7_36

  • DOI: https://doi.org/10.1007/978-3-031-19433-7_36

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-19432-0

  • Online ISBN: 978-3-031-19433-7

  • eBook Packages: Computer Science, Computer Science (R0)
