Informatics Tools to Advance the Biology of Glycosaminoglycans and Proteoglycans

  • Lewis J. Frey
Part of the Methods in Molecular Biology book series (MIMB, volume 1229)


Glycomics researchers have identified the need for integrated database systems that collect glycomics information in a consistent format. The goal is to create a resource for knowledge discovery and dissemination to wider research communities, with the potential to extend the glycomics community to include biologists, clinicians, chemists, and computer scientists. This chapter discusses the technology and approach needed to create integrated data resources that empower the broader community to leverage extant glycomics data. The focus is on glycosaminoglycan (GAG) and proteoglycan research, but the approach can be generalized. The methods described span the development of glycomics standards from CarbBank to Glyco Connection Tables. Integrated data sets provide a foundation for novel methods of analysis, such as machine learning for knowledge discovery. The implications of predictive analysis are examined in relation to disease biomarkers, with the aim of expanding the target audience of GAG and proteoglycan research.

Key words

Glycosaminoglycan, Proteoglycan, Data integration, Machine learning, Data representation, Informatics



Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  1. Biomedical Informatics Center, Medical University of South Carolina, Charleston, USA
