Abstract—
Carbohydrates are one of the most chemically diverse classes of biomolecules. The volume of accumulated information on carbohydrates has long exceeded what can be navigated without special tools, namely glycomic databases and the predictive services built on top of them. Existing databases, each focused on particular challenges in glycoscience, are not fully compatible with one another in coverage, data formats, or the features offered to users. Major problems of modern glyco-databases include data quality, gaps in coverage, and the absence of a widely accepted carbohydrate notation. The greatest demand is for databases with broad coverage that can provide a universal dataspace on the structures, properties, and functions of carbohydrates, linked to the taxonomy and other features of their natural sources. Within the Carbohydrate Structure Database (CSDB) project, we created a database architecture aimed at developing an extensible glycoinformatic portal with continuous maintenance and regular content updates. This architecture was implemented in software free of the drawbacks typical of glycomic databases. Over its 15 years of existence, CSDB has become the main source of data on glycans of microorganisms and a platform for multiple carbohydrate-related services. The project includes a global-scale database of natural carbohydrates; its key features include free access, annual data deposition and updates, detection and correction of errors (including those in publications), and regular announcement of new services.
ACKNOWLEDGMENTS
The authors are grateful to Yu.A. Knirel for supporting the project at its initial stage and for data verification; to K.S. Egorova for literature work, data verification, and assistance in the design of the glycosyltransferase module; to N.A. Kalinchuk, K.V. Kazantsev, E.A. Belozertseva, E.L. Zdorovenko, E.V. Shikina, and N.S. Smirnova for literature work and data annotation; to A.Yu. Bochkov, I.Yu. Chernyshov, and R.R. Kapaev for the development and programming of structure input modules, generation of 3D structures, and statistical prediction of NMR spectra, respectively; and to the other participants of the project in 2005–2021.
Funding
The development, maintenance, and popularization of CSDB in 2021–2022, including the preparation of this review, were supported by the Russian Science Foundation (project no. 18-14-00098-P).
Ethics declarations
This article does not contain descriptions of any studies involving human participants or animals as subjects.
Conflict of Interests
The authors declare that they have no conflict of interest.
Additional information
Translated by the authors
Abbreviations: API, application programming interface; DB, database; CCSD, Complex Carbohydrate Structure Database; CSDB, Carbohydrate Structure Database; ESKAPE, Enterococcus faecium, Staphylococcus aureus, Klebsiella pneumoniae, Acinetobacter baumannii, Pseudomonas aeruginosa, and Enterobacter spp.; IUPAC, International Union of Pure and Applied Chemistry; NCBI, National Center for Biotechnology Information; PDB, Protein Data Bank; SNFG, Symbol Nomenclature for Glycans.
Corresponding author: phone: +7 (916) 172-47-10.
Cite this article
Toukach, P.V., Shirkovskaya, A.I. Carbohydrate Structure Database and Other Glycan Databases as an Important Element of Glycoinformatics. Russ J Bioorg Chem 48, 457–466 (2022). https://doi.org/10.1134/S1068162022030190
DOI: https://doi.org/10.1134/S1068162022030190