Databases, Knowledgebases, and Software Tools for Virus Informatics

Translational Informatics

Part of the book series: Advances in Experimental Medicine and Biology (AEMB, volume 1368)

Abstract

Viral infection is a common public health problem. In recent decades, serious viral outbreaks have caused great loss of life and economic damage. Because viruses spread rapidly and mutate frequently, precise prevention and treatment are difficult. In the era of big data and artificial intelligence (AI), advances in bioinformatics techniques bring unprecedented opportunities for virus informatics, which contributes to systems-level modeling of virus biology. This chapter introduces data resources, including virus-related databases and knowledgebases. It summarizes bioinformatics models and software tools for multiple sequence alignment, evolutionary analysis, and genome-wide research on viruses. It then discusses translational applications of recently developed data-driven and AI-assisted methods to viral cases such as SARS-CoV-2, HBV/HCV, and influenza virus. Finally, it highlights the concept and significance of virus informatics for both virus surveillance and health promotion.

Acknowledgments

This study was supported by the COVID-19 Research Projects of West China Hospital, Sichuan University (grant no. HX-2019-nCoV-057), the National Natural Science Foundation of China (grant nos. 32070671 and 31900490), the Natural Science Foundation of the Jiangsu Higher Education Institutions of China (grant no. 20KJB180010), and the regional innovation cooperation between Sichuan and Guangxi Provinces (grant no. 2020YFQ0019).

Competing Interests

The authors declare no conflict of interest.

Corresponding author

Correspondence to Bairong Shen.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Lin, Y., Qian, Y., Qi, X., Shen, B. (2022). Databases, Knowledgebases, and Software Tools for Virus Informatics. In: Shen, B. (eds) Translational Informatics. Advances in Experimental Medicine and Biology, vol 1368. Springer, Singapore. https://doi.org/10.1007/978-981-16-8969-7_1
