Computational Genomics

Advances in Bioinformatics

Abstract

Computational methods for analyzing biological data are not new. However, as the cost of DNA sequencing has fallen and the volume of genomic data produced by sequencing platforms has grown, computational approaches have become essential for handling complete genomes and transcriptomes and for extracting more information from them with bioinformatics tools. The challenges range from simple sequence alignments to the assembly of whole genomes, where the high volume of data demands high computational capacity, improved algorithms that make better use of computing resources, or both. This chapter shows how DNA sequences are decoded, how sequences are compared through alignment, what the main approaches to genome assembly are and how assembly quality is evaluated, how genes are predicted, and finally how interaction networks can be built from genomic data processed by the steps presented here.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Araújo, F.A., Cavalcante, A.L.Q., de Barros Braga, M., Kato, R.B., Ramos, R.T.J., Santos, E.F.F.D.L. (2021). Computational Genomics. In: Singh, V., Kumar, A. (eds) Advances in Bioinformatics. Springer, Singapore. https://doi.org/10.1007/978-981-33-6191-1_11

  • DOI: https://doi.org/10.1007/978-981-33-6191-1_11

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-33-6190-4

  • Online ISBN: 978-981-33-6191-1

  • eBook Packages: Computer Science (R0)
