Environmental Metagenomics: The Data Assembly and Data Analysis Perspectives

Review Paper


Novel gene discovery is an emerging field in environmental research. In past decades, research focused mainly on discovering microorganisms capable of degrading a particular compound. Many methods for cultivating and screening such microorganisms are described in the literature, but they work only for microbes that can be cultivated in the laboratory. Microorganisms that live in extreme environments such as hot springs, frozen glaciers and acid mine drainage cannot be cultivated in the laboratory because their growth requirements, such as temperature, nutrients and mutual dependence on one another, are incompletely understood. Cultivable microbes account for less than 1 % of the microbes present on Earth; the remaining uncultivated majority of more than 99 % is inaccessible to culture-based methods. Metagenomics bypasses the need for culture. In metagenomics, DNA is extracted directly from environmental samples such as soil, seawater or acid mine drainage, and a metagenomic library is then constructed and screened. As research continues, a huge amount of metagenomic data is accumulating. Understanding these data is an essential step towards extracting novel genes of industrial importance. Various bioinformatics tools have been designed to analyze and annotate metagenomic data. The bioinformatic requirements of metagenomic data analysis differ in theory and in practice. This paper reviews the tools available for metagenomic data analysis, describing their capabilities and their web availability.


Keywords: Environmental engineering, Enzymes, Metagenomics, DNA, Bioinformatics



The authors are indebted to the staff of Bionivid Technology Pvt. Ltd., Bangalore, India, for their help in preparing the manuscript.



Copyright information

© The Institution of Engineers (India) 2015

Authors and Affiliations

  • Vinay Kumar (1)
  • S. S. Maitra (1)
  • Rohit Nandan Shukla (2)
  1. School of Biotechnology, Jawaharlal Nehru University, New Delhi, India
  2. Bionivid Technology Pvt. Ltd., Bangalore, India
