Abstract
The accelerating pace of microbial genomics is sparking a renaissance in the field of natural products research. Researchers can now get a preview of an organism’s secondary metabolome by analyzing its genomic sequence. Combined with other -omics data, this approach may provide a cost-effective alternative to industrial high-throughput screening in drug discovery. In the last few years, several computational tools have been developed to facilitate this process by identifying genes involved in secondary metabolite biosynthesis in bacterial and fungal genomes. Here, we review seven software programs that are available for this purpose, with an emphasis on antibiotics & Secondary Metabolite Analysis SHell (antiSMASH) and Secondary Metabolite Unknown Regions Finder (SMURF), the only tools that can comprehensively detect complete secondary metabolite biosynthesis gene clusters. We also discuss five related software packages that identify secondary metabolite backbone biosynthesis genes: CLUster SEquence ANalyzer (CLUSEAN), ClustScan, Structure Based Sequence Analysis of Polyketide Synthases (SBSPKS), NRPSPredictor, and Natural Product searcher (NP.searcher). This chapter offers detailed protocols, suggestions, and caveats to help researchers use these tools effectively.
Acknowledgements
We thank Suman Pakala and Bill Nierman at JCVI for critical suggestions and comments. The work of MHM was supported by the Dutch Technology Foundation, which is the applied-science division of The Netherlands Organisation for Scientific Research and the Technology Programme of the Ministry of Economic Affairs under STW grant number 10463. This project has been funded in part with federal funds to NDF from the National Institute of Allergy and Infectious Diseases, National Institutes of Health, Department of Health and Human Services, under contract numbers N01-AI-30071 and HHSN272200900007C.
Copyright information
© 2012 Springer Science+Business Media, LLC
About this protocol
Cite this protocol
Fedorova, N.D., Moktali, V., Medema, M.H. (2012). Bioinformatics Approaches and Software for Detection of Secondary Metabolic Gene Clusters. In: Keller, N., Turner, G. (eds) Fungal Secondary Metabolism. Methods in Molecular Biology, vol 944. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-62703-122-6_2
DOI: https://doi.org/10.1007/978-1-62703-122-6_2
Publisher Name: Humana Press, Totowa, NJ
Print ISBN: 978-1-62703-121-9
Online ISBN: 978-1-62703-122-6
eBook Packages: Springer Protocols