Bioinformatics Approaches and Software for Detection of Secondary Metabolic Gene Clusters

Fedorova, Natalie D.; Moktali, Venkatesh; Medema, Marnix H.

doi:10.1007/978-1-62703-122-6_2

Part of the book series: Methods in Molecular Biology (MIMB, volume 944)

Abstract

The accelerating pace of microbial genomics is sparking a renaissance in the field of natural products research. Researchers can now get a preview of the organism’s secondary metabolome by analyzing its genomic sequence. Combined with other -omics data, this approach may provide a cost-effective alternative to industrial high-throughput screening in drug discovery. In the last few years, several computational tools have been developed to facilitate this process by identifying genes involved in secondary metabolite biosynthesis in bacterial and fungal genomes. Here, we review seven software programs that are available for this purpose, with an emphasis on antibiotics & Secondary Metabolite Analysis SHell (antiSMASH) and Secondary Metabolite Unknown Regions Finder (SMURF), the only tools that can comprehensively detect complete secondary metabolite biosynthesis gene clusters. We also discuss five related software packages—CLUster SEquence ANalyzer (CLUSEAN), ClustScan, Structure Based Sequence Analysis of Polyketide Synthases (SBSPKS), NRPSPredictor, and Natural Product searcher (NP.searcher)—that identify secondary metabolite backbone biosynthesis genes. This chapter offers detailed protocols, suggestions, and caveats to assist researchers in using these tools most effectively.

Acknowledgements

We thank Suman Pakala and Bill Nierman at JCVI for critical suggestions and comments. The work of MHM was supported by the Dutch Technology Foundation, which is the applied-science division of The Netherlands Organisation for Scientific Research and the Technology Programme of the Ministry of Economic Affairs under STW grant number 10463. This project has been funded in part with federal funds to NDF from the National Institute of Allergy and Infectious Diseases, National Institutes of Health, Department of Health and Human Services, under contract numbers N01-AI-30071 and HHSN272200900007C.

Author information

Authors and Affiliations

The J. Craig Venter Institute, Rockville, MD, USA
Natalie D. Fedorova & Venkatesh Moktali
Groningen Bioinformatics Centre and Department of Microbial Physiology, Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Groningen, The Netherlands
Marnix H. Medema

Corresponding author

Correspondence to Natalie D. Fedorova.

Editor information

Editors and Affiliations

Dept. of Medical Microbiology & Immunology, University of Wisconsin-Madison, Linden Drive 1550, Madison, 53706, Wisconsin, USA
Nancy P. Keller
Dept. of Molecular Biology & Biotechnology, University of Sheffield, Firth Court, Sheffield, S10 2TN, United Kingdom
Geoffrey Turner

About this protocol

Cite this protocol

Fedorova, N.D., Moktali, V., Medema, M.H. (2012). Bioinformatics Approaches and Software for Detection of Secondary Metabolic Gene Clusters. In: Keller, N., Turner, G. (eds) Fungal Secondary Metabolism. Methods in Molecular Biology, vol 944. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-62703-122-6_2

DOI: https://doi.org/10.1007/978-1-62703-122-6_2
Published: 08 September 2012
Publisher Name: Humana Press, Totowa, NJ
Print ISBN: 978-1-62703-121-9
Online ISBN: 978-1-62703-122-6
eBook Packages: Springer Protocols
