Bioinformatics Approaches and Software for Detection of Secondary Metabolic Gene Clusters

Fungal Secondary Metabolism

Part of the book series: Methods in Molecular Biology ((MIMB,volume 944))

Abstract

The accelerating pace of microbial genomics is sparking a renaissance in natural products research. Researchers can now get a preview of an organism’s secondary metabolome by analyzing its genomic sequence. Combined with other -omics data, this approach may provide a cost-effective alternative to industrial high-throughput screening in drug discovery. In the last few years, several computational tools have been developed to facilitate this process by identifying genes involved in secondary metabolite biosynthesis in bacterial and fungal genomes. Here, we review seven software programs available for this purpose, with an emphasis on antibiotics & Secondary Metabolite Analysis SHell (antiSMASH) and Secondary Metabolite Unknown Regions Finder (SMURF), the only tools that can comprehensively detect complete secondary metabolite biosynthesis gene clusters. We also discuss five related software packages that identify secondary metabolite backbone biosynthesis genes: CLUster SEquence ANalyzer (CLUSEAN), ClustScan, Structure Based Sequence Analysis of Polyketide Synthases (SBSPKS), NRPSPredictor, and Natural Product searcher (NP.searcher). This chapter offers detailed protocols, suggestions, and caveats to help researchers use these tools effectively.

Acknowledgements

We thank Suman Pakala and Bill Nierman at JCVI for critical suggestions and comments. The work of MHM was supported by the Dutch Technology Foundation, which is the applied-science division of The Netherlands Organisation for Scientific Research and the Technology Programme of the Ministry of Economic Affairs under STW grant number 10463. This project has been funded in part with federal funds to NDF from the National Institute of Allergy and Infectious Diseases, National Institutes of Health, Department of Health and Human Services, under contract numbers N01-AI-30071 and HHSN272200900007C.

Author information

Corresponding author

Correspondence to Natalie D. Fedorova.

Copyright information

© 2012 Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Fedorova, N.D., Moktali, V., Medema, M.H. (2012). Bioinformatics Approaches and Software for Detection of Secondary Metabolic Gene Clusters. In: Keller, N., Turner, G. (eds) Fungal Secondary Metabolism. Methods in Molecular Biology, vol 944. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-62703-122-6_2

  • DOI: https://doi.org/10.1007/978-1-62703-122-6_2

  • Publisher Name: Humana Press, Totowa, NJ

  • Print ISBN: 978-1-62703-121-9

  • Online ISBN: 978-1-62703-122-6

  • eBook Packages: Springer Protocols
