Genomic Identification and Analysis of Specialized Metabolite Biosynthetic Gene Clusters in Plants Using PlantiSMASH

  • Satria A. Kautsar
  • Hernando G. Suarez Duran
  • Marnix H. Medema (email author)
Part of the Methods in Molecular Biology book series (MIMB, volume 1795)


Plants produce a vast diversity of specialized metabolites, which play important roles in the interactions with their microbiome, as well as with animals and other plants. Many such molecules have valuable biological activities that render them (potentially) useful as medicines, flavors and fragrances, nutritional ingredients, or cosmetics. Recently, plant scientists have discovered that the genes for many biosynthetic pathways for the production of such specialized metabolites are physically clustered on the chromosome within biosynthetic gene clusters (BGCs). The Plant Secondary Metabolite Analysis Shell (plantiSMASH) allows for the automated identification of such plant BGCs, facilitates comparison of BGCs across genomes, and helps users to predict the functional interactions of pairs of genes within and between BGCs based on coexpression analysis. In this chapter, we provide a detailed protocol on how to install and run plantiSMASH, and how to interpret its results to draw biological conclusions that are supported by the data.

Key words

Specialized metabolite, Secondary metabolite, Biosynthetic gene cluster, Biosynthetic pathway, Plant, Genomic, Bioinformatics



This work was supported by a VENI grant [863.15.002 to M.H.M.] from The Netherlands Organization for Scientific Research (NWO) and by the Graduate School for Experimental Plant Sciences (EPS).



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Satria A. Kautsar (1)
  • Hernando G. Suarez Duran (1)
  • Marnix H. Medema (1, email author)

  1. Bioinformatics Group, Wageningen University, Wageningen, The Netherlands
