
Predicting Monoterpene Indole Alkaloid-Related Genes from Expression Data with Artificial Neural Networks

Catharanthus roseus

Abstract

Elucidating the biosynthetic pathways that lead to specialized metabolites remains a complex task, yet it is a prerequisite for bioproduction in heterologous hosts. Many steps have already been identified using conventional approaches, enlarging the space of known possible chemical steps. In recent years, the identification of missing steps has been fueled by the generation of genomic and transcriptomic data for nonmodel species. Analysis of gene expression profiles has revealed that, in many cases, genes encoding enzymes of the same biosynthetic pathway are coexpressed across tissue types and environmental conditions. Hence, coexpression studies, whether based on differential gene expression, gene coexpression networks, or unsupervised clustering, have helped decipher missing steps and complete our knowledge of biosynthetic pathways. Already identified biosynthetic steps can be used as baits to capture the remaining unknown ones. The present protocol shows how supervised machine learning, in the form of artificial neural networks (ANNs), can efficiently classify genes as related or unrelated to specialized metabolism according to their expression levels. Using Catharanthus roseus as an example, we show that an ANN trained on a minimal set of bait genes yields many true positives (correctly predicted genes) while keeping the number of false positives low; these false positives may include candidate genes.
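The classification idea summarized in the abstract can be illustrated with a minimal sketch. It is not the protocol's actual network or data: the expression matrix is synthetic, and the profile template, the layer size, the learning rate, and all variable names are assumptions chosen for illustration only. A small one-hidden-layer network is trained on labeled "bait-like" and background genes, then evaluated on held-out genes by counting true and false positives.

```python
import math
import random

random.seed(0)

# Hypothetical setup: 6 samples (tissues/conditions), centered log-expression.
N_SAMPLES = 6
TEMPLATE = [1.5, 1.5, 1.5, -1.5, -1.5, -1.5]  # made-up pathway profile


def make_gene(is_pathway):
    """Synthetic profile: pathway genes follow the template, others are noise."""
    if is_pathway:
        return [t + random.gauss(0, 0.3) for t in TEMPLATE]
    return [random.gauss(0, 1.0) for _ in range(N_SAMPLES)]


# Label 1 = pathway-related (bait-like), label 0 = background gene.
genes = [(make_gene(True), 1) for _ in range(60)]
genes += [(make_gene(False), 0) for _ in range(140)]
random.shuffle(genes)
train, test = genes[:150], genes[150:]

# One hidden layer (tanh) and a sigmoid output unit.
N_HIDDEN = 4
w1 = [[random.uniform(-0.5, 0.5) for _ in range(N_SAMPLES)]
      for _ in range(N_HIDDEN)]
b1 = [0.0] * N_HIDDEN
w2 = [random.uniform(-0.5, 0.5) for _ in range(N_HIDDEN)]
b2 = 0.0


def forward(x):
    """Return hidden activations and the predicted probability for one gene."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    z = sum(w * h for w, h in zip(w2, hidden)) + b2
    return hidden, 1.0 / (1.0 + math.exp(-z))


# Stochastic gradient descent on the binary cross-entropy loss.
LEARNING_RATE = 0.05
for epoch in range(200):
    random.shuffle(train)
    for x, y in train:
        hidden, p = forward(x)
        delta_out = p - y  # gradient of the loss w.r.t. the output pre-activation
        for j in range(N_HIDDEN):
            # Backpropagate through tanh before updating w2[j].
            delta_h = delta_out * w2[j] * (1.0 - hidden[j] ** 2)
            w2[j] -= LEARNING_RATE * delta_out * hidden[j]
            for i in range(N_SAMPLES):
                w1[j][i] -= LEARNING_RATE * delta_h * x[i]
            b1[j] -= LEARNING_RATE * delta_h
        b2 -= LEARNING_RATE * delta_out


def predict(x):
    """Classify a gene as pathway-related (1) or not (0)."""
    return 1 if forward(x)[1] >= 0.5 else 0


# Confusion counts on held-out genes.
tp = sum(1 for x, y in test if y == 1 and predict(x) == 1)
fp = sum(1 for x, y in test if y == 0 and predict(x) == 1)
fn = sum(1 for x, y in test if y == 1 and predict(x) == 0)
tn = sum(1 for x, y in test if y == 0 and predict(x) == 0)
accuracy = (tp + tn) / len(test)
print(f"TP={tp} FP={fp} FN={fn} TN={tn} accuracy={accuracy:.2f}")
```

With real data, each row would be the expression profile of one gene across the available samples, and genes predicted as positive without being known baits would be the ones worth inspecting as candidates.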



Author information

Corresponding author

Correspondence to Thomas Dugé de Bernonville.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Dugé de Bernonville, T., Amor Stander, E., Dugé de Bernonville, G., Besseau, S., Courdavault, V. (2022). Predicting Monoterpene Indole Alkaloid-Related Genes from Expression Data with Artificial Neural Networks. In: Courdavault, V., Besseau, S. (eds) Catharanthus roseus. Methods in Molecular Biology, vol 2505. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-2349-7_10

  • DOI: https://doi.org/10.1007/978-1-0716-2349-7_10

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-2348-0

  • Online ISBN: 978-1-0716-2349-7

  • eBook Packages: Springer Protocols
