Abstract
Identifying essential proteins is vital to understanding the minimum requirements for maintaining life. Correctly identifying essential proteins helps guide the diagnosis of diseases and reveal new drug targets. Continuous advances in experimental methods generate and accumulate gene essentiality data, which in turn supports computational methods. Machine learning methods for predicting essential proteins have gained much traction in recent years, with artificial neural networks prominent among them. Most of these methods use sequence-based and network-based features. In this chapter, we first present background on artificial neural networks, covering different neural network architectures. We then review the research papers that used artificial neural networks as the machine learning method for predicting essential proteins.
Copyright information
© 2020 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Belloze, K., Campos, L., Matias, R., Luques, I., Bezerra, E. (2020). A Review of Artificial Neural Networks for the Prediction of Essential Proteins. In: da Silva, F.A.B., Carels, N., Trindade dos Santos, M., Lopes, F.J.P. (eds) Networks in Systems Biology. Computational Biology, vol 32. Springer, Cham. https://doi.org/10.1007/978-3-030-51862-2_4
DOI: https://doi.org/10.1007/978-3-030-51862-2_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-51861-5
Online ISBN: 978-3-030-51862-2
eBook Packages: Computer Science, Computer Science (R0)