The Algorithms of Predicting Bacterial Essential Genes and NcRNAs by Machine Learning

  • Conference paper
Proceedings of the 11th International Conference on Computer Engineering and Networks

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 808)

Abstract

Essential genes are indispensable for an organism's survival, so identifying and studying them is of great significance. We use a machine learning method, K-Nearest Neighbor, to develop predictors of essential bacterial genes. Homology features of bacterial genomes, covering both sequence homology and functional homology, are extracted for determining essential genes. Based on these features, we use the K-Nearest Neighbor algorithm to determine gene function. We tune the minimum matching parameter (K) of the essential gene prediction model to build an optimal Escherichia coli-specific model. The corresponding optimal K is then carried over to the prediction models for essential genes of other bacteria. After cross-validation, the highest accuracy is 0.89, obtained when K is between 5 and 7. The features we extracted can therefore increase the accuracy of bacterial essential gene prediction. On this basis, we found that the prediction accuracy of the K-Nearest Neighbor model did not differ significantly across different evolutionary distances between the organisms in the database and the investigated species. This means the machine learning model can be extended to more distant species, for which it will have better predictive performance than the usual sequence-based methods.
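The tuning step described in the abstract (choosing K by cross-validation) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the feature encoding, the distance metric, or the validation protocol, so the two numeric features, the Euclidean distance, the five interleaved folds, and the synthetic toy data below are all assumptions made for illustration, and every function name is hypothetical.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, query, k):
    """Majority vote among the k training points closest to `query`.

    Squared Euclidean distance is an assumption; the paper's metric is not stated.
    """
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), label)
        for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def cross_val_accuracy(X, y, k, folds=5):
    """Fraction of samples predicted correctly when held out.

    Fold f holds out samples f, f + folds, f + 2 * folds, ... (interleaved folds).
    """
    n = len(X)
    correct = 0
    for f in range(folds):
        test_idx = set(range(f, n, folds))
        train_X = [X[i] for i in range(n) if i not in test_idx]
        train_y = [y[i] for i in range(n) if i not in test_idx]
        for i in test_idx:
            if knn_predict(train_X, train_y, X[i], k) == y[i]:
                correct += 1
    return correct / n


def select_k(X, y, candidates=(1, 3, 5, 7, 9)):
    """Return the candidate K with the highest cross-validated accuracy."""
    scores = {k: cross_val_accuracy(X, y, k) for k in candidates}
    best_k = max(scores, key=scores.get)
    return best_k, scores


def make_toy_genes(n_per_class=60, seed=0):
    """Synthetic stand-in data, NOT real genes.

    Each 'gene' has two hypothetical scores (sequence homology, functional
    homology); label 1 means essential, label 0 means non-essential.
    """
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in ((1, 0.7), (0, 0.3)):
        for _ in range(n_per_class):
            X.append((rng.gauss(centre, 0.15), rng.gauss(centre, 0.15)))
            y.append(label)
    order = list(range(len(X)))
    rng.shuffle(order)
    return [X[i] for i in order], [y[i] for i in order]


if __name__ == "__main__":
    X, y = make_toy_genes()
    best_k, scores = select_k(X, y)
    for k, acc in scores.items():
        print(f"K={k}: accuracy={acc:.2f}")
    print("selected K:", best_k)
```

Odd candidate values of K avoid tied votes in a two-class problem. The accuracies printed for the toy data say nothing about the 0.89 reported in the abstract; they only show the shape of the selection loop.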

Acknowledgement

This study was jointly funded by the National Natural Science Foundation of China (61803112), the Science and Technology Foundation of Guizhou Province (2018–1133, 2019–2811), the Science and Technology Foundation of Guiyang (2017–30-15), the Science and Technology Fund project of Guizhou Health Commission (gzwjkj2019–1-40), and the Cell and Gene Engineering Innovative Research Groups of Guizhou Province (KY-2016–031).

Author information

Corresponding author

Correspondence to Yuannong Ye.

Ethics declarations

Conflicts of Interest

The authors declare that they have no conflicts of interest to report regarding the present study.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Ye, Y., Liang, D., Zeng, Z. (2022). The Algorithms of Predicting Bacterial Essential Genes and NcRNAs by Machine Learning. In: Liu, Q., Liu, X., Chen, B., Zhang, Y., Peng, J. (eds) Proceedings of the 11th International Conference on Computer Engineering and Networks. Lecture Notes in Electrical Engineering, vol 808. Springer, Singapore. https://doi.org/10.1007/978-981-16-6554-7_54
