Abstract
Identifying which genes are involved in particular biological processes is important for understanding the structure and function of a genome. A number of techniques have been proposed to annotate genes, i.e., to identify previously unknown associations between biological processes and genes. The ultimate goal of these techniques is to narrow the search for promising candidates for further study through in-vivo experiments. This paper presents an approach for the in-silico prediction of functional gene annotations. It uses the existing body of knowledge on gene annotations for a given genome, together with the topological properties of its gene co-expression network, to train a supervised machine learning model designed to discover unknown annotations. The approach is applied to Oryza sativa Japonica (a variety of rice). Our results show that the topological properties help to obtain more precise predictions of gene annotations.
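As an illustration of what "topological properties of a gene co-expression network" can mean in practice, the following minimal sketch computes a few per-gene features from a toy network. It is not the authors' pipeline: the gene names, edges, and labels are hypothetical, and the choice of features (degree, local clustering coefficient, fraction of annotated neighbors) is an assumption, since the abstract does not list the features or the model used. Feature rows of this kind could then serve as input to a supervised classifier.

```python
from itertools import combinations

# Hypothetical toy co-expression network: an edge links two genes whose
# expression profiles are strongly correlated. Gene names are made up.
edges = [("g1", "g2"), ("g1", "g3"), ("g2", "g3"), ("g3", "g4"), ("g4", "g5")]

# Hypothetical known annotations for one biological process (1 = annotated).
known = {"g1": 1, "g2": 1, "g3": 0, "g4": 0, "g5": 0}


def build_adjacency(edge_list):
    """Return an undirected adjacency map: gene -> set of neighbor genes."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def clustering(adj, node):
    """Local clustering coefficient: share of neighbor pairs that are linked."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


def features(adj, labels):
    """Build one feature row per gene (assumed features, for illustration)."""
    rows = {}
    for gene, nbrs in adj.items():
        annotated = sum(labels.get(n, 0) for n in nbrs)
        rows[gene] = {
            "degree": len(nbrs),
            "clustering": clustering(adj, gene),
            "annotated_neighbor_fraction": annotated / len(nbrs),
        }
    return rows


adj = build_adjacency(edges)
table = features(adj, known)
for gene in sorted(table):
    print(gene, table[gene])
```

In this sketch, a gene such as g3 that sits among annotated neighbors receives a high annotated-neighbor fraction, which is the kind of structural signal a classifier might use alongside the existing annotations.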
Acknowledgments
The authors would like to thank the anonymous referees for their helpful comments. This work was funded by the OMICAS program: Optimización Multiescala In-silico de Cultivos Agrícolas Sostenibles (Infraestructura y Validación en Arroz y Caña de Azúcar), sponsored within the Colombian Scientific Ecosystem by The World Bank, Colciencias, Icetex, the Colombian Ministry of Education and the Colombian Ministry of Industry and Tourism under Grant FP44842-217-2018.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Romero, M., Finke, J., Quimbaya, M., Rocha, C. (2020). In-silico Gene Annotation Prediction Using the Co-expression Network Structure. In: Cherifi, H., Gaito, S., Mendes, J., Moro, E., Rocha, L. (eds) Complex Networks and Their Applications VIII. COMPLEX NETWORKS 2019. Studies in Computational Intelligence, vol 882. Springer, Cham. https://doi.org/10.1007/978-3-030-36683-4_64
DOI: https://doi.org/10.1007/978-3-030-36683-4_64
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-36682-7
Online ISBN: 978-3-030-36683-4
eBook Packages: Intelligent Technologies and Robotics (R0)