Abstract
RNA pseudouridine modification occurs in many RNA types across many species and plays a significant role in regulating biological processes. Understanding the functional mechanisms of RNA pseudouridine sites requires accurate identification of those sites in RNA sequences. Although several fast and inexpensive computational methods have been proposed, improving recognition accuracy and generalization remains a challenge. This study proposes a novel ensemble predictor, PsoStack-PseU, for improved prediction of RNA pseudouridine sites. After the nucleotide composition preferences of RNA pseudouridine site sequences were analyzed, two feature representations were selected and fed into a stacking ensemble framework. Five tree-based machine learning classifiers serve as base classifiers to construct 30-dimensional RNA profiles that represent the RNA sequences, and the particle swarm optimization (PSO) algorithm searches for weights on these profiles to further enhance the representation. A logistic regression classifier serves as the meta-classifier that produces the final predictions. Compared with state-of-the-art predictors, PsoStack-PseU performs better on both cross-validation and the independent test. Based on the PsoStack-PseU predictor, a free and easy-to-operate web server has been established, which is expected to be a useful tool for pseudouridine site identification.
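The PSO weight search mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy profile has 5 columns rather than 30, the data are synthetic, the fitness is a simple thresholded weighted average standing in for the logistic regression meta-classifier, and every name (`pso_minimize`, `make_toy_profiles`, `error_rate`) and hyperparameter is an assumption chosen for illustration.

```python
import random


def pso_minimize(objective, dim, n_particles=20, n_iter=50, bounds=(0.0, 1.0),
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO over a box; returns (best_position, best_value)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # best position seen by each particle
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position seen by the swarm
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity = inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move, then clip the coordinate back into the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def make_toy_profiles(n=200, seed=1):
    """Synthetic stand-in for base-classifier probabilities (assumption, not real data)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 0.7 if label else 0.3
        # Three informative columns followed by two pure-noise columns.
        row = [min(1.0, max(0.0, rng.gauss(center, 0.2))) for _ in range(3)]
        row += [rng.random() for _ in range(2)]
        X.append(row)
        y.append(label)
    return X, y


def error_rate(weights, X, y):
    """Fraction of samples misclassified by a weighted average thresholded at 0.5."""
    total = sum(weights) or 1e-12               # guard against an all-zero weight vector
    wrong = 0
    for row, label in zip(X, y):
        score = sum(w * p for w, p in zip(weights, row)) / total
        wrong += int((score >= 0.5) != bool(label))
    return wrong / len(y)


X, y = make_toy_profiles()
weights, best_err = pso_minimize(lambda w: error_rate(w, X, y), dim=5)
print("weights:", [round(w, 3) for w in weights])
print("training error:", best_err)
```

In a fuller pipeline, the profile columns would presumably come from out-of-fold predictions of the base classifiers, and the fitness would be a cross-validated score of the meta-classifier on the weighted profile, so that the weights are not tuned on the same predictions they are evaluated on.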
Funding
This work was supported in part by funds from the Key Research Project of Colleges and Universities of Henan Province (No. 22A520013, No. 23B520004), the Key Science and Technology Development Program of Henan Province (No. 232102210020, No. 202102210144), and the Training Program of Young Backbone Teachers in Colleges and Universities of Henan Province (No. 2019GGJS132).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Wang, X., Li, P., Han, L., Wang, R. (2023). A Stacking-Based Ensemble Learning Predictor Combined with Particle Swarm Optimizer for Identifying RNA Pseudouridine Sites. In: Huang, DS., Premaratne, P., Jin, B., Qu, B., Jo, KH., Hussain, A. (eds) Advanced Intelligent Computing Technology and Applications. ICIC 2023. Lecture Notes in Computer Science, vol 14088. Springer, Singapore. https://doi.org/10.1007/978-981-99-4749-2_44
DOI: https://doi.org/10.1007/978-981-99-4749-2_44
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-4748-5
Online ISBN: 978-981-99-4749-2
eBook Packages: Computer Science, Computer Science (R0)