Abstract
As data on protein-nucleic acid interactions accumulate, several machine learning-based methods have been developed to predict these interactions. However, most of these methods are classification models that either find binding sites within a sequence or determine whether a pair of sequences interacts. In this paper, we propose a generative model that uses a long short-term memory (LSTM) neural network to construct nucleic acid sequences that bind to a target protein. Nucleic acid sequences generated by the model showed high affinity for several target proteins. The generative model will be useful for constructing an initial library of nucleic acid sequences for in vitro selection of sequences that bind to a target protein with high affinity and specificity.
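To illustrate what generating sequences with an LSTM involves, the following is a minimal sketch of autoregressive sampling: the network emits a probability distribution over nucleotides, one nucleotide is drawn from it, and that choice is fed back as the next input. It is not the model described in this paper. The weights are random and untrained, and the hidden size, start token, sequence length, and all names (`TinyLSTM`, `generate`, `library`) are assumptions made for illustration. A usable model would first be trained on sequences known to bind the target protein.

```python
import math
import random

ALPHABET = "ACGT"          # DNA alphabet; an RNA model would use "ACGU"
START = len(ALPHABET)      # index of a hypothetical start-of-sequence token
VOCAB = len(ALPHABET) + 1  # four nucleotides plus the start token
HIDDEN = 8                 # hypothetical hidden size, chosen only for illustration


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]


class TinyLSTM:
    """Single LSTM cell with untrained (random) weights and a softmax output layer."""

    def __init__(self, rng):
        # One weight matrix per gate, applied to [one-hot input ; previous hidden state].
        self.gates = {g: rand_matrix(HIDDEN, VOCAB + HIDDEN, rng) for g in "ifog"}
        self.out = rand_matrix(len(ALPHABET), HIDDEN, rng)

    def step(self, token, h, c):
        x = [1.0 if i == token else 0.0 for i in range(VOCAB)] + h
        i = [sigmoid(z) for z in matvec(self.gates["i"], x)]    # input gate
        f = [sigmoid(z) for z in matvec(self.gates["f"], x)]    # forget gate
        o = [sigmoid(z) for z in matvec(self.gates["o"], x)]    # output gate
        g = [math.tanh(z) for z in matvec(self.gates["g"], x)]  # candidate cell state
        c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        # Softmax over the four nucleotides (the start token is never emitted).
        logits = matvec(self.out, h)
        m = max(logits)
        exps = [math.exp(z - m) for z in logits]
        total = sum(exps)
        return [e / total for e in exps], h, c


def generate(model, length, rng):
    """Sample one sequence nucleotide by nucleotide, feeding each choice back in."""
    h, c = [0.0] * HIDDEN, [0.0] * HIDDEN
    token, seq = START, []
    for _ in range(length):
        probs, h, c = model.step(token, h, c)
        token = rng.choices(range(len(ALPHABET)), weights=probs)[0]
        seq.append(ALPHABET[token])
    return "".join(seq)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
model = TinyLSTM(rng)
library = [generate(model, 20, rng) for _ in range(5)]
```

Here `library` plays the role of a small candidate pool. With untrained weights the sequences are essentially random, so the sketch shows only the sampling loop, not binding affinity.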
Acknowledgments
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Ministry of Science and ICT (2015R1A1A3A04001243, 2017R1E1A1A03069921) and the Ministry of Education (2016R1A6A3A11931497).
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Im, J., Park, B., Han, K. (2018). Finding Protein-Binding Nucleic Acid Sequences Using a Long Short-Term Memory Neural Network. In: Huang, DS., Jo, KH., Zhang, XL. (eds) Intelligent Computing Theories and Application. ICIC 2018. Lecture Notes in Computer Science, vol 10955. Springer, Cham. https://doi.org/10.1007/978-3-319-95933-7_91
DOI: https://doi.org/10.1007/978-3-319-95933-7_91
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-95932-0
Online ISBN: 978-3-319-95933-7
eBook Packages: Computer Science, Computer Science (R0)