Abstract
Aptamers can be regarded as efficient substitutes for monoclonal antibodies in many diagnostic and therapeutic applications. Because SELEX (systematic evolution of ligands by exponential enrichment) is tedious and often prohibitive, in silico methods have been developed to speed up the enrichment process. However, most of these methods make no attempt to design novel aptamers. Moreover, some target proteins may have no RNA binding candidates in nature, so a reduction mechanism is needed to narrow the enormous number of possible nucleotide combinations to a pool of novel aptamers that can be examined in vitro. We applied a genetic algorithm (GA) with an embedded binding-predictor fitness function to the in silico design of RNA aptamers. As a case study, all steps were carried out to generate an aptamer pool against the aminopeptidase N (CD13) biomarker. First, the prediction model was developed based on sequential and structural features of known RNA–protein complexes. Then, using the RNA sequences from complexes with positive prediction results as the first generation, novel aptamers were designed and the top-ranked sequences were selected. The aptamer with the highest fitness value was a 76-mer whose score was 3 to 6 times higher than that of the parent oligonucleotides. The reliability of the obtained sequences was confirmed by docking and molecular dynamics simulation. The proposed method simplifies the design of oligonucleotide aptamers and can serve as a basis for designing novel aptamers against a wide range of biomarkers.
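The design loop described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the GA operators or parameters, so single-point crossover, point mutation, elitist selection, and all numeric settings below are assumptions. The fitness function `toy_fitness` is a hypothetical placeholder (GC fraction) standing in for the trained binding predictor, which is not reproduced here.

```python
import random

NUCLEOTIDES = "ACGU"


def toy_fitness(seq: str) -> float:
    """Hypothetical placeholder score in [0, 1] (GC fraction).

    In the described method this slot is filled by the trained
    RNA-protein binding predictor; GC fraction is used only so the
    sketch runs on its own.
    """
    return sum(base in "GC" for base in seq) / len(seq)


def crossover(a: str, b: str, rng: random.Random) -> str:
    """Single-point crossover of two equal-length parents (assumed operator)."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def mutate(seq: str, rate: float, rng: random.Random) -> str:
    """Redraw each base at random with probability `rate` (assumed operator)."""
    return "".join(
        rng.choice(NUCLEOTIDES) if rng.random() < rate else base for base in seq
    )


def evolve(parents, fitness, generations=30, pool_size=40, keep=10, rate=0.02, seed=0):
    """Evolve a pool from at least two equal-length parent sequences.

    The `keep` fittest sequences survive each generation unchanged, so the
    best score in the pool never decreases. Returns the final pool sorted
    from highest to lowest fitness.
    """
    rng = random.Random(seed)
    pool = list(parents)
    for _ in range(generations):
        elite = sorted(pool, key=fitness, reverse=True)[:keep]
        children = []
        while len(elite) + len(children) < pool_size:
            a, b = rng.sample(elite, 2)
            children.append(mutate(crossover(a, b, rng), rate, rng))
        pool = elite + children
    return sorted(pool, key=fitness, reverse=True)


if __name__ == "__main__":
    rng = random.Random(1)
    # Random 76-mers stand in for the first-generation RNA sequences.
    first_generation = [
        "".join(rng.choice(NUCLEOTIDES) for _ in range(76)) for _ in range(8)
    ]
    ranked = evolve(first_generation, toy_fitness)
    print(ranked[0], round(toy_fitness(ranked[0]), 3))
```

Passing the scoring function as an argument keeps the search loop independent of the predictor, so a real binding model could be substituted for `toy_fitness` without changing `evolve`.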
Availability of data and materials
All features and the dataset are available at the following link: http://rpinbase.com
Abbreviations
- GA: Genetic algorithm
- CD13: Aminopeptidase N
- SELEX: Systematic evolution of ligands by exponential enrichment
- MD: Molecular dynamics
- PseAAC: Pseudo-amino acid composition
- SSE: Secondary structure elements
- PS: Parallel beta sheets
- APS: Antiparallel beta sheets
- PDB: The Protein Data Bank
- DSSP: Define secondary structure of proteins
- JPred: The Protein Secondary Structure Prediction Server
- AUC: Area under the curve
- AUPRC: Area under the precision–recall curve
- ACC: Accuracy
- F1-score: Weighted average of precision and recall
- SD: Standard deviation
- FORNA: Force-directed RNA
- SN: Sensitivity
- SP: Specificity
- PSA: Prostate-specific antigen
- HIV-1: Human immunodeficiency virus type 1
- ERα: Estrogen receptor alpha
- HGF: Hepatocyte growth factor
Funding
No funding.
Author information
Contributions
MT-A was involved in conceptualization, implementation, formal analysis, investigation, and writing, editing, and revising the manuscript. SN carried out conceptualization, implementation, formal analysis, and investigation. MT, AN, and HL took part in conceptualization and writing, editing, and revising the manuscript. AM-N performed conceptualization, supervision, project administration, and writing, editing, and revising the manuscript. All authors have read and approved the manuscript.
Ethics declarations
Conflict of interest
The authors have no competing interests.
Supplementary Information
Below is the link to the electronic supplementary material.
11030_2021_10192_MOESM1_ESM.xlsx
Supplementary File 1: Macromolecules features. Excel file containing features of RNA and protein macromolecules (XLSX 100 KB)
About this article
Cite this article
Torkamanian-Afshar, M., Nematzadeh, S., Tabarzad, M. et al. In silico design of novel aptamers utilizing a hybrid method of machine learning and genetic algorithm. Mol Divers 25, 1395–1407 (2021). https://doi.org/10.1007/s11030-021-10192-9
DOI: https://doi.org/10.1007/s11030-021-10192-9