In silico design of novel aptamers utilizing a hybrid method of machine learning and genetic algorithm

  • Original Article
  • Molecular Diversity

Abstract

Aptamers can serve as efficient substitutes for monoclonal antibodies in many diagnostic and therapeutic applications. Because SELEX (systematic evolution of ligands by exponential enrichment) is tedious and prohibitive, in silico methods have been developed to speed up the enrichment process. However, most of these methods make no attempt to design novel aptamers. Moreover, some target proteins may have no RNA binding candidates in nature, so a reductive mechanism is needed to generate novel aptamer pools from the enormous number of possible nucleotide combinations before they are examined in vitro. We applied a genetic algorithm (GA) with an embedded binding-predictor fitness function to the in silico design of RNA aptamers. As a case study, all steps were carried out to generate an aptamer pool against the aminopeptidase N (CD13) biomarker. First, the prediction model was developed based on sequential and structural features of known RNA–protein complexes. Then, using RNA sequences from complexes with positive prediction results as the first generation, novel aptamers were designed and the top-ranked sequences were selected. A 76-mer aptamer was identified with the highest fitness value, scoring 3 to 6 times higher than the parent oligonucleotides. The reliability of the obtained sequences was confirmed by docking and molecular dynamics simulation. The proposed method offers a simplified contribution to the oligonucleotide–aptamer design process and can serve as a basis for designing novel aptamers against a wide range of biomarkers.
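The general idea of a GA whose fitness function is a binding predictor can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the GA parameters (population size, elitism, mutation rate), and the randomly generated parent sequences are all assumptions made for illustration, and `toy_fitness` is a placeholder (GC fraction) standing in for a trained model that would return a predicted binding score for an RNA sequence against the target protein.

```python
import random

ALPHABET = "ACGU"  # RNA nucleotides


def toy_fitness(seq: str) -> float:
    """Placeholder score in [0, 1] (GC fraction).

    Hypothetical stand-in: a real pipeline would call a trained
    RNA-protein binding predictor here instead.
    """
    return sum(base in "GC" for base in seq) / len(seq)


def mutate(seq: str, rate: float, rng: random.Random) -> str:
    """Resample each position from ALPHABET with probability `rate`."""
    return "".join(rng.choice(ALPHABET) if rng.random() < rate else base
                   for base in seq)


def crossover(a: str, b: str, rng: random.Random) -> str:
    """Single-point crossover of two equal-length sequences."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def evolve(parents, fitness, generations=30, pop_size=40,
           elite=4, mutation_rate=0.02, seed=0):
    """Evolve equal-length sequences; return the final population, best first."""
    rng = random.Random(seed)
    population = list(parents)
    # Fill the first generation with mutated copies of the parents.
    while len(population) < pop_size:
        population.append(mutate(rng.choice(parents), mutation_rate, rng))
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        next_gen = ranked[:elite]        # elitism: carry the best over unchanged
        pool = ranked[:pop_size // 2]    # the fitter half may reproduce
        while len(next_gen) < pop_size:
            a, b = rng.sample(pool, 2)
            next_gen.append(mutate(crossover(a, b, rng), mutation_rate, rng))
        population = next_gen
    return sorted(population, key=fitness, reverse=True)


# Illustrative first generation: five random 76-mers (made-up data).
parent_rng = random.Random(1)
parents = ["".join(parent_rng.choice(ALPHABET) for _ in range(76))
           for _ in range(5)]

ranked = evolve(parents, toy_fitness)
print(ranked[0], round(toy_fitness(ranked[0]), 3))
```

Because the top-scoring sequences are copied unchanged into each new generation, the best fitness in the final population can never fall below that of the best parent. Swapping `toy_fitness` for a real predictor leaves the loop unchanged, which is the main appeal of embedding the predictor as the fitness function.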

Availability of data and materials

All the features and the dataset are available at the following link: http://rpinbase.com

Abbreviations

GA: Genetic algorithm
CD13: Aminopeptidase N
SELEX: Systematic evolution of ligands by exponential enrichment
MD: Molecular dynamics
PseAAC: Pseudo-amino acid composition
SSE: Secondary structure elements
PS: Parallel beta sheets
APS: Antiparallel beta sheets
PDB: The Protein Data Bank
DSSP: Define secondary structure of proteins
JPred: The Protein Secondary Structure Prediction Server
AUC: Area under the curve
AUPRC: Area under the precision–recall curve
ACC: Accuracy
F1-score: Weighted average of precision and recall
SD: Standard deviation
FORNA: Force-directed RNA
SN: Sensitivity
SP: Specificity
PSA: Prostate-specific antigen
HIV-1: Human immunodeficiency virus type 1
ERα: Estrogen receptor alpha
HGF: Hepatocyte growth factor
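
For readers unfamiliar with the evaluation abbreviations above (SN, SP, ACC, F1-score), the following short Python sketch shows how these quantities are conventionally derived from a binary confusion matrix. The function name and the example counts are made up for illustration and are not results from the article. The F1-score is computed here as the harmonic mean of precision and recall, which is the usual definition.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)               # SN: true-positive rate (recall)
    specificity = tn / (tn + fp)               # SP: true-negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn) # ACC
    precision = tp / (tp + fp)
    # F1: harmonic mean of precision and recall
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"SN": sensitivity, "SP": specificity, "ACC": accuracy, "F1": f1}


# Hypothetical counts, chosen only to exercise the formulas.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```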

Funding

No funding.

Author information

Contributions

MT-A was involved in conceptualization, implementation, formal analysis, investigation, and writing, editing, and revising the manuscript. SN carried out conceptualization, implementation, formal analysis, and investigation. MT, AN, and HL took part in conceptualization and writing, editing, and revising the manuscript. AM-N performed conceptualization, supervision, project administration, and writing, editing, and revising the manuscript. All authors have read and approved the manuscript.

Corresponding author

Correspondence to Ali Masoudi-Nejad.

Ethics declarations

Conflict of interest

The authors have no competing interests.

Supplementary Information

Below is the link to the electronic supplementary material.

11030_2021_10192_MOESM1_ESM.xlsx

Supplementary File 1: Macromolecules features. Excel file containing features of RNA and protein macromolecules (XLSX 100 KB)

Supplementary File 2: ML.NET results of five datasets. Word file containing results of each dataset (DOCX 913 KB)

Supplementary File 3: Aptamer sequences. Excel file containing RNA sequences against CD13 (XLSX 298 KB)

About this article

Cite this article

Torkamanian-Afshar, M., Nematzadeh, S., Tabarzad, M. et al. In silico design of novel aptamers utilizing a hybrid method of machine learning and genetic algorithm. Mol Divers 25, 1395–1407 (2021). https://doi.org/10.1007/s11030-021-10192-9


  • DOI: https://doi.org/10.1007/s11030-021-10192-9
