LDAEXC: LncRNA–Disease Associations Prediction with Deep Autoencoder and XGBoost Classifier

  • Original research article
  • Interdisciplinary Sciences: Computational Life Sciences

Abstract

Extensive scientific evidence has revealed that long non-coding RNAs (lncRNAs) are involved in the progression of complex human diseases and in biological life activities. Identifying novel disease-related lncRNAs is therefore helpful for the diagnosis, prognosis and therapy of many complex human diseases. Because traditional laboratory experiments are costly and time-consuming, a large number of computational algorithms have been proposed for predicting the relationships between lncRNAs and diseases. However, there is still much room for improvement. In this paper, we introduce an accurate framework named LDAEXC to infer LncRNA–Disease Associations with a deep autoencoder and an XGBoost Classifier. LDAEXC uses different similarity views of lncRNAs and human diseases to construct features for each data source. Reduced features are then obtained by feeding the constructed feature vectors into a deep autoencoder, and finally an XGBoost classifier is used to calculate latent lncRNA–disease association scores from the reduced features. Fivefold cross-validation experiments on four datasets showed that LDAEXC reached AUC scores of 0.9676 ± 0.0043, 0.9449 ± 0.022, 0.9375 ± 0.0331 and 0.9556 ± 0.0134, respectively, significantly higher than those of other advanced computational methods. Extensive experimental results and case studies of two complex diseases (colon and breast cancers) further indicated the practicability and excellent prediction performance of LDAEXC in inferring unknown lncRNA–disease associations.
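As a rough illustration of how a "mean ± standard deviation" AUC figure of this kind can be obtained from k-fold cross-validation, the following sketch uses only the Python standard library. It is not the authors' evaluation code: the function names (`auc_score`, `kfold_indices`, `cross_validated_auc`), the `score_fn` callback, the random fold assignment and the use of the sample standard deviation are all assumptions made for illustration.

```python
import random
import statistics


def auc_score(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def kfold_indices(n_samples, k=5, seed=0):
    """Shuffle the sample indices and deal them into k nearly equal test folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validated_auc(labels, score_fn, k=5, seed=0):
    """Return (mean, sample std) of the per-fold AUC values.

    score_fn(train_idx, test_idx) is a hypothetical callback that trains on
    train_idx and returns one predicted score per entry of test_idx.
    """
    aucs = []
    for test_idx in kfold_indices(len(labels), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(labels)) if i not in held_out]
        scores = score_fn(train_idx, test_idx)
        aucs.append(auc_score([labels[i] for i in test_idx], scores))
    return statistics.mean(aucs), statistics.stdev(aucs)
```

In practice `score_fn` would wrap the trained predictor (here, presumably the autoencoder plus classifier), and a stratified split may be preferable so that every fold contains both known and unknown associations.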

Graphical Abstract

LDAEXC utilizes disease semantic similarity, lncRNA expression similarity, and Gaussian interaction profile kernel similarity of lncRNAs and diseases for feature construction. The constructed features are fed to a deep autoencoder to extract reduced features, and an XGBoost classifier is used to predict the lncRNA–disease associations based on the reduced features. The fivefold and tenfold cross-validation experiments on a benchmark dataset showed that LDAEXC could achieve AUC scores of 0.9676 and 0.9682, respectively, significantly higher than those of other state-of-the-art methods.
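The Gaussian interaction profile kernel similarity mentioned above can be sketched as follows. This is a minimal standard-library example of the commonly used formulation, K(i, j) = exp(-γ ||IP(i) - IP(j)||²) with γ = γ' divided by the mean squared norm of the profiles; it is not taken from the LDAEXC implementation. The function name `gip_kernel`, the default γ' = 1 and the toy association matrix are assumptions for illustration.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of `profiles`.

    Each row is the binary interaction profile of one entity, for example the
    known disease associations of one lncRNA. gamma_prime = 1.0 is an assumed
    default bandwidth parameter.
    """
    n = len(profiles)
    mean_sq_norm = sum(sum(v * v for v in row) for row in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = gamma_prime / mean_sq_norm  # bandwidth normalised by mean profile norm
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = kernel[j][i] = math.exp(-gamma * sq_dist)
    return kernel


# Toy association matrix (hypothetical): rows are lncRNAs, columns are diseases.
associations = [
    [1, 0, 1],
    [1, 0, 1],
    [0, 1, 0],
]
lncrna_similarity = gip_kernel(associations)
# Disease similarity uses the columns, i.e. the transposed matrix.
disease_similarity = gip_kernel([list(col) for col in zip(*associations)])
```

Two lncRNAs with identical association profiles get similarity 1, and the similarity decays as their profiles diverge.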

Data availability

All datasets used in our experiments were downloaded from publicly available sources, which are described in Sect. 2.1.

Funding

This work was supported by the National Natural Science Foundation of China under Grant 62172028 and Grant 61772197.

Author information

Corresponding author

Correspondence to Minzhu Xie.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Lu, C., Xie, M. LDAEXC: LncRNA–Disease Associations Prediction with Deep Autoencoder and XGBoost Classifier. Interdiscip Sci Comput Life Sci 15, 439–451 (2023). https://doi.org/10.1007/s12539-023-00573-z

  • DOI: https://doi.org/10.1007/s12539-023-00573-z
