Reverse Engineering Gene Regulatory Networks Using Sampling and Boosting Techniques

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10358)

Abstract

Reverse engineering gene regulatory networks (GRNs), also known as network inference, refers to the process of reconstructing GRNs from gene expression data. Biologists model a GRN as a directed graph in which nodes represent genes and links show regulatory relationships between the genes. By predicting the links to infer a GRN, biologists can gain a better understanding of regulatory circuits and functional elements in cells. Existing supervised GRN inference methods work by building a feature-based classifier from gene expression data and using the classifier to predict the links in GRNs. Observing that GRNs are sparse graphs with few links between nodes, we propose here to use under-sampling, over-sampling and boosting techniques to enhance the prediction performance. Experimental results on different datasets demonstrate the good performance of the proposed approach and its superiority over the existing methods.
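
The class imbalance described above (few links, many non-links) can be illustrated with a minimal sketch of random under-sampling and random over-sampling. This is not the authors' implementation: the toy gene names, the `known_links` set, and the helper functions `split_by_label`, `under_sample`, and `over_sample` are hypothetical, and purely random sampling is assumed here as the simplest variant of each technique. The boosting step is not shown.

```python
import random

def split_by_label(examples):
    """Separate (pair, label) examples into links (1) and non-links (0)."""
    links = [e for e in examples if e[1] == 1]
    non_links = [e for e in examples if e[1] == 0]
    return links, non_links

def under_sample(examples, seed=0):
    """Keep all links and a random subset of non-links of the same size."""
    rng = random.Random(seed)
    links, non_links = split_by_label(examples)
    kept = rng.sample(non_links, min(len(links), len(non_links)))
    return links + kept

def over_sample(examples, seed=0):
    """Duplicate randomly chosen links until both classes are the same size."""
    rng = random.Random(seed)
    links, non_links = split_by_label(examples)
    if not links:
        return list(examples)
    missing = max(0, len(non_links) - len(links))
    extra = [rng.choice(links) for _ in range(missing)]
    return links + extra + non_links

# Toy network (assumed data): 5 genes, 20 ordered pairs, only 2 known links.
genes = ["g1", "g2", "g3", "g4", "g5"]
known_links = {("g1", "g2"), ("g3", "g4")}
examples = [((a, b), 1 if (a, b) in known_links else 0)
            for a in genes for b in genes if a != b]

balanced_down = under_sample(examples)  # 2 links + 2 non-links
balanced_up = over_sample(examples)     # 18 links (with repeats) + 18 non-links
```

Either balanced set could then be passed to a classifier in place of the original, heavily skewed training set. Under-sampling discards information about non-links, while over-sampling repeats the few known links, so which one helps more depends on the dataset.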

Keywords

Graph mining; Sampling methods; Boosting techniques; Supervised learning; Applications in biology and medicine

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Department of Computer Science, King Abdulaziz University, Jeddah, Saudi Arabia
  2. Bioinformatics Program and Department of Computer Science, New Jersey Institute of Technology, University Heights, Newark, USA
