Scalable Inference of Gene Regulatory Networks with the Spark Distributed Computing Platform

  • Cristóbal Barba-González
  • José García-Nieto
  • Antonio Benítez-Hidalgo
  • Antonio J. Nebro
  • José F. Aldana-Montes
Conference paper
Part of the Studies in Computational Intelligence book series (SCI, volume 798)


Inference of Gene Regulatory Networks (GRNs) remains an important open challenge in computational biology. The goal of bio-model inference is to obtain, from time series of gene expression data, the sparse topological structure and the parameters that quantitatively explain and reproduce the dynamics of the biological system. However, the inference of a GRN is a complex optimization problem that involves processing S-System models, which include large amounts of gene expression data from hundreds (even thousands) of genes in multiple time series (assays). This complexity, together with the amount of data managed, makes the inference of GRNs a computationally expensive task. Parallel algorithms that operate efficiently on distributed processing platforms are therefore essential for the current reconstruction of GRNs. In this paper, a parallel multi-objective approach is proposed for the optimal inference of GRNs, which simultaneously minimizes the Mean Squared Error under the S-System model and the Topology Regularization value. A flexible and robust multi-objective cellular evolutionary algorithm is adapted to deploy parallel tasks in the form of Spark jobs. The proposed approach has been developed using the jMetal framework; to perform parallel computation, we use Spark on a cluster of distributed nodes to evaluate candidate solutions that model the interactions of genes in biological networks.
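To make the two objectives concrete, the following minimal Python sketch evaluates candidate S-System parameter sets against a time series. It is an illustration under stated assumptions, not the implementation described in the paper: the S-System form dX_i/dt = α_i ∏_j X_j^{g_ij} − β_i ∏_j X_j^{h_ij} is the standard one, but the forward-Euler integrator, the definition of Topology Regularization as the sum of absolute kinetic orders, all function names, and the toy two-gene data are hypothetical. A local thread pool stands in for the Spark cluster, since each candidate can be evaluated independently.

```python
from concurrent.futures import ThreadPoolExecutor
from math import prod


def s_system_derivatives(x, alpha, beta, g, h):
    """Standard S-System: dX_i/dt = alpha_i*prod_j X_j^g_ij - beta_i*prod_j X_j^h_ij."""
    n = len(x)
    return [
        alpha[i] * prod(x[j] ** g[i][j] for j in range(n))
        - beta[i] * prod(x[j] ** h[i][j] for j in range(n))
        for i in range(n)
    ]


def simulate(x0, params, steps, dt):
    """Forward-Euler integration (an assumption chosen for brevity)."""
    alpha, beta, g, h = params
    trajectory = [list(x0)]
    for _ in range(steps):
        x = trajectory[-1]
        dx = s_system_derivatives(x, alpha, beta, g, h)
        # Clamp to a small positive value so negative kinetic orders stay defined.
        trajectory.append([max(xi + dt * dxi, 1e-9) for xi, dxi in zip(x, dx)])
    return trajectory


def mean_squared_error(observed, simulated):
    """Average squared difference over all genes and time points."""
    total = sum(
        (o - s) ** 2
        for row_o, row_s in zip(observed, simulated)
        for o, s in zip(row_o, row_s)
    )
    return total / sum(len(row) for row in observed)


def topology_regularization(params):
    """Assumed sparsity measure: sum of absolute kinetic orders in g and h."""
    _, _, g, h = params
    return sum(abs(v) for matrix in (g, h) for row in matrix for v in row)


def evaluate(candidate, observed, dt):
    """Return the two objective values (MSE, TR); both are to be minimized."""
    simulated = simulate(observed[0], candidate, len(observed) - 1, dt)
    return (mean_squared_error(observed, simulated), topology_regularization(candidate))


def evaluate_population(population, observed, dt, workers=4):
    """Evaluate candidates independently; a thread pool stands in for Spark here."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda c: evaluate(c, observed, dt), population))


# Toy two-gene example: parameters are (alpha, beta, g, h), all values hypothetical.
true_params = ([1.0, 1.0], [1.0, 1.0],
               [[0.0, -0.5], [0.5, 0.0]], [[1.0, 0.0], [0.0, 1.0]])
dense_params = ([1.0, 1.0], [1.0, 1.0],
                [[0.3, -0.5], [0.5, 0.3]], [[1.0, 0.2], [0.2, 1.0]])

observed = simulate([0.5, 1.5], true_params, steps=20, dt=0.05)
objectives = evaluate_population([true_params, dense_params], observed, dt=0.05)
for mse, tr in objectives:
    print(f"MSE={mse:.6f}  TR={tr:.2f}")
```

In this toy setting the generating parameters reproduce the data exactly and have fewer non-zero kinetic orders, so they dominate the denser candidate on both objectives. On a cluster, the per-candidate `evaluate` call is the unit of work that would be distributed.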


Gene Regulatory Networks, Multi-objective Metaheuristics, Distributed Computing, jMetal, Spark



This work was partially funded by Grants TIN2017-86049-R, TIN2014-58304 (Spanish Ministry of Education and Science) and P12-TIC-1519 (Plan Andaluz de Investigación, Desarrollo e Innovación). C. Barba-González was supported by Grant BES-2015-072209 (Spanish Ministry of Economy and Competitiveness). J. García-Nieto is the recipient of a Post-Doctoral fellowship of “Captación de Talento para la Investigación” at Universidad de Málaga.



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Cristóbal Barba-González (1)
  • José García-Nieto (1, email author)
  • Antonio Benítez-Hidalgo (1)
  • Antonio J. Nebro (1)
  • José F. Aldana-Montes (1)
  1. Dept. de Lenguajes y Ciencias de la Computación, Instituto de Investigación Biomédica de Málaga (IBIMA), University of Málaga, Málaga, Spain
