Journal of Grid Computing, Volume 13, Issue 3, pp 409–423

Enhancing Regression Models for Complex Systems Using Evolutionary Techniques for Feature Engineering

  • Patricia Arroba
  • José L. Risco-Martín
  • Marina Zapater
  • José M. Moya
  • José L. Ayala


This work proposes an automatic methodology for modeling complex systems. Our methodology combines Grammatical Evolution with classical regression to obtain an optimal set of features that form part of a linear, convex model. This technique provides both Feature Engineering and Symbolic Regression in order to infer accurate models without requiring manual effort or designer expertise. As advanced Cloud services become mainstream, the contribution of data centers to the overall power consumption of modern cities is growing dramatically. These facilities consume from 10 to 100 times more power per square foot than typical office buildings. Modeling the power consumption of these infrastructures is crucial to anticipate the effects of aggressive optimization policies, but accurate and fast power modeling is a complex challenge for high-end servers that analytical approaches have not yet met. For this case study, our methodology minimizes the error in power prediction. This work has been tested using real Cloud applications, resulting in an average error in power estimation of 3.98%. Our work improves the possibilities of deriving energy-efficient policies for Cloud data centers and is applicable to other computing environments with similar characteristics.
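The combination described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the toy grammar, the function names (`decode`, `evaluate`, `fit_linear`, `random_search`) and the synthetic data are all assumptions made for illustration, and a plain random search stands in for the evolutionary loop. It shows only the two ingredients the abstract names: the usual Grammatical Evolution mapping from an integer genotype to a feature expression, and a classical least-squares fit of a model that is linear in the generated features.

```python
import numpy as np

# Hypothetical toy grammar (an assumption, not taken from the paper).
# Every nonterminal has several productions, so each expansion uses one codon.
GRAMMAR = {
    "<expr>": [["<var>"],
               ["(", "<expr>", "<op>", "<expr>", ")"],
               ["np.exp(", "<var>", ")"]],
    "<op>": [["+"], ["*"], ["-"]],
    "<var>": [["x[:, 0]"], ["x[:, 1]"]],
}


def decode(codons, start="<expr>", max_wraps=2):
    """Map an integer genotype to an expression string.

    The leftmost nonterminal is expanded with production
    `codon % number_of_productions`. Codons are reused (wrapped) at most
    `max_wraps` times; if expansion is still unfinished, None is returned.
    """
    symbols, out, used = [start], [], 0
    limit = len(codons) * (max_wraps + 1)
    while symbols:
        sym = symbols.pop(0)
        if sym not in GRAMMAR:        # terminal: emit it
            out.append(sym)
            continue
        if used >= limit:             # ran out of codons: invalid individual
            return None
        choices = GRAMMAR[sym]
        choice = choices[codons[used % len(codons)] % len(choices)]
        used += 1
        symbols = list(choice) + symbols
    return "".join(out)


def evaluate(expr, x):
    """Evaluate a decoded feature expression on the data matrix x."""
    return eval(expr, {"np": np, "__builtins__": {}}, {"x": x})


def fit_linear(features, y):
    """Least-squares fit of y = c0 + sum_i c_i * feature_i; returns (coef, rmse)."""
    design = np.column_stack([np.ones(len(y))] + list(features))
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    rmse = float(np.sqrt(np.mean((design @ coef - y) ** 2)))
    return coef, rmse


def random_search(x, y, n_trials=200, n_codons=12, seed=0):
    """Stand-in for the evolutionary loop: keep the best random genotype."""
    rng = np.random.default_rng(seed)
    best = (np.inf, None, None)
    for _ in range(n_trials):
        expr = decode(rng.integers(0, 256, size=n_codons))
        if expr is None:
            continue
        coef, rmse = fit_linear([evaluate(expr, x)], y)
        if rmse < best[0]:
            best = (rmse, expr, coef)
    return best


if __name__ == "__main__":
    data_rng = np.random.default_rng(1)
    x = data_rng.uniform(0.0, 1.0, size=(50, 2))
    y = 2.0 + 3.0 * x[:, 0] * x[:, 1]   # synthetic target, for illustration only
    print(random_search(x, y))
```

In this sketch the grammar proposes nonlinear features while the coefficients are always obtained by ordinary least squares, so the fitted model stays linear in its parameters. A real Grammatical Evolution run would replace `random_search` with selection, crossover and mutation over the codon vectors.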


Keywords: Automatic modeling, Complex systems, Grammatical evolution, Classical regression, Green data centers, Sustainable cloud computing, Power modeling




Copyright information

© Springer Science+Business Media Dordrecht 2014

Authors and Affiliations

  • Patricia Arroba (1), email author
  • José L. Risco-Martín (2)
  • Marina Zapater (3)
  • José M. Moya (1)
  • José L. Ayala (2)

  1. Electronic Engineering Department, Technical University of Madrid, Madrid, Spain
  2. DACYA, Complutense University of Madrid, Madrid, Spain
  3. CEI Campus Moncloa UCM-UPM, Madrid, Spain
