Combining analytic kernel models for energy-efficient data modeling and classification

Abstract

Energy-efficient computing has become a key challenge not only for data-center operations but also for many other energy-driven systems, with a focus on reducing all energy-related costs and operational expenses, as well as the corresponding environmental impacts. However, current intelligent data models are typically performance-driven. For instance, most data-driven machine-learning approaches are known to incur a high computational cost in order to find the global optimum. Designing more accurate intelligent data models to satisfy market needs will hence lead to a higher likelihood of energy waste due to the increased computational cost. This paper thus introduces an energy-efficient framework for large-scale data modeling and classification/prediction. It can achieve a predictive accuracy comparable to or better than that of state-of-the-art machine-learning models while maintaining a low computational cost when dealing with large-scale data. The effectiveness of the proposed approaches has been demonstrated by our experiments with two large-scale KDD data sets: Mtv-1 and Mtv-2.

Acknowledgements

We are grateful to the Lincoln Laboratory at the Massachusetts Institute of Technology (MIT) in the U.S. for providing us with the Mtv-2 data set, as well as for their invaluable discussions. Special thanks go to British Telecom (BT) and the Etisalat BT Innovation Center (EBTIC) for their constructive criticism of this work.

Author information

Correspondence to Paul D. Yoo.

About this article

Cite this article

Yoo, P.D., Zomaya, A.Y. Combining analytic kernel models for energy-efficient data modeling and classification. J Supercomput 63, 790–799 (2013). https://doi.org/10.1007/s11227-012-0776-8

Keywords

  • Energy-efficient computing
  • Machine learning
  • A large volume of data