Soft Computing, Volume 20, Issue 5, pp 2047–2065

SPMoE: a novel subspace-projected mixture of experts model for multi-target regression problems

  • Esmaeil Hadavandi
  • Jamal Shahrabi
  • Yoichi Hayashi
Methodologies and Application

Abstract

In this paper, we focus on modeling multi-target regression problems with high-dimensional feature spaces and small numbers of instances, a setting common in many real-life predictive modeling problems. With the aim of designing an accurate prediction tool, we present a novel mixture of experts (MoE) model called the subspace-projected MoE (SPMoE). The experts of the SPMoE are trained in a boosting-like manner that combines ideas from subspace projection and the negative correlation learning (NCL) algorithm. Instead of training the experts on the whole original input space, we develop a new cluster-based subspace projection method that, at each step of the boosting procedure, produces projected subspaces focused on the difficult instances, yielding diverse experts. The experts of the SPMoE are trained on these subspaces using a new NCL algorithm called sequential NCL. The SPMoE is compared with other ensemble models on three real cases of high-dimensional multi-target regression: electrical discharge machining, energy efficiency, and an important problem in the field of operations strategy called the practice–performance problem. The experimental results show that the prediction accuracy of the SPMoE is significantly better than that of the other ensemble and single models; the SPMoE can therefore be considered a promising alternative for modeling high-dimensional multi-target regression problems.
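To make the NCL component of the training procedure concrete, the sketch below trains an ensemble of linear experts with the standard NCL penalty of Liu and Yao (1999). It is a minimal illustration only: the paper's cluster-based subspace projection, boosting-style focus on difficult instances, and sequential NCL variant are not reproduced here, and the linear expert form, penalty weight `lam`, learning rate, and epoch count are illustrative assumptions rather than the authors' settings.

```python
# Minimal sketch of negative correlation learning (NCL) for an ensemble of
# linear experts, using the standard penalty of Liu and Yao (1999):
#   e_i = 1/2 * (f_i - y)^2 + lam * (f_i - f_bar) * sum_{j != i} (f_j - f_bar)
# NOTE: the SPMoE's subspace projection and sequential NCL are NOT shown;
# expert form, lam, lr, and epochs are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def train_ncl_ensemble(X, y, n_experts=5, lam=0.5, lr=0.05, epochs=300):
    """Gradient-descent NCL training of linear experts f_i(x) = x @ w_i."""
    n, d = X.shape
    W = rng.normal(scale=0.1, size=(n_experts, d))  # one weight vector per expert
    for _ in range(epochs):
        F = X @ W.T                                  # (n, n_experts) expert outputs
        f_bar = F.mean(axis=1, keepdims=True)        # simple-average ensemble output
        # Since sum_j (f_j - f_bar) = 0, the penalty for expert i reduces to
        # -(f_i - f_bar)^2; treating f_bar as constant (as in Liu and Yao 1999),
        # the per-instance gradient w.r.t. f_i is (f_i - y) - lam * (f_i - f_bar).
        grad_f = (F - y[:, None]) - lam * (F - f_bar)
        W -= lr * (grad_f.T @ X) / n                 # chain rule through f_i = x @ w_i
    return W

# Toy usage on a single synthetic target; a multi-target problem would train
# one output per target (or multi-output experts) in the same way.
X = rng.normal(size=(200, 10))
y = X @ rng.normal(size=10) + 0.1 * rng.normal(size=200)
W = train_ncl_ensemble(X, y)
pred = (X @ W.T).mean(axis=1)                        # ensemble prediction
print("train RMSE:", np.sqrt(np.mean((pred - y) ** 2)))
```

The negative penalty term pushes each expert's output away from the ensemble mean, which is the source of the diversity the abstract attributes to NCL; the SPMoE additionally decorrelates its experts by training each one on a different projected subspace.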

Keywords

Mixture of experts · Boosting · Subspace projection · Negative correlation learning · Multi-target regression

Acknowledgments

The authors express their deep gratitude to Prof. Moattar Husseini and Dr. Hajirezaei for their many constructive suggestions and support. The authors also wish to thank the two anonymous referees for their helpful comments, which greatly helped to improve this paper.

Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  • Esmaeil Hadavandi (1)
  • Jamal Shahrabi (1)
  • Yoichi Hayashi (2)

  1. Department of Industrial Engineering, Amirkabir University of Technology, Tehran, Iran
  2. Department of Computer Science, Meiji University, Tama-ku, Kawasaki, Japan
