Diversity and Locality in Multi-Component, Multi-Layer Predictive Systems: A Mutual Information Based Approach

  • Bassma Al-Jubouri
  • Bogdan Gabrys
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10604)


This paper discusses the effect of locality and diversity among the base models of a Multi-Component Multi-Layer Predictive System (MCMLPS). A new ensemble method is introduced in which data instances are assigned to local regions using a conditional mutual information metric based on the similarity of their features, and the outputs of the base models are weighted by the same similarity metric. The proposed architecture has been tested on a number of data sets and its performance compared to four benchmark algorithms. Moreover, the effect of varying three parameters of the proposed architecture has been tested and compared.
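The combination scheme the abstract describes — base models tied to local regions, with their outputs weighted by each instance's similarity to those regions — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names are hypothetical, and a simple Euclidean-distance similarity stands in for the conditional-mutual-information-based metric used in the actual architecture.

```python
import numpy as np

def region_similarity(X, centroids):
    """Similarity of each instance to each local region's centroid.
    (Euclidean distance here is a placeholder for the paper's
    conditional-mutual-information-based feature similarity.)"""
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return 1.0 / (1.0 + d)  # larger similarity for closer regions

def weighted_ensemble_predict(X, centroids, models):
    """Combine base-model outputs, weighting each model by the
    instance's similarity to that model's local region."""
    sim = region_similarity(X, centroids)
    w = sim / sim.sum(axis=1, keepdims=True)      # normalise per instance
    preds = np.stack([m(X) for m in models], axis=1)  # shape (n, n_models)
    return (w * preds).sum(axis=1)

# Toy usage: two local regions, each with a constant base predictor.
centroids = np.array([[0.0], [10.0]])
models = [lambda X: np.zeros(len(X)),   # expert for the region near 0
          lambda X: np.ones(len(X))]    # expert for the region near 10
X = np.array([[0.0], [10.0]])
y = weighted_ensemble_predict(X, centroids, models)
```

An instance near a region's centroid is dominated by that region's base model, so `y[0]` is close to 0 and `y[1]` close to 1; the soft weighting (rather than a hard region assignment) lets neighbouring experts contribute near region boundaries.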


Keywords: Ensemble diversity · Ensemble methods · Local learning · Conditional mutual information · Feature selection



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Data Science Institute, Bournemouth University, Dorset, UK
