An Information Approach to Accuracy Comparison for Classification Schemes in an Ensemble of Data Sources

  • Mikhail Lange
  • Sergey Ganebnykh
  • Andrey Lange
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 794)


The accuracy of multiclass classification of object collections drawn from a given ensemble of data sources is investigated via the average mutual information between the source datasets and the set of classes. We consider two fusion schemes: the WMV (Weighted Majority Vote) scheme, based on a composition of decisions on the objects of the individual sources, and the GDM (General Dissimilarity Measure) scheme, which uses a composition of metrics on the source datasets. For a given metric classification model, it is proved that the weighted mean of the average mutual information per source in the WMV scheme is smaller than the corresponding mean in the GDM scheme. Using a lower bound on the appropriate rate distortion function, it is shown that the lower bound on the error probability in the WMV scheme exceeds the corresponding error probability in the GDM scheme. This theoretical result is confirmed by a computational experiment on face recognition with HSI color images, which yield an ensemble of H, S, and I sources.
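The two fusion schemes contrasted in the abstract can be illustrated with a minimal sketch. The scheme names (WMV, GDM) are from the paper, but the implementation details below are generic assumptions, not the authors' exact formulation: WMV is sketched as a weighted vote over per-source class decisions, and GDM as a weighted combination of per-source dissimilarities to each class followed by a nearest-class decision.

```python
import numpy as np

def wmv_fuse(decisions, weights):
    """Weighted Majority Vote (WMV) sketch: each source casts a class
    decision; decisions are combined as weighted votes, and the class
    with the largest total weight wins."""
    n_classes = max(decisions) + 1
    votes = np.zeros(n_classes)
    for cls, w in zip(decisions, weights):
        votes[cls] += w
    return int(np.argmax(votes))

def gdm_fuse(dissimilarities, weights):
    """General Dissimilarity Measure (GDM) sketch: per-source
    dissimilarities to each class (rows = sources, cols = classes)
    are combined into one weighted measure; the class with the
    smallest combined dissimilarity wins."""
    d = np.asarray(dissimilarities, dtype=float)
    combined = np.average(d, axis=0, weights=weights)
    return int(np.argmin(combined))

# Three sources, two classes (e.g. the H, S, I channels of an image).
print(wmv_fuse([0, 1, 1], [0.2, 0.4, 0.4]))                        # class 1
print(gdm_fuse([[1.0, 2.0], [3.0, 0.5], [2.0, 1.0]], [1, 1, 1]))   # class 1
```

The key difference the paper analyzes: WMV quantizes each source down to a single decision before fusion, while GDM fuses the richer metric information first, which is why the per-source average mutual information retained by GDM is larger.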


Multiclass classification · Ensemble of sources · Fusion scheme · Composition of decisions · Composition of metrics · Average mutual information · Error probability



The research was supported by the Russian Foundation for Basic Research, projects 18-07-01231 and 18-07-01385.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Federal Research Center “Computer Science and Control” of the Russian Academy of Sciences, Moscow, Russia
