An Information Approach to Accuracy Comparison for Classification Schemes in an Ensemble of Data Sources
The accuracy of multiclass classification of collections of objects drawn from a given ensemble of data sources is investigated using the average mutual information between the datasets of the sources and the set of classes. We consider two fusion schemes: the WMV (Weighted Majority Vote) scheme, based on a composition of the decisions made on the objects of the individual sources, and the GDM (General Dissimilarity Measure) scheme, which uses a composition of the metrics in the datasets of the sources. For a given metric classification model, it is proved that the weighted mean of the average mutual information per source in the WMV scheme is smaller than the corresponding mean in the GDM scheme. Using a lower bound on the appropriate rate-distortion function, it is shown that the lower bound on the error probability in the WMV scheme exceeds the corresponding lower bound in the GDM scheme. This theoretical result is confirmed by a computational experiment on face recognition with HSI color images, which yield an ensemble of H, S, and I sources.
Keywords: Multiclass classification; Ensemble of sources; Fusion scheme; Composition of decisions; Composition of metrics; Average mutual information; Error probability
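The difference between the two fusion schemes can be illustrated with a minimal sketch. It assumes a nearest-template metric classifier with one scalar feature per source and an absolute-difference metric. These choices, along with the names `wmv_classify`, `gdm_classify`, `templates`, and `weights`, are hypothetical and are not taken from the paper, whose classification model is not specified in the abstract.

```python
from collections import defaultdict


def abs_diff(a, b):
    """Toy per-source metric (an assumption for illustration)."""
    return abs(a - b)


def nearest_class(x, source_templates, metric):
    """Return the label whose template is closest to x under the metric."""
    return min(source_templates, key=lambda label: metric(x, source_templates[label]))


def wmv_classify(obj, templates, weights, metric=abs_diff):
    """WMV-style fusion: each source decides separately, then the
    weighted decisions are combined by a vote."""
    votes = defaultdict(float)
    for m, w in enumerate(weights):
        source_templates = {label: t[m] for label, t in templates.items()}
        votes[nearest_class(obj[m], source_templates, metric)] += w
    return max(votes, key=votes.get)


def gdm_classify(obj, templates, weights, metric=abs_diff):
    """GDM-style fusion: the per-source metrics are combined into one
    weighted dissimilarity, then a single decision is made."""
    def combined(label):
        return sum(w * metric(obj[m], templates[label][m])
                   for m, w in enumerate(weights))
    return min(templates, key=combined)


# Hypothetical data: two classes, three sources (standing in for H, S, I).
templates = {"A": (0.0, 0.0, 0.0), "B": (1.0, 1.0, 1.0)}
weights = (1 / 3, 1 / 3, 1 / 3)
obj = (0.45, 0.45, 1.0)

print(wmv_classify(obj, templates, weights))  # A
print(gdm_classify(obj, templates, weights))  # B
```

In this toy case two sources narrowly favor class A, so the vote selects A, while the combined dissimilarity favors B because the third source matches B exactly. The example only shows that the two schemes can disagree on the same object; it does not reproduce the paper's information-theoretic comparison.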
This research was supported by the Russian Foundation for Basic Research, projects 18-07-01231 and 18-07-01385.