Constructing ECOC based on confusion matrix for multiclass learning problems

Application of confusion-matrix-based error-correcting output codes to multiclass learning problems

Abstract

In pattern recognition, error-correcting output codes (ECOC) are a powerful tool for combining any number of binary classifiers to model multiclass problems, and data-driven encoding is attracting growing attention. In this paper, we propose a new encoding method for constructing subclass error-correcting output codes (Subclass ECOC), a framework first introduced by Escalera et al. We first obtain the correlation between each pair of classes with the help of the confusion matrix. Then, following Fisher’s principle, we select the most easily separated subclasses for classification. Finally, we obtain binary partitions based on these subclasses, which yields a new data-driven Subclass ECOC coding matrix. Experimental results on University of California, Irvine (UCI) data sets and three kinds of high-resolution range profile (HRRP) data sets, with a logistic linear classifier and a support vector machine as the binary classifiers, show that our approach provides better classification performance and robustness, at the cost of a slightly longer but acceptable code length.
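The first step, measuring how strongly each pair of classes is confused, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the abstract does not give the exact correlation measure, so the symmetric averaging in the hypothetical helper `pairwise_confusability` is an assumption, as are all function names.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """C[i][j] = number of samples of true class i that were predicted as class j."""
    C = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        C[t][p] += 1
    return C


def pairwise_confusability(C):
    """Symmetric score S[i][j]: mean of the i->j and j->i misclassification rates.

    Assumption: this averaging is only one plausible way to turn a confusion
    matrix into a pairwise class similarity; the paper may use another measure.
    """
    n = len(C)
    totals = [sum(row) for row in C]
    # Row-normalise counts into rates; classes with no samples get rate 0.
    rate = [[C[i][j] / totals[i] if totals[i] else 0.0 for j in range(n)]
            for i in range(n)]
    return [[0.0 if i == j else 0.5 * (rate[i][j] + rate[j][i])
             for j in range(n)]
            for i in range(n)]


# Toy predictions from some base classifier: classes 0 and 1 are confused
# with each other, class 2 is always recognised correctly.
y_true = [0, 0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [0, 1, 0, 1, 0, 1, 2, 2, 2]
C = confusion_matrix(y_true, y_pred, 3)
S = pairwise_confusability(C)
```

Under this score, a high value for a pair of classes suggests that they are hard to separate, while a value near zero suggests that a binary partition separating them should be easy to learn.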

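Innovations

We propose a new data-driven strategy for constructing error-correcting output codes to solve multiclass classification problems. The method first uses the confusion matrix to measure the similarity between different classes in the multiclass sample space, from which the between-class scatter is obtained. Next, based on the between-class scatter, it searches for the optimal subspace partition and obtains a set of binary subspace partitions. Finally, the resulting binary subspaces are optimized (by merging and splitting) to form the error-correcting output code. Partitioning the sample set according to this coding matrix and training on each partition yields the optimal binary classifiers, which effectively improves the accuracy and generalization ability of multiclass classification.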
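How a coding matrix partitions the sample set and how the trained binary classifiers are combined can be sketched as follows. This is a generic ternary ECOC illustration under stated assumptions, not the coding matrix or decoder of the paper: the matrix `M` is a plain one-versus-one code chosen for the example, the names `binary_labels` and `decode` are hypothetical, and the decoder is a simple Hamming distance that ignores zero entries.

```python
def binary_labels(y, column):
    """Relabel multiclass samples for the binary problem defined by one column.

    column[c] is +1 or -1 for classes in the two groups and 0 for classes
    that this binary classifier ignores. Returns the indices of the samples
    that are kept and their +1/-1 labels.
    """
    kept = [i for i, c in enumerate(y) if column[c] != 0]
    return kept, [column[y[i]] for i in kept]


def decode(outputs, coding_matrix):
    """Pick the class whose codeword disagrees least with the binary outputs.

    Assumption: Hamming distance counted over nonzero code positions only.
    """
    def distance(codeword):
        return sum(1 for b, o in zip(codeword, outputs) if b != 0 and b != o)
    return min(range(len(coding_matrix)),
               key=lambda c: distance(coding_matrix[c]))


# Illustrative one-versus-one code for 3 classes: rows are classes,
# columns are the binary problems (0 vs 1), (0 vs 2), (1 vs 2).
M = [[+1, +1, 0],
     [-1, 0, +1],
     [0, -1, -1]]

y = [0, 1, 2, 1]
first_column = [row[0] for row in M]
kept, labels = binary_labels(y, first_column)

predicted_a = decode([+1, +1, -1], M)
predicted_b = decode([-1, -1, -1], M)
```

Ignoring zero positions keeps the sketch short, but it can favour codewords with many zeros; weighted decoding schemes for ternary codes exist to reduce that bias.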

References

  1. Dietterich T G, Bakiri G. Solving multiclass learning problems via error-correcting output codes. J Artif Intell Res, 1995, 2: 263–286

  2. Windeatt T, Ardeshir G. Boosted ECOC ensembles for face recognition. In: Proceedings of International Conference on Visual Information Engineering. California: AAAI Press, 2003. 165–168

  3. Windeatt T, Smith R S. Weighted decoding ECOC for facial action unit classification. In: Proceedings of International Conference on Artificial Intelligence. California: AAAI Press, 2008. 26–30

  4. Ghani R. Combining labeled and unlabeled data for text classification with a large number of categories. In: Proceedings of International Conference on Data Mining. California: AAAI Press, 2001. 597–598

  5. Escalera S, Pujol O, Radeva P. On the decoding process in ternary error-correcting output codes. IEEE Trans Patt Anal Mach Intell, 2010, 32: 120–134

  6. Pujol O, Radeva P, Vitria J. Discriminant ECOC: a heuristic method for application dependent design of error correcting output codes. IEEE Trans Patt Anal Mach Intell, 2006, 28: 1001–1007

  7. Alpaydin E, Mayoraz E. Learning error-correcting output codes from data. In: Proceedings of International Conference on Artificial Neural Networks. California: AAAI Press, 1999. 743–748

  8. Utschick W, Weichselberger W. Stochastic organization of output codes in multiclass learning problems. Neural Comput, 2001, 13: 1065–1102

  9. Escalera S, Tax D M J, Pujol O, et al. Subclass problem-dependent design of error-correcting output codes. IEEE Trans Patt Anal Mach Intell, 2008, 30: 1–14

  10. Escalera S, Masip D, Puertas E, et al. Online error correcting output codes. Patt Recognit Lett, 2011, 32: 458–467

  11. Simeone P, Marrocco C, Tortorella F. Design of reject rules for ECOC classification systems. Patt Recognit, 2012, 2: 863–875

  12. García-Pedrajas N, Ortiz-Boyer D. An empirical study of binary classifier fusion methods for multiclass classification. Inf Fusion, 2011, 2: 111–130

  13. Zhou J, Suen C. Unconstrained numeral pair recognition using enhanced error correcting output coding: a holistic approach. In: Proceedings of International Conference on Document Analysis and Recognition. California: AAAI Press, 2005. 484–488

  14. Allwein E, Schapire R, Singer Y. Reducing multiclass to binary: a unifying approach for margin classifiers. Mach Learn Res, 2002, 1: 113–141

  15. Ruda H, Snorrason M, Shue D. Framework for automatic target recognition optimization. Cambridge Charles River Analytics Technical Report No. R96451. 1997

  16. Jain A K, Duin R P W, Mao J C. Statistical pattern recognition: a review. IEEE Trans Patt Anal Mach Intell, 2000, 22: 4–37

  17. Sun J X. Modern Pattern Recognition (in Chinese). Changsha: National Defense Industry Press, 2002. 112–118

  18. Song R, Zhang J, Xia S P, et al. An adaptive classification method of BP-NN group based classification system and its application (in Chinese). Acta Electron Sin, 2001, 29: 1950–1953

  19. Logan J D. Applied Mathematics. 2nd ed. Hoboken: Wiley-Interscience Ltd., 1996. 54–60

  20. UCI Machine Learning Repository. School of Information and Computer Sciences, University of California, Irvine, 2007

  21. Zhang Y X, Wang X D, Yao X, et al. HRRP recognition for polarization radar based on Bagging-SVM dynamic ensemble (in Chinese). Syst Eng Electron, 2012, 34: 1366–1372

  22. Hastie T, Tibshirani R. Classification by pairwise grouping. In: Jordan M I, Kearns M J, Solla S A, eds. Advances in Neural Information Processing Systems 10. Cambridge: MIT Press, 1998. 451–471

  23. Chapelle O, Vapnik V, Bousquet O, et al. Choosing multiple parameters for support vector machines. Mach Learn, 2002, 46: 131–159

  24. Demsar J. Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res, 2006, 2: 1–30

Author information

Corresponding author

Correspondence to Jindeng Zhou.

About this article

Cite this article

Zhou, J., Yang, Y., Zhang, M. et al. Constructing ECOC based on confusion matrix for multiclass learning problems. Sci. China Inf. Sci. 59, 1–14 (2016). https://doi.org/10.1007/s11432-015-5321-y

Keywords

  • machine learning
  • multiclass classification
  • error correcting output codes
  • subclass partition
  • confusion matrix
  • 012107
