A Cascade Multiple Classifier System for Document Categorization

  • Jian-Wu Xu
  • Vartika Singh
  • Venu Govindaraju
  • Depankar Neogi
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5519)

Abstract

This paper presents a novel cascade multiple classifier system (MCS) for document image classification. The system consists of two classifiers that use different feature sets. The preceding classifier uses image features, learns the physical representation of the document, and outputs a set of candidate class labels for the second classifier. The succeeding classifier is a hierarchical classification model based on textual features. The candidate label set from the first classifier identifies the subtrees of the hierarchical tree that the second classifier searches to derive a final classification decision. The cascade therefore reduces the computational complexity of the second classifier and improves its classification accuracy. We test the proposed cascade MCS on a large-scale tax document classification task. The experimental results show improved classification performance over the individual classifiers.
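
The two-stage flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the label tree, the class names (A1, A2, B1, B2), the scores, the top-k candidate rule, and the greedy descent are all assumptions made for illustration. The first stage keeps the k labels with the highest image-based scores, and the second stage walks the hierarchy using text-based scores while skipping every subtree that contains no candidate.

```python
from typing import Dict, List, Set

Tree = Dict[str, List[str]]

# Hypothetical label hierarchy: internal node -> children. Leaves are classes.
TREE: Tree = {"root": ["A", "B"], "A": ["A1", "A2"], "B": ["B1", "B2"]}

# Hypothetical scores; a real system would obtain them from trained models.
IMAGE_SCORES = {"A1": 0.50, "B1": 0.30, "A2": 0.15, "B2": 0.05}
TEXT_SCORES = {"A": 0.4, "B": 0.6, "A1": 0.7, "A2": 0.3, "B1": 0.2, "B2": 0.8}


def leaves_under(node: str, tree: Tree) -> Set[str]:
    """Return all leaf labels in the subtree rooted at `node`."""
    children = tree.get(node, [])
    if not children:
        return {node}
    leaves: Set[str] = set()
    for child in children:
        leaves |= leaves_under(child, tree)
    return leaves


def first_stage(image_scores: Dict[str, float], k: int) -> List[str]:
    """Image-based stage: keep the k highest-scoring labels as candidates."""
    return sorted(image_scores, key=image_scores.get, reverse=True)[:k]


def second_stage(candidates: Set[str], text_scores: Dict[str, float],
                 tree: Tree, node: str = "root") -> str:
    """Text-based stage: greedy descent restricted to candidate subtrees."""
    children = tree.get(node, [])
    if not children:
        return node
    # Prune every child whose subtree holds no candidate label.
    viable = [c for c in children if leaves_under(c, tree) & candidates]
    best = max(viable, key=lambda c: text_scores[c])
    return second_stage(candidates, text_scores, tree, best)


def cascade_classify(image_scores: Dict[str, float],
                     text_scores: Dict[str, float],
                     tree: Tree, k: int = 2) -> str:
    """Run both stages: candidates from images, final decision from text."""
    candidates = set(first_stage(image_scores, k))
    return second_stage(candidates, text_scores, tree)


print(cascade_classify(IMAGE_SCORES, TEXT_SCORES, TREE))  # prints B1
```

In this toy example the text scores alone would favor B2, but the first stage has already excluded it, so the second stage only compares the remaining candidate under B. The same pruning is what limits the number of tree nodes the second classifier has to evaluate.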

Keywords

Document Classification, Multiple-classifiers, Classifier Combination

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Jian-Wu Xu (1)
  • Vartika Singh (2)
  • Venu Govindaraju (2)
  • Depankar Neogi (1)

  1. Copanion Inc., Andover, USA
  2. Center for Unified Biometrics and Sensors, University at Buffalo, USA
