Classifier Adaptation with Non-representative Training Data

  • Sriharsha Veeramachaneni
  • George Nagy
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2423)

Abstract

We propose an adaptive methodology for tuning the decision boundaries of a classifier trained on non-representative data to the statistics of the test data in order to improve accuracy. Specifically, for machine-printed and hand-printed digit recognition we demonstrate that adapting the class means alone can yield considerable gains in recognition. On machine-printed digits we adapt to the typeface; on hand-printed digits, to the writer. We recognize the digits with a Gaussian quadratic classifier both when the style of the test set is represented by a subset of the training set and when it is not represented in the training set. We compare unsupervised adaptation and style-constrained classification on isogenous test sets of five machine-printed and two hand-printed NIST data sets. Both estimating the means and imposing style constraints reduce the error rate in almost every case, and neither ever results in significant loss. They are comparable under the first scenario (specialization), but adaptation is better under the second (new style). Adaptation is beneficial when the test set is large enough (even with only ten samples of each class by one writer in a 100-dimensional feature space), but style-conscious classification is the only option with fields of only two or three digits.
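The core idea of the abstract can be illustrated with a minimal sketch: train a Gaussian quadratic classifier on one "style" of data, then adapt only the class means on an unlabeled test set by iteratively re-estimating each mean from the samples the classifier assigns to that class. This is an illustrative reconstruction, not the authors' exact procedure; all function names and the synthetic shifted-style data are assumptions for the example.

```python
# Sketch of unsupervised mean adaptation for a Gaussian quadratic
# classifier. Hypothetical implementation; not the paper's exact method.
import numpy as np

def qda_fit(X, y, n_classes):
    """Estimate per-class means and covariances from labeled training data."""
    means = np.array([X[y == c].mean(axis=0) for c in range(n_classes)])
    covs = np.array([np.cov(X[y == c], rowvar=False) for c in range(n_classes)])
    return means, covs

def qda_predict(X, means, covs):
    """Assign each sample to the class with the highest Gaussian log-likelihood."""
    scores = []
    for mu, cov in zip(means, covs):
        inv = np.linalg.inv(cov)
        diff = X - mu
        logdet = np.linalg.slogdet(cov)[1]
        # Quadratic discriminant: -0.5 * (Mahalanobis distance + log|cov|)
        scores.append(-0.5 * (np.einsum('ij,jk,ik->i', diff, inv, diff) + logdet))
    return np.argmax(np.stack(scores, axis=1), axis=1)

def adapt_means(X_test, means, covs, n_iter=5):
    """Re-estimate class means from the classifier's own labels on the
    unlabeled test set, keeping covariances fixed (decision-directed)."""
    means = means.copy()
    for _ in range(n_iter):
        labels = qda_predict(X_test, means, covs)
        for c in range(len(means)):
            if np.any(labels == c):
                means[c] = X_test[labels == c].mean(axis=0)
    return means

# Demo: the test "style" is a shifted version of the training distribution.
rng = np.random.default_rng(0)
n = 300
X_train = np.vstack([rng.normal([0, 0], 1.0, (n, 2)),
                     rng.normal([4, 4], 1.0, (n, 2))])
y_train = np.array([0] * n + [1] * n)
means, covs = qda_fit(X_train, y_train, 2)

shift = np.array([1.5, 1.5])  # style change unseen in training
X_test = np.vstack([rng.normal([0, 0], 1.0, (n, 2)),
                    rng.normal([4, 4], 1.0, (n, 2))]) + shift
y_test = np.array([0] * n + [1] * n)

base_acc = (qda_predict(X_test, means, covs) == y_test).mean()
adapted = adapt_means(X_test, means, covs)
adapted_acc = (qda_predict(X_test, adapted, covs) == y_test).mean()
```

In this setup the unadapted classifier misplaces its decision boundary because the test means have shifted, while the adapted means track the new style, which mirrors the abstract's claim that estimating the means alone recovers much of the lost accuracy.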

Keywords

Error Count · Maximum Likelihood Linear Regression · Handwritten Numeral · Hierarchical Bayesian Approach · Adaptive Methodology

Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Sriharsha Veeramachaneni¹
  • George Nagy¹

  1. Rensselaer Polytechnic Institute, Troy, USA
