Data Representations in Learning
This paper examines how the coarseness (or fineness) of a data representation affects the achievable learning or recognition accuracy. Accuracy is quantified by the least probability of error in recognition, the Bayes error rate, for a finite-class pattern recognition problem. We study recognition accuracy as a function of resolution by modeling granularity variation in the representation as a refinement of the underlying probability structure of the data. Specifically, refining the data representation yields improved bounds on the probability of error, confirming the intuitive notion that more information can lead to better decision-making. The analysis extends to multiresolution methods, where both coarse-to-fine and fine-to-coarse variations in representation are possible.
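The direction of the bound can be illustrated with a small numerical sketch (not from the paper; the distributions below are invented for illustration): for a discrete observation space, merging cells of a fine partition into a coarser one can only raise the Bayes error rate.

```python
# Illustrative sketch: Bayes error under a fine vs. a coarsened
# discrete representation. All probabilities here are made up.

# Joint probabilities P(x, y) over 4 fine observation cells x
# and 2 classes y.
fine_joint = [
    (0.30, 0.05),  # cell 0
    (0.10, 0.15),  # cell 1
    (0.05, 0.20),  # cell 2
    (0.05, 0.10),  # cell 3
]

def bayes_error(joint):
    """Least probability of error: 1 - sum_x max_y P(x, y)."""
    return 1.0 - sum(max(row) for row in joint)

# Coarsen the representation by merging cells {0,1} and {2,3}.
coarse_joint = [
    tuple(a + b for a, b in zip(fine_joint[0], fine_joint[1])),
    tuple(a + b for a, b in zip(fine_joint[2], fine_joint[3])),
]

e_fine = bayes_error(fine_joint)      # 0.25
e_coarse = bayes_error(coarse_joint)  # 0.30; coarsening never helps
```

Here the refined representation achieves a strictly lower error (0.25 vs. 0.30); in general the inequality `e_fine <= e_coarse` always holds, since the coarse decision rule can be simulated from the fine observations.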
We also describe a general method for examining the effect of image resolution on recognizer performance. Empirical results are presented for an 840-class Japanese optical character recognition task. Performance improves considerably as resolution increases from 40 to 200 ppi, but the gains diminish at resolutions above 200 ppi. These results are useful in the design of optical character recognizers, and may also be relevant to human letter recognition studies, where such an objective evaluation of the task is required.
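One way to emulate such a resolution sweep, sketched below under our own assumptions (the paper does not specify its resampling procedure), is to block-average a high-resolution scan by integer factors, so that, for example, a 200 ppi image reduced by a factor of 5 approximates a 40 ppi scan:

```python
# Hypothetical sketch: block-average downsampling to emulate scanning
# the same glyph at lower resolutions for a resolution-vs-accuracy
# study. `image` is a 2D list of gray values.

def downsample(image, factor):
    """Average non-overlapping factor x factor blocks of the image."""
    h, w = len(image), len(image[0])
    out = []
    for i in range(0, h - h % factor, factor):
        row = []
        for j in range(0, w - w % factor, factor):
            block = [image[i + di][j + dj]
                     for di in range(factor) for dj in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out

# A synthetic 20x20 "scan"; reducing by factor 5 yields a 4x4 image.
img = [[(r * 4 + c) % 7 for c in range(20)] for r in range(20)]
low = downsample(img, 5)
```

A study along the lines the paper describes would run the recognizer on each downsampled version and record accuracy as a function of the resulting ppi.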
Keywords: Mutual Information, Error Probability, Recognition Accuracy, Data Representation, Character Recognition