The State of the Art of Document Image Degradation Modelling

Baird, Henry S.

doi:10.1007/978-1-84628-726-8_12

Part of the book series: Advances in Pattern Recognition (ACVPR)

Author information

Authors and Affiliations

Computer Science and Engineering Department, Lehigh University, 19 Memorial Drive West, Bethlehem, PA 18015, USA
Henry S. Baird

Editor information

Editors and Affiliations

Indian Statistical Institute, Kolkata, India
Bidyut B. Chaudhuri PhD

About this chapter

Cite this chapter

Baird, H.S. (2007). The State of the Art of Document Image Degradation Modelling. In: Chaudhuri, B.B. (eds) Digital Document Processing. Advances in Pattern Recognition. Springer, London. https://doi.org/10.1007/978-1-84628-726-8_12

DOI: https://doi.org/10.1007/978-1-84628-726-8_12
Publisher Name: Springer, London
Print ISBN: 978-1-84628-501-1
Online ISBN: 978-1-84628-726-8
eBook Packages: Computer Science, Computer Science (R0)
