The State of the Art of Document Image Degradation Modelling

  • Chapter
Digital Document Processing

Part of the book series: Advances in Pattern Recognition ((ACVPR))

Copyright information

© 2007 Springer-Verlag London Limited

About this chapter

Cite this chapter

Baird, H.S. (2007). The State of the Art of Document Image Degradation Modelling. In: Chaudhuri, B.B. (ed.) Digital Document Processing. Advances in Pattern Recognition. Springer, London. https://doi.org/10.1007/978-1-84628-726-8_12

  • DOI: https://doi.org/10.1007/978-1-84628-726-8_12

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-84628-501-1

  • Online ISBN: 978-1-84628-726-8

  • eBook Packages: Computer Science, Computer Science (R0)
