Building cameras for capturing documents

  • Stephen Pollard
  • Maurizio Pilu


This paper explores those aspects of document capture that are specific to cameras. Each of them must be addressed to close the gap between taking a photograph of a document and capturing the document itself. We present results in five areas: (1) framing documents using structured light, (2) robustly dealing with ambient illumination when capturing glossy documents, (3) improving text quality when using mosaiced color sensors, (4) robustly and passively recovering perspective and image plane skew using text flow, and (5) measuring and undoing page curl using structured light and an applicable surface model. The success of subsequent document recognition depends heavily on how well these tasks are completed.
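As a rough illustration of the geometry behind area (4), the sketch below shows why perspective can in principle be estimated from text: lines of text that are parallel on the page converge toward a vanishing point in the image. This is not the authors' text-flow method, only a minimal homogeneous-coordinate example. The function names (`cross`, `line_through`, `vanishing_point`) and the sample baseline coordinates are assumptions made for illustration.

```python
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]
Point = Tuple[float, float]


def cross(a: Vec3, b: Vec3) -> Vec3:
    """Cross product of two 3-vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def line_through(p: Point, q: Point) -> Vec3:
    """Homogeneous line (a, b, c), with a*x + b*y + c = 0, through two image points."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))


def vanishing_point(line_a: Vec3, line_b: Vec3, eps: float = 1e-12) -> Optional[Point]:
    """Intersection of two image lines, or None if they are parallel in the image."""
    x, y, w = cross(line_a, line_b)
    if abs(w) < eps:
        # Parallel in the image: the vanishing point lies at infinity,
        # which corresponds to no perspective convergence in this direction.
        return None
    return (x / w, y / w)


if __name__ == "__main__":
    # Hypothetical baselines of two text lines in a perspectively skewed image.
    upper = line_through((0, 10), (10, 9))
    lower = line_through((0, -10), (10, -9))
    print(vanishing_point(upper, lower))  # (100.0, 0.0)
```

With many detected text lines, a robust estimate of the common intersection would be needed rather than a single pairwise intersection; that step is beyond this sketch.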


Keywords: Color Image Processing, Pattern Recognition, Image Plane, Surface Model





Copyright information

© Springer-Verlag Berlin/Heidelberg 2005

Authors and Affiliations

  1. Hewlett-Packard Laboratories, Bristol, UK
