Fourier–Mellin registration of line-delineated tabular document images

  • Luke A. D. Hutchison
  • William A. Barrett
Original Paper


Image registration (or alignment) is a useful preprocessing step both for assisting manual data extraction from handwritten forms and for preparing documents for batch OCR of specific page regions. A new technique is presented for fast registration of lined tabular document images in the presence of a global affine transformation, using the Discrete Fourier–Mellin Transform (DFMT). Each component of the affine transform is handled separately, which dramatically reduces the total parameter space of the problem. The method is robust and, by working in the frequency domain, deals with all components of the affine transform in a uniform way. The DFMT is extended to handle shear, which can approximate a small amount of perspective distortion. To limit registration to foreground pixels only, and to eliminate Fourier edge effects, a novel, locally adaptive foreground-background segmentation algorithm based on the median filter is introduced; it eliminates the need for the Blackman windowing usually required by DFMT image registration. A novel information-theoretic optimization of the median filter is presented. An original method is also demonstrated for automatically obtaining blank document templates from a set of registered document images.
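The frequency-domain building block behind Fourier–Mellin registration is phase correlation, which recovers a translation from the phase of the cross-power spectrum. The following is a minimal sketch of that one step, not the authors' implementation: it assumes NumPy, two equally sized grayscale arrays, and a purely cyclic integer shift. The function name phase_correlation_shift and its parameters are hypothetical. In a Fourier–Mellin pipeline the same step is typically applied to log-polar resampled magnitude spectra to recover rotation and scale; the shear extension and median-filter segmentation described above are not shown.

```python
import numpy as np


def phase_correlation_shift(reference, moved):
    """Estimate the integer (row, col) cyclic shift that maps `reference` onto `moved`.

    Hypothetical helper for illustration; assumes both inputs are 2-D arrays
    of the same shape and that the displacement is a pure translation.
    """
    f_ref = np.fft.fft2(reference)
    f_mov = np.fft.fft2(moved)

    # Cross-power spectrum: keep only the phase difference between the images.
    cross_power = f_mov * np.conj(f_ref)
    cross_power /= np.maximum(np.abs(cross_power), 1e-12)  # avoid division by zero

    # The inverse transform of a pure phase ramp is an impulse at the shift.
    correlation = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)

    # Peaks past the midpoint correspond to negative shifts (FFT wrap-around).
    return tuple(
        int(p) if p <= n // 2 else int(p) - n
        for p, n in zip(peak, correlation.shape)
    )


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    page = rng.random((64, 64))                   # stand-in for a scanned page
    shifted = np.roll(page, (5, -9), axis=(0, 1))  # known cyclic translation
    print(phase_correlation_shift(page, shifted))  # prints (5, -9)
```

Because the spectrum is normalized to unit magnitude, the correlation peak stays sharp regardless of image contrast. Real scans are not cyclically shifted, which is one reason edge effects have to be suppressed before this step.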


Document image registration, Deskewing, Background removal, Fourier–Mellin transform, Transformation reversal, Locally-adaptive thresholding, Microfilm processing, Tabular document images, Family history technology



Copyright information

© Springer-Verlag 2005

Authors and Affiliations

  1. Department of Computer Science, Brigham Young University, Provo, USA
