Use of Affine Invariants in Locally Likely Arrangement Hashing for Camera-Based Document Image Retrieval

  • Tomohiro Nakai
  • Koichi Kise
  • Masakazu Iwamura
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3872)


Camera-based document image retrieval is the task of searching a database of document images using query images captured with a digital camera. The task requires both handling the "perspective distortion" of the captured images and matching document images efficiently. To address these problems we have proposed a method called Locally Likely Arrangement Hashing (LLAH), which uses a perspective invariant to cope with the distortion and is efficient: LLAH requires only O(N) time, where N is the number of feature points that describe the query image. In this paper, we introduce an affine invariant into LLAH in place of the perspective invariant so as to improve its adjustability. Experimental results show that, depending on the choice of processing parameters, the affine invariant improves either the accuracy from 96.2% to 97.8% or the retrieval time from 112 ms/query to 75 ms/query.
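As an illustration of what an affine invariant is, the sketch below uses one standard example: for four coplanar points A, B, C, D, the ratio of the triangle areas P(A,C,D) / P(A,B,C) is unchanged by any non-degenerate affine transformation. The abstract does not state which invariant or which point combinations LLAH uses, so the function names, the sample points, and the transformation here are assumptions chosen only for demonstration. In a hashing scheme such values would typically be discretized to form a table index; that step is omitted.

```python
def triangle_area(p, q, r):
    """Signed area of triangle pqr (positive when p, q, r are counter-clockwise)."""
    return 0.5 * ((q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0]))


def affine_invariant(a, b, c, d):
    """Area ratio P(a,c,d) / P(a,b,c) for four coplanar points.

    Assumes a, b, c are not collinear, so the denominator is non-zero.
    """
    return triangle_area(a, c, d) / triangle_area(a, b, c)


def apply_affine(pt, m, t):
    """Map pt through x' = m * x + t, with m a 2x2 matrix and t a translation."""
    x, y = pt
    return (m[0][0] * x + m[0][1] * y + t[0],
            m[1][0] * x + m[1][1] * y + t[1])


if __name__ == "__main__":
    # Hypothetical feature points (e.g. word centroids) on a document page.
    points = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 2.0)]

    # Arbitrary affine map with non-zero determinant (illustrative values).
    m = ((2.0, 1.0), (0.5, 3.0))
    t = (5.0, -7.0)
    warped = [apply_affine(p, m, t) for p in points]

    print("before:", affine_invariant(*points))
    print("after: ", affine_invariant(*warped))
```

Both printed values agree up to floating-point error, because an affine map scales every triangle area by the same factor (the determinant of m), which cancels in the ratio. A perspective (projective) map does not scale areas uniformly, which is why an affine invariant only approximates the camera distortion locally.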


Keywords (added by machine, not by the authors): Feature Point; Hash Table; Query Image; Document Image; Discrimination Power



Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Tomohiro Nakai (1)
  • Koichi Kise (1)
  • Masakazu Iwamura (1)
  1. Graduate School of Engineering, Osaka Prefecture University, Osaka, Japan
