Abstract
This paper presents an efficient indexing and retrieval scheme for searching document image databases. For many non-European languages, optical character recognizers are not very accurate. Word spotting (word image matching) may instead be used to retrieve word images in response to a word image query. The approaches used for word spotting so far, dynamic time warping and/or nearest neighbor search, tend to be slow. Here, indexing is done using locality sensitive hashing (LSH), a technique which computes multiple hashes, using word image features computed at the word level. Efficiency and scalability are achieved by content-sensitive hashing implemented through approximate nearest neighbor computation. We demonstrate that the technique achieves high precision and recall (in the 90% range) on a large image corpus consisting of seven books by Kalidasa (a well-known Indian poet of antiquity) in the Telugu language. The accuracy is comparable to that of dynamic time warping and nearest neighbor search, while the speed is orders of magnitude better: 20000 word images can be searched in milliseconds.
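To make the idea of indexing with multiple hashes concrete, the following is a minimal sketch of an LSH index, not the implementation used in the paper. It assumes random-hyperplane (sign) hashing as the hash family, which the abstract does not specify, and the class name `LSHIndex`, its parameters, and the short feature vectors are all hypothetical stand-ins for word-level image features. Each table hashes a vector with its own set of hyperplanes; a query collects the items that share a bucket in any table and ranks only those candidates by exact distance.

```python
import random
from collections import defaultdict


class LSHIndex:
    """Toy LSH index (hypothetical): random-hyperplane signatures, several tables."""

    def __init__(self, dim, num_tables=8, bits_per_table=12, seed=0):
        rng = random.Random(seed)
        # One independent set of random hyperplanes per table
        # (assumed hash family; the abstract does not name one).
        self.planes = [
            [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits_per_table)]
            for _ in range(num_tables)
        ]
        self.tables = [defaultdict(list) for _ in range(num_tables)]
        self.vectors = {}

    def _key(self, table_id, vec):
        # Each bit records which side of a hyperplane the vector falls on.
        return tuple(
            sum(p * v for p, v in zip(plane, vec)) >= 0.0
            for plane in self.planes[table_id]
        )

    def add(self, item_id, vec):
        # Store the vector and file its id under one bucket per table.
        self.vectors[item_id] = vec
        for t, table in enumerate(self.tables):
            table[self._key(t, vec)].append(item_id)

    def query(self, vec, k=5):
        # Collect candidates that share a bucket with the query in any table.
        candidates = set()
        for t, table in enumerate(self.tables):
            candidates.update(table.get(self._key(t, vec), []))

        # Rank only the candidates by exact squared Euclidean distance.
        def dist(item_id):
            return sum((a - b) ** 2 for a, b in zip(self.vectors[item_id], vec))

        return sorted(candidates, key=dist)[:k]


# Usage: made-up 4-dimensional vectors standing in for word image features.
index = LSHIndex(dim=4, num_tables=4, bits_per_table=6, seed=1)
index.add("word_a", [1.0, 0.0, 0.0, 0.0])
index.add("word_b", [-1.0, 0.0, 0.0, 0.0])
nearest = index.query([1.0, 0.0, 0.0, 0.0], k=1)
```

Because a query only compares against the contents of a few buckets rather than the whole collection, lookup cost depends mainly on bucket sizes; more tables raise the chance of finding true neighbors at the cost of memory, and more bits per table make buckets smaller and more selective.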
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kumar, A., Jawahar, C.V., Manmatha, R. (2007). Efficient Search in Document Image Collections. In: Yagi, Y., Kang, S.B., Kweon, I.S., Zha, H. (eds) Computer Vision – ACCV 2007. ACCV 2007. Lecture Notes in Computer Science, vol 4843. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-76386-4_55
DOI: https://doi.org/10.1007/978-3-540-76386-4_55
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-76385-7
Online ISBN: 978-3-540-76386-4
eBook Packages: Computer Science, Computer Science (R0)