
Graphical tools and techniques for querying document image databases

  • J. Sauvola
  • D. Doermann
  • H. Kauniskangas
  • C. Shin
  • M. Koivusaari
  • M. Pietikäinen
Oral Presentations B. Document Processing and Retrieval
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1339)

Abstract

This paper describes document models and relations for the retrieval of document images. The underlying methodology was developed for the Intelligent Document Image Retrieval System (IDIR), which aims to extend document image database query capabilities. Traditional component-type and keyword features are insufficient for describing the logical and structural aspects of a document's nature.

We have developed the object-oriented document models needed to carry out complex multi-domain retrieval scenarios. In this paper we focus on the retrieval capabilities and the underlying methodology that supports different query schemes (QSs). New graphical techniques are introduced for these models and query schemes. The IDIR allows complex combinations of different QSs, using the extended concept of `frame logic', developed in our earlier work for attribute management. Furthermore, a concept of document similarity is introduced in relation to QSs, document models, structure and role of use. Examples are shown for the retrieval of document images from the University of Washington database.

Keywords

Document image retrieval, document image query, query scheme, document similarity



Copyright information

© Springer-Verlag Berlin Heidelberg 1997

Authors and Affiliations

  • J. Sauvola (2)
  • D. Doermann (1)
  • H. Kauniskangas (2)
  • C. Shin (1)
  • M. Koivusaari (2)
  • M. Pietikäinen (2)
  1. Language and Media Processing Lab., Center for Automation Research, University of Maryland, College Park
  2. Media Processing Team, Machine Vision and Media Processing Group, Infotech Oulu, University of Oulu, Oulu, Finland
