
Speeding-Up Graph-Based Keyword Spotting in Historical Handwritten Documents

  • Michael Stauffer
  • Andreas Fischer
  • Kaspar Riesen
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10310)

Abstract

The present paper is concerned with a graph-based system for Keyword Spotting (KWS) in historical documents. This particular system operates on segmented words that are in turn represented as graphs. The basic KWS process employs the cubic-time bipartite matching algorithm (BP). Yet, even though this graph matching procedure is relatively efficient, the computation time is a limiting factor for processing large volumes of historical manuscripts. In order to speed up our framework, we propose a novel fast rejection heuristic. This heuristic compares the node distribution of the query graph and the document graph in a polar coordinate system. This comparison can be accomplished in linear time. If the node distributions are similar enough, the BP matching is actually carried out (otherwise the document graph is rejected). In an experimental evaluation on two benchmark datasets we show that about 50% or more of the matchings can be omitted with this procedure while the KWS accuracy is not negatively affected.
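The rejection step described above can be illustrated with a minimal sketch. The abstract only states that node distributions are compared in a polar coordinate system in linear time; everything else here is an assumption made for illustration, not the authors' implementation: the centroid as origin, the bin counts, the L1 histogram distance, the threshold value, and all function names (`polar_histogram`, `histogram_distance`, `passes_fast_rejection`) are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (x, y) position of a graph node


def polar_histogram(nodes: Sequence[Point],
                    n_radial: int = 3,
                    n_angular: int = 8) -> List[float]:
    """Normalised histogram of node positions in polar coordinates.

    Assumption: the origin is the centroid of the nodes and radii are
    scaled by the largest radius. Runs in time linear in len(nodes).
    """
    hist = [0.0] * (n_radial * n_angular)
    n = len(nodes)
    if n == 0:
        return hist
    cx = sum(x for x, _ in nodes) / n
    cy = sum(y for _, y in nodes) / n
    polar = [(math.hypot(x - cx, y - cy), math.atan2(y - cy, x - cx))
             for x, y in nodes]
    r_max = max(r for r, _ in polar) or 1.0  # avoid division by zero
    for r, theta in polar:
        r_bin = min(int(r / r_max * n_radial), n_radial - 1)
        a_bin = min(int((theta + math.pi) / (2 * math.pi) * n_angular),
                    n_angular - 1)
        hist[r_bin * n_angular + a_bin] += 1.0 / n
    return hist


def histogram_distance(h1: Sequence[float], h2: Sequence[float]) -> float:
    """L1 distance between two histograms (an assumed choice of metric)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def passes_fast_rejection(query_nodes: Sequence[Point],
                          doc_nodes: Sequence[Point],
                          threshold: float = 0.5) -> bool:
    """Return True if the pair is similar enough to justify full matching.

    The threshold is a hypothetical value; in practice it would be tuned
    on validation data.
    """
    d = histogram_distance(polar_histogram(query_nodes),
                           polar_histogram(doc_nodes))
    return d <= threshold


# Usage: only pairs that pass the filter would be handed to the
# cubic-time BP matching (not shown here).
query = [(-2, 0), (-1, 0), (1, 0), (2, 0)]   # nodes along a horizontal line
doc_a = [(-2, 0), (-1, 0), (1, 0), (2, 0)]   # same layout
doc_b = [(0, -2), (0, -1), (0, 1), (0, 2)]   # nodes along a vertical line
candidates = [g for g in (doc_a, doc_b) if passes_fast_rejection(query, g)]
```

With a fixed number of bins, building each histogram takes one pass over the nodes and the comparison takes constant time, so the filter as a whole is linear in the graph size, in contrast to the cubic-time BP matching it guards.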

Keywords

Handwritten keyword spotting · Bipartite graph matching · Fast rejection · Filtering graph matching

Notes

Acknowledgments

This work has been supported by the Hasler Foundation Switzerland.

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Michael Stauffer (1, 4)
  • Andreas Fischer (2, 3)
  • Kaspar Riesen (1)

  1. Institute for Information Systems, University of Applied Sciences and Arts Northwestern Switzerland, Olten, Switzerland
  2. Department of Informatics, University of Fribourg, Fribourg, Switzerland
  3. Institute for Complex Systems, University of Applied Sciences and Arts Western Switzerland, Fribourg, Switzerland
  4. Department of Informatics, University of Pretoria, Pretoria, South Africa
