Graph-Based Keyword Spotting in Historical Handwritten Documents

  • Michael Stauffer
  • Andreas Fischer
  • Kaspar Riesen
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10029)

Abstract

The number of handwritten documents available in digital form is rapidly increasing. However, these documents remain difficult to access, especially with respect to searching and browsing. This paper aims to close this gap by means of a novel method for keyword spotting in historical handwritten documents. The proposed system relies on a keypoint-based graph representation of individual words. Keypoints are characteristic points in a word image and are represented by nodes, while edges represent the strokes between two keypoints. Keyword spotting itself is then carried out with a recent approximation algorithm for graph edit distance. The novel framework for graph-based keyword spotting is evaluated on the George Washington dataset, where it clearly outperforms a state-of-the-art reference system.
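As a rough illustration of the idea described in the abstract, the following Python sketch represents a word as a graph of keypoints and ranks candidate words by a bipartite-style approximation of graph edit distance. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the cost parameters `tau_node` and `tau_edge`, the use of node degree as a stand-in for local edge structure, and the brute-force assignment solver (which replaces a proper assignment algorithm such as Munkres' and is only viable for very small graphs) are all assumptions made for illustration. The returned value is the cost of the best node assignment, used here directly as a dissimilarity score.

```python
from itertools import permutations
from math import hypot

# A word graph is a pair (nodes, edges):
#   nodes: list of keypoint coordinates (x, y)
#   edges: list of index pairs (u, v), one per stroke between two keypoints

def node_degrees(edges, n):
    """Number of strokes attached to each of the n keypoints."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def bipartite_ged(nodes1, edges1, nodes2, edges2, tau_node=1.0, tau_edge=1.0):
    """Approximate graph dissimilarity via an optimal node assignment.

    Assumption: tau_node and tau_edge are hypothetical insertion/deletion
    costs; local edge structure is approximated by the node degree.
    """
    n, m = len(nodes1), len(nodes2)
    deg1, deg2 = node_degrees(edges1, n), node_degrees(edges2, m)
    size = n + m
    inf = float("inf")
    cost = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(size):
            if i < n and j < m:
                # Substitution: keypoint distance plus degree mismatch.
                (x1, y1), (x2, y2) = nodes1[i], nodes2[j]
                cost[i][j] = hypot(x1 - x2, y1 - y2) + tau_edge * abs(deg1[i] - deg2[j])
            elif i < n:
                # Deletion of node i (only allowed in its own column).
                cost[i][j] = tau_node + tau_edge * deg1[i] if j - m == i else inf
            elif j < m:
                # Insertion of node j (only allowed in its own row).
                cost[i][j] = tau_node + tau_edge * deg2[j] if i - n == j else inf
            # Remaining entries (empty-to-empty) keep cost 0.0.
    # Brute-force assignment: stands in for the Hungarian/Munkres algorithm.
    return min(sum(cost[i][p[i]] for i in range(size))
               for p in permutations(range(size)))

def rank_words(query, candidates):
    """Return candidate labels sorted by increasing distance to the query graph."""
    scored = [(bipartite_ged(*query, *graph), label)
              for label, graph in candidates.items()]
    return [label for _, label in sorted(scored)]

# Toy example: a two-keypoint stroke as the query, two candidate word graphs.
stroke = ([(0.0, 0.0), (1.0, 0.0)], [(0, 1)])
dot = ([(0.0, 0.0)], [])
ranking = rank_words(stroke, {"dot": dot, "stroke": stroke})
```

In this toy setting the candidate identical to the query is ranked first. A practical system would extract the keypoints from binarised, skeletonised word images and would normalise the distance, for example by graph size, before ranking; those steps are outside the scope of this sketch.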

Keywords

  • Handwritten keyword spotting
  • Bipartite graph matching
  • Graph representation for words

Notes

Acknowledgments

This work has been supported by the Hasler Foundation Switzerland.

Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Michael Stauffer (1, 3)
  • Andreas Fischer (2)
  • Kaspar Riesen (1)

  1. Institute for Information Systems, University of Applied Sciences and Arts Northwestern Switzerland, Olten, Switzerland
  2. University of Fribourg and HES-SO, Fribourg, Switzerland
  3. Department of Informatics, University of Pretoria, Pretoria, South Africa
