Speeding-Up Graph-Based Keyword Spotting in Historical Handwritten Documents
The present paper is concerned with a graph-based system for Keyword Spotting (KWS) in historical documents. This particular system operates on segmented words that are in turn represented as graphs. The basic KWS process employs the cubic-time bipartite matching algorithm (BP). Yet, even though this graph matching procedure is relatively efficient, the computation time is a limiting factor for processing large volumes of historical manuscripts. In order to speed up our framework, we propose a novel fast rejection heuristic. This heuristic compares the node distribution of the query graph and the document graph in a polar coordinate system. This comparison can be accomplished in linear time. If the node distributions are similar enough, the BP matching is actually carried out (otherwise the document graph is rejected). In an experimental evaluation on two benchmark datasets we show that about 50% or more of the matchings can be omitted with this procedure while the KWS accuracy is not negatively affected.
KeywordsHandwritten keyword spotting Bipartite graph matching Fast rejection Filtering graph matching
This work has been supported by the Hasler Foundation Switzerland.
- 1.Manmatha, R., Han, C., Riseman, E.: Word spotting: a new approach to indexing handwriting. In: Computer Vision and Pattern Recognition, pp. 631–637 (1996)Google Scholar
- 2.Rath, T., Manmatha, R.: Word image matching using dynamic time warping. In: Computer Vision and Pattern Recognition, vol. 2, pp. II-521–II-527 (2003)Google Scholar
- 5.Rodriguez, J.A., Perronnin, F.: Local gradient histogram features for word spotting in unconstrained handwritten documents. In: International Conference on Frontiers in Handwriting Recognition, pp. 7–12 (2008)Google Scholar
- 7.Perronnin, F., Rodriguez-Serrano, J.A.: Fisher kernels for handwritten word-spotting. In: International Conference on Document Analysis and Recognition, pp. 106–110 (2009)Google Scholar
- 9.Riesen, K.: Structural pattern recognition with graph edit distance. Advances in Computer Vision and Pattern Recognition, Cham (2015)Google Scholar
- 10.Wang, P., Eglin, V., Garcia, C., Largeron, C., Llados, J., Fornes, A.: A novel learning-free word spotting approach based on graph representation. In: International Workshop on Document Analysis Systems, pp. 207–211 (2014)Google Scholar
- 11.Bui, Q.A., Visani, M., Mullot, R.: Unsupervised word spotting using a graph representation based on invariants. In: International Conference on Document Analysis and Recognition, pp. 616–620 (2015)Google Scholar
- 12.Riba, P., Llados, J., Fornes, A.: Handwritten word spotting by inexact matching of grapheme graphs. In: International Conference on Document Analysis and Recognition, pp. 781–785 (2015)Google Scholar
- 13.Stauffer, M., Fischer, A., Riesen, K.: Graph-based keyword spotting in historical handwritten documents. In: International Workshop on Structural, Syntactic, and Statistical Pattern Recognition (2016)Google Scholar