Abstract
In this paper, we present a method for automatically extracting numerical fields (ZIP codes, phone numbers, etc.) from incoming mail documents. The approach relies on segmentation-driven recognition, which locates isolated and touching digits within the textual information. A syntactic analysis is then performed on each line of text to retain only the sequences that conform to a syntax known to the system (number of digits, presence of separators). We evaluate the system by its recall-precision trade-off on a database of real incoming mail documents.
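As a rough illustration of the syntactic filtering and the recall-precision evaluation described in the abstract, the following minimal sketch keeps only the character sequences of a text line that match a known field syntax, then scores the result against labelled fields. It is not the authors' implementation: it assumes the line has already been recognized as a plain string, and the names `FIELD_SYNTAXES`, `extract_fields`, and `recall_precision`, as well as the two field syntaxes (a 5-digit ZIP code and a 10-digit phone number written as digit pairs), are hypothetical choices made for the example.

```python
import re

# Hypothetical field syntaxes (assumptions for illustration, not taken from the paper):
# a 5-digit ZIP code, and a 10-digit phone number written as five digit pairs
# optionally separated by a space, dot, or dash.
FIELD_SYNTAXES = {
    "zip_code": re.compile(r"(?<!\d)\d{5}(?!\d)"),
    "phone": re.compile(r"(?<!\d)\d{2}(?:[ .-]?\d{2}){4}(?!\d)"),
}


def extract_fields(line):
    """Return (field_type, matched_text) pairs found in one recognized text line."""
    found = []
    for name, pattern in FIELD_SYNTAXES.items():
        for match in pattern.finditer(line):
            found.append((name, match.group()))
    return found


def recall_precision(extracted, ground_truth):
    """Compute (recall, precision) of extracted fields against labelled fields."""
    extracted, ground_truth = set(extracted), set(ground_truth)
    correct = len(extracted & ground_truth)
    recall = correct / len(ground_truth) if ground_truth else 0.0
    precision = correct / len(extracted) if extracted else 0.0
    return recall, precision


if __name__ == "__main__":
    fields = extract_fields("Tel 02 35 14 67 65 at 76800 Rouen")
    print(fields)
    print(recall_precision(fields, [("zip_code", "76800")]))
```

In this sketch, a stricter syntax (for instance, requiring separators) would typically raise precision at the cost of recall, which is the kind of trade-off the evaluation measures.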
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chatelain, C., Heutte, L., Paquet, T. (2006). Segmentation-Driven Recognition Applied to Numerical Field Extraction from Handwritten Incoming Mail Documents. In: Bunke, H., Spitz, A.L. (eds) Document Analysis Systems VII. DAS 2006. Lecture Notes in Computer Science, vol 3872. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11669487_50
DOI: https://doi.org/10.1007/11669487_50
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32140-8
Online ISBN: 978-3-540-32157-6
eBook Packages: Computer Science (R0)