Segmentation-Driven Recognition Applied to Numerical Field Extraction from Handwritten Incoming Mail Documents

  • Clément Chatelain
  • Laurent Heutte
  • Thierry Paquet
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3872)


In this paper, we present a method for the automatic extraction of numerical fields (ZIP codes, phone numbers, etc.) from incoming mail documents. The approach relies on segmentation-driven recognition to locate isolated and touching digits among the textual information. A syntactical analysis is then performed on each line of text to retain only the sequences that respect a syntax known to the system (number of digits, presence of separators). We evaluate the performance of the system by means of the recall-precision trade-off on a database of real incoming mail documents.
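As an illustration only, the syntactical filtering step and the recall-precision measure might be sketched as follows. This is not the authors' implementation: the label alphabet (D for a digit, S for a separator, T for any other text component), the two field syntaxes, and the names `FIELD_SYNTAX`, `extract_fields` and `recall_precision` are all assumptions made for the example. The sketch also assumes that each text line has already been reduced to one label per recognized component.

```python
import re

# Hypothetical field syntaxes over an assumed label alphabet:
# D = digit, S = separator, T = any other text component.
# The lookarounds stop a field from matching inside a longer digit run.
FIELD_SYNTAX = {
    "zip_code": re.compile(r"(?<!D)D{5}(?!D)"),
    "phone_number": re.compile(r"(?<!D)DD(?:S?DD){4}(?!D)"),
}


def extract_fields(labels):
    """Return (field_name, start, end) for each span of a label string
    that respects one of the known syntaxes, ordered by start position."""
    found = []
    for name, pattern in FIELD_SYNTAX.items():
        for match in pattern.finditer(labels):
            found.append((name, match.start(), match.end()))
    return sorted(found, key=lambda field: field[1])


def recall_precision(extracted, truth):
    """Recall and precision of extracted fields against ground-truth fields."""
    extracted, truth = set(extracted), set(truth)
    hits = len(extracted & truth)
    recall = hits / len(truth) if truth else 0.0
    precision = hits / len(extracted) if extracted else 0.0
    return recall, precision


if __name__ == "__main__":
    # One line of text: two words, a phone number written in pairs, one word.
    print(extract_fields("TTDDSDDSDDSDDSDDT"))
```

In a real system each component would carry a recognition confidence, and varying a threshold on that confidence is one way to move along the recall-precision trade-off; the sketch leaves that part out.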


Keywords: Phone Number; Text Line; Handwritten Document; Double Digit; Recognition Hypothesis



Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Clément Chatelain (1)
  • Laurent Heutte (1)
  • Thierry Paquet (1)
  1. Laboratoire PSI, CNRS FRE 2645, Université de Rouen, Saint Etienne du Rouvray, France
