Eyes Wide Open: an interactive learning method for the design of rule-based systems

  • Cérès Carton
  • Aurélie Lemaitre
  • Bertrand Coüasnon
Original Paper

Abstract

We present the Eyes Wide Open (EWO) method, a new general method for designing rule-based document recognition systems. Our contribution is a learning procedure that combines machine learning techniques with user interaction to design the recognition system. Unlike many manually designed approaches, ours can therefore adapt easily to a new type of document while retaining the expressiveness of rule-based systems and their ability to convey the hierarchical structure of a document. The EWO method is independent of any existing recognition system. An automatic analysis of an annotated corpus, guided by the user, helps adapt the recognition system to a new kind of document; the user then gives meaning to the automatically extracted information. We validate EWO by producing two rule-based systems: one for the Maurdor international competition, on a heterogeneous corpus of handwritten and printed documents written in different languages, and one for the RIMES competition corpus, a homogeneous corpus of French handwritten business letters. On the RIMES corpus, our method allows an assisted design of a grammatical description that gives better results than all the previously proposed statistical systems.

Keywords

  • Document layout analysis
  • Rule inference
  • Rule-based system
  • Clustering

Copyright information

© Springer-Verlag Berlin Heidelberg 2017

Authors and Affiliations

  • Cérès Carton (1)
  • Aurélie Lemaitre (2)
  • Bertrand Coüasnon (1)

  1. IRISA-INSA, Rennes, France
  2. IRISA-Université de Rennes 2, Rennes, France
