smartFIX: A Requirements-Driven System for Document Analysis and Understanding

  • Andreas R. Dengel
  • Bertin Klein
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2423)

Abstract

Although the internet offers a widespread platform for information interchange, routine work in large companies still means processing tens of thousands of printed documents every day. This paper presents smartFIX, a document analysis and understanding system developed by the DFKI spin-off INSIDERS. It can process documents ranging from fixed-format forms to unstructured letters of any format. In addition to the architecture, the main components, and the system characteristics, we also report some results from applying smartFIX to medical bills and prescriptions.

Keywords

Regular Expression, Information Extraction, Document Image, Logical Object, Medical Bill
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Andreas R. Dengel (1)
  • Bertin Klein (1)
  1. German Research Center for Artificial Intelligence (DFKI), Kaiserslautern, Germany
