Document Analysis System for Automating Workflows
When a user places a document in a capture device (a copier, multi-functional printer [MFP], or scanner), the user expects good output regardless of the document type. Output can be improved in several ways, each of which tunes the settings on the copying device to the content characteristics of the document. These settings can be automated across the range of scanned content, from photo documents (blurring, no snapping) at one extreme to fully text documents (sharpening, aggressive snapping) at the other. This procedure, “document auto typing”, relies on a fast and accurate assessment of the content of the captured image. We describe the development of seven distinct systems for document analysis and, by comparing them, arrive at an efficient and accurate document analysis system for automating the copying settings. We also discuss the applicability of this method to other automated workflows in document capture.
Keywords: Optical Character Recognition, Black Pixel, Solid Region, Projection Profile, Document Analysis System
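As a rough illustration of the auto-typing idea in the abstract, the sketch below scores a binarized page with a horizontal projection profile and maps that score onto copy settings between the photo and text extremes. It is not one of the seven systems the authors compare. The function names, the `rows_per_switch` normalization, and the linear mapping to `sharpen` and `snap` are all hypothetical choices made for this example.

```python
def row_profile(image):
    """Horizontal projection profile: black-pixel count per row.

    `image` is a list of rows, each a list of 0 (white) or 1 (black).
    """
    return [sum(row) for row in image]


def text_likeness(image, rows_per_switch=4):
    """Crude score in [0, 1] for how text-like a page is.

    Text lines separated by white gaps make the profile switch often
    between inked and blank rows; a photo rarely produces blank rows.
    `rows_per_switch` is an assumed normalization, not a published value.
    """
    inked = [count > 0 for count in row_profile(image)]
    switches = sum(1 for a, b in zip(inked, inked[1:]) if a != b)
    expected = max(1.0, len(inked) / rows_per_switch)
    return min(1.0, switches / expected)


def copy_settings(score):
    """Map a score to hypothetical copy settings.

    sharpen: -1.0 (blur, photo extreme) to +1.0 (sharpen, text extreme)
    snap:     0.0 (no snapping) to 1.0 (aggressive snapping)
    """
    return {"sharpen": 2.0 * score - 1.0, "snap": score}
```

For example, `copy_settings(text_likeness(page))` returns settings near the blur end for a page whose rows are all inked, and near the sharpen end for a page with regularly spaced blank rows. A production system would need a far more robust assessment than this single feature.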