A semi-automatic method for form layout description
This paper reports on recent research results in document processing. It describes a semi-automatic tool that produces a description of the fixed layout of a preprinted form. The description is built on a hierarchical collection of geometric primitives, which makes it robust and leaves it open to extended and richer representations in the future. The basic primitives are solid, dashed, dotted and text lines. The description is purely geometric, not semantic: a text line, for example, is described only by its position and dimensions, not by the content of the text. The module takes as input an empty form that has been deskewed to compensate for irregular orientations introduced by the acquisition process. The method is nevertheless very general and also allows the definition of more complex structures, obtained from arbitrary combinations of the basic primitives mentioned above. Many of these results have already been partially integrated into a document processing machine and are in use there.
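As an illustration of what a hierarchical, purely geometric layout description might look like, the following is a minimal Python sketch. It is not the data model used by the authors: the class names (`PrimitiveKind`, `Primitive`, `Structure`), the pixel coordinate convention and the example form are all assumptions made for this example. It only shows the two ideas stated above: primitives carry position and dimensions but no text content, and complex structures are arbitrary combinations of primitives or other structures.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple, Union

BBox = Tuple[int, int, int, int]  # (left, top, right, bottom), assumed pixel units


class PrimitiveKind(Enum):
    """The four basic primitives named in the abstract."""
    SOLID_LINE = "solid"
    DASHED_LINE = "dashed"
    DOTTED_LINE = "dotted"
    TEXT_LINE = "text"


@dataclass
class Primitive:
    """A geometric primitive: position and dimensions only, no semantic content."""
    kind: PrimitiveKind
    x: int
    y: int
    width: int
    height: int

    def bbox(self) -> BBox:
        return (self.x, self.y, self.x + self.width, self.y + self.height)


@dataclass
class Structure:
    """A named combination of primitives and/or nested structures (hypothetical)."""
    name: str
    children: List[Union[Primitive, "Structure"]] = field(default_factory=list)

    def bbox(self) -> BBox:
        # The bounding box of a structure is the union of its children's boxes.
        boxes = [child.bbox() for child in self.children]
        if not boxes:
            raise ValueError(f"structure {self.name!r} has no children")
        return (
            min(b[0] for b in boxes),
            min(b[1] for b in boxes),
            max(b[2] for b in boxes),
            max(b[3] for b in boxes),
        )

    def primitives(self) -> List[Primitive]:
        # Flatten the hierarchy into its leaf primitives, depth first.
        leaves: List[Primitive] = []
        for child in self.children:
            if isinstance(child, Primitive):
                leaves.append(child)
            else:
                leaves.extend(child.primitives())
        return leaves


# Example: a rectangular box built from four solid lines, with a text-line label.
box = Structure("box", [
    Primitive(PrimitiveKind.SOLID_LINE, 100, 50, 200, 1),   # top edge
    Primitive(PrimitiveKind.SOLID_LINE, 100, 80, 200, 1),   # bottom edge
    Primitive(PrimitiveKind.SOLID_LINE, 100, 50, 1, 31),    # left edge
    Primitive(PrimitiveKind.SOLID_LINE, 300, 50, 1, 31),    # right edge
])
label = Primitive(PrimitiveKind.TEXT_LINE, 20, 55, 70, 20)  # geometry only, no text
name_field = Structure("name_field", [label, box])
form = Structure("form", [name_field])

print(form.bbox())
print(len(form.primitives()))
```

Keeping `Structure` recursive means that a richer representation, for instance a table made of several boxes, can be added later without changing the primitives themselves.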