Abstract
BigBatch is a processing environment designed to automatically process batches of millions of monochromatic document images generated by production-line scanners. It removes noisy borders, checks and corrects orientation, calculates and compensates for the skew angle, crops the image to standardize document sizes, and finally compresses it in a user-defined file format. BigBatch incorporates the best recently developed algorithms for this kind of document image. It can run in either standalone or operator-assisted mode. In standalone mode, BigBatch can also distribute processing across clusters of workstations or grids.
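The ordering of the processing steps named in the abstract can be illustrated with a minimal sketch. This is not the BigBatch implementation: the function names, the list-of-lists image representation, and the placeholder stages are all assumptions made for illustration. Only the cropping stage does real work here (a simple bounding-box crop, not the paper's algorithm), and the compression step is left out.

```python
from typing import Callable, List, Sequence

# Assumed representation: a monochromatic image as rows of 0 (white) / 1 (black).
Image = List[List[int]]
Stage = Callable[[Image], Image]


def remove_borders(img: Image) -> Image:
    """Placeholder (hypothetical): a real stage would strip noisy scanner borders."""
    return img


def fix_orientation(img: Image) -> Image:
    """Placeholder (hypothetical): a real stage would detect and correct orientation."""
    return img


def deskew(img: Image) -> Image:
    """Placeholder (hypothetical): a real stage would estimate and undo the skew angle."""
    return img


def crop_to_content(img: Image) -> Image:
    """Crop to the bounding box of black pixels (a simple stand-in for cropping)."""
    rows = [r for r, row in enumerate(img) if any(row)]
    if not rows:
        return img  # blank page: nothing to crop
    cols = [c for c in range(len(img[0])) if any(row[c] for row in img)]
    return [row[cols[0]:cols[-1] + 1] for row in img[rows[0]:rows[-1] + 1]]


def make_pipeline(stages: Sequence[Stage]) -> Stage:
    """Compose stages so that each one receives the previous stage's output."""
    def run(img: Image) -> Image:
        for stage in stages:
            img = stage(img)
        return img
    return run


def process_batch(images: Sequence[Image], pipeline: Stage) -> List[Image]:
    """Apply the same pipeline to every image in a batch, one after another."""
    return [pipeline(img) for img in images]


# Stage order follows the abstract: borders, orientation, skew, crop.
pipeline = make_pipeline([remove_borders, fix_orientation, deskew, crop_to_content])

page = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
cropped = pipeline(page)
```

Because each image is processed independently of the others, a batch loop like `process_batch` could in principle be split across several machines, which is consistent with the cluster and grid operation the abstract mentions.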
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lins, R.D., Ávila, B.T., de Araújo Formiga, A. (2006). BigBatch – An Environment for Processing Monochromatic Documents. In: Campilho, A., Kamel, M. (eds) Image Analysis and Recognition. ICIAR 2006. Lecture Notes in Computer Science, vol 4142. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11867661_80
DOI: https://doi.org/10.1007/11867661_80
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44894-5
Online ISBN: 978-3-540-44896-9
eBook Packages: Computer Science, Computer Science (R0)