Extraction of newspaper headlines from microfilm for automatic indexing

Tan, Chew Lim; Liu, Qing Hong

doi:10.1007/s10032-003-0111-2

Published: March 2003

Volume 6, pages 201–210, (2003)
International Journal on Document Analysis and Recognition

Chew Lim Tan¹ &
Qing Hong Liu²

Abstract.

This paper proposes a document image analysis system that extracts newspaper headlines from microfilm images in order to provide automatic indexing of news articles on microfilm. A major challenge is the poor image quality of microfilm: most images are inadequately illuminated and considerably dirty. Because conventional threshold selection techniques are inadequate for this kind of image, we propose a new method for separating characters from the noisy background. A run-length smoothing algorithm is then applied to extract the headlines. Experimental results confirm the validity of the approach.

Author information

Authors and Affiliations

School of Computing, National University of Singapore, 3 Science Drive 2, 117543, Singapore
Chew Lim Tan
Data Storage Institute, DSI Building, Engineering Drive 1, 117608, Singapore
Qing Hong Liu

Corresponding author

Correspondence to Chew Lim Tan.

Additional information

Received: 15 November 2002, Accepted: 19 May 2003, Published online: 30 January 2004

About this article

Cite this article

Tan, C.L., Liu, Q.H. Extraction of newspaper headlines from microfilm for automatic indexing. IJDAR 6, 201–210 (2003). https://doi.org/10.1007/s10032-003-0111-2

Issue Date: March 2003
DOI: https://doi.org/10.1007/s10032-003-0111-2
