GeoInformatica, Volume 19, Issue 1, pp 1–27

Recognizing text in raster maps

  • Yao-Yi Chiang
  • Craig A. Knoblock
Article

Abstract

Text labels in maps provide valuable geographic information by associating place names with locations. This information is especially important in historical maps, which are very often the only source of past information about the earth. Recognizing the text labels is challenging because heterogeneous raster maps have varying image quality and complex map contents. In addition, the labels within a map do not follow a fixed orientation and can have various font types and sizes. Previous approaches typically handle a specific type of map or require intensive manual work. This paper presents a general approach that requires a small amount of user effort to semi-automatically recognize text labels in heterogeneous raster maps. Our approach exploits a few examples of text areas to extract text pixels and employs cartographic labeling principles to locate individual text labels. Each text label is then rotated automatically to horizontal and processed by conventional OCR software for character recognition. We compared our approach to a state-of-the-art commercial OCR product using 15 raster maps from 10 sources. Our evaluation shows that our approach enabled the commercial OCR product to handle raster maps and that together they produced significantly higher text recognition accuracy than the commercial OCR product alone.
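
The abstract does not state how the orientation of a located label is estimated before it is rotated to horizontal. As an illustration only, the following minimal sketch assumes that a label is available as a list of (x, y) text-pixel coordinates and uses the principal axis of those coordinates as a stand-in orientation estimate. The function names `estimate_orientation` and `rotate_to_horizontal` are hypothetical and are not taken from the paper.

```python
import math


def estimate_orientation(pixels):
    """Angle (radians) of the principal axis of a set of (x, y) points.

    Assumption: the dominant axis of the text pixels approximates the
    label's baseline direction. This is a stand-in, not the paper's method.
    """
    n = len(pixels)
    mean_x = sum(x for x, _ in pixels) / n
    mean_y = sum(y for _, y in pixels) / n
    # Second-order central moments of the point set.
    sxx = sum((x - mean_x) ** 2 for x, _ in pixels) / n
    syy = sum((y - mean_y) ** 2 for _, y in pixels) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pixels) / n
    # Principal-axis angle, in the range (-pi/2, pi/2].
    return 0.5 * math.atan2(2.0 * sxy, sxx - syy)


def rotate_to_horizontal(pixels):
    """Rotate the points about their centroid so the principal axis is horizontal."""
    angle = estimate_orientation(pixels)
    n = len(pixels)
    cx = sum(x for x, _ in pixels) / n
    cy = sum(y for _, y in pixels) / n
    cos_a, sin_a = math.cos(-angle), math.sin(-angle)
    rotated = [
        ((x - cx) * cos_a - (y - cy) * sin_a + cx,
         (x - cx) * sin_a + (y - cy) * cos_a + cy)
        for x, y in pixels
    ]
    return angle, rotated


# Synthetic "label": 20 points on a line inclined at 30 degrees.
label = [
    (10 + t * math.cos(math.radians(30)), 5 + t * math.sin(math.radians(30)))
    for t in range(20)
]
angle, rotated = rotate_to_horizontal(label)
print(round(math.degrees(angle), 3))
```

A principal-axis estimate cannot distinguish a label from its 180-degree rotation, so a real pipeline would still need a way to choose between the two candidate orientations, for example by running the OCR step on both and keeping the more plausible result. The sign of the angle also depends on whether the y axis points up or down in the image coordinate system.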

Keywords

GIS, OCR, Raster maps, Text recognition, Map processing

Notes

Acknowledgment

This research is based upon work supported in part by the University of Southern California under the Viterbi School of Engineering Doctoral Fellowship.

Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  1. Spatial Sciences Institute, University of Southern California, Los Angeles, USA
  2. Department of Computer Science, Information Sciences Institute, and Spatial Sciences Institute, University of Southern California, Marina del Rey, USA