Unlocking Textual Content from Historical Maps - Potentials and Applications, Trends, and Outlooks

  • Yao-Yi Chiang
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 709)


Digital map processing has been of interest to the image processing and pattern recognition community since the early 1980s. With the exponential growth of available map scans in archives and on the internet, a variety of disciplines in the natural and social sciences have grown interested in using historical maps as a primary source of geographical and political information in their studies. Today, many organizations, such as the United States Geological Survey, the David Rumsey Map Collection, and the National Library of Scotland, store numerous historical maps in either paper or scanned format. Only a small portion of these historical maps is georeferenced, and even fewer have machine-readable content or comprehensive metadata. The lack of searchable textual content, including spatial and temporal information, prevents researchers from efficiently finding relevant maps for their research and using the map content in their studies. These challenges present a tremendous collaboration opportunity for the image processing and pattern recognition community to build advanced map processing technologies that can transform the natural and social science studies that use historical maps. This paper presents the potential of using historical maps in scientific research, describes the current trends and challenges in extracting and recognizing text content from historical maps, and discusses the future outlook.


Digital map processing, Text recognition, Optical character recognition, Historical maps, Geographic information system, Natural science, Social science, Biology, Spatial humanity



This research is based upon work supported in part by the National Science Foundation under award number IIS-1564164 and in part by the University of Southern California under the Undergraduate Research Associates Program (URAP). The author thanks Travis Longcore for his input on the biology studies and the U.S. National Committee (USNC) to the International Cartographic Association (ICA) for providing travel funding to attend the 27th International Cartographic Conference (ICC).



Copyright information

© Springer Nature Singapore Pte Ltd. 2017

Authors and Affiliations

  1. Spatial Sciences Institute, University of Southern California, Los Angeles, USA
