One potential problem with volunteered geographic information (VGI) is ensuring its quality. Innocent mistakes and intentional falsehoods can reduce not only the quality of the information but also people’s confidence in VGI as a legitimate source of data. We present a case study in VGI that addresses the quality problem by aggregating input from many different people. Specifically, we present a technique for maintaining a comprehensive list of points of interest (POI) for digital maps. Maintaining such a list is traditionally difficult because new POI are created, some POI are known only locally, and some POI have multiple names. We address this problem by exploiting map annotations contributed by regular users of online maps. Our institution’s mapping Web site allows users to create arbitrary collections of geographically anchored pushpins that are annotated with text. Our data mining solution finds geometric clusters of these pushpins and examines the pushpins’ text and other features for likely POI names. For instance, if a given text phrase is mentioned frequently in a cluster but infrequently elsewhere, our confidence that the phrase names a POI increases. We tested the quality of our results by asking 100 local residents whether the POI we found were correct, and this user study indicated that we were generally successful. We also show how the same user-annotated pushpins can be used to assess the popularity of existing POI, which can guide the choice of which ones to display on a map.
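The phrase-frequency idea in the abstract (a phrase mentioned often inside a cluster but rarely elsewhere is a likely POI name) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names `ngrams` and `score_phrases`, the choice of word n-grams as candidate phrases, the smoothing, and the scoring formula are hypothetical and are not taken from the paper. The sketch also assumes the geometric clustering has already been done, so each cluster is simply a list of pushpin annotation strings.

```python
from collections import Counter
import math


def ngrams(text, max_n=3):
    """Return the set of lowercase word n-grams (n = 1..max_n) in a text."""
    words = text.lower().split()
    return {
        " ".join(words[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(words) - n + 1)
    }


def score_phrases(clusters, max_n=3):
    """Rank candidate phrases per cluster by local versus outside frequency.

    clusters: dict mapping a cluster id to a list of pushpin annotation strings.
    Returns a dict mapping each cluster id to a list of (phrase, score) pairs,
    highest score first. The scoring formula is an illustrative assumption.
    """
    # For each cluster, count how many pushpins mention each phrase.
    local = {
        cid: Counter(p for text in texts for p in ngrams(text, max_n))
        for cid, texts in clusters.items()
    }
    total_pins = sum(len(texts) for texts in clusters.values())
    global_counts = Counter()
    for counts in local.values():
        global_counts.update(counts)

    results = {}
    for cid, texts in clusters.items():
        n_in = len(texts)
        n_out = total_pins - n_in
        scored = []
        for phrase, c_in in local[cid].items():
            c_out = global_counts[phrase] - c_in
            frac_in = c_in / n_in
            # Add-one smoothing avoids division by zero for phrases
            # that never appear outside this cluster.
            frac_out = (c_out + 1) / (n_out + 1)
            scored.append((phrase, frac_in * math.log(frac_in / frac_out)))
        # Sort by score, breaking ties in favor of longer phrases.
        scored.sort(key=lambda ps: (-ps[1], -len(ps[0].split())))
        results[cid] = scored
    return results
```

Calling `score_phrases` on a dictionary of clusters returns, for each cluster, its candidate phrases ordered from most to least distinctive. Phrases that are equally common inside and outside a cluster score near zero or below, which is one simple way to suppress generic words.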
Keywords: Points of interest; Digital maps; Online maps; Data mining; Volunteered geographic information