Geographically weighted evidence combination approaches for combining discordant and inconsistent volunteered geographical information

Abstract

There is much interest in being able to combine crowdsourced data. One of the critical issues in the information sciences is how to combine data or information that are discordant or inconsistent in some way. Many previous approaches have taken a majority-rules approach, under the assumption that most people are correct most of the time. This paper analyses crowdsourced land cover data generated by the Geo-Wiki initiative in order to infer the land cover present at locations on a 50 km grid. It compares four evidence combination approaches (Dempster-Shafer, Bayes, Fuzzy Sets and Possibility), each applied under a geographically weighted kernel, with the geographically weighted average approach applied in many current Geo-Wiki analyses. A geographically weighted approach uses a moving kernel under which local analyses are undertaken. The contribution (or salience) of each data point to the analysis is weighted by its distance to the kernel centre, reflecting Tobler’s first law of geography. A series of analyses was undertaken using different kernel sizes (or bandwidths). Each of the geographically weighted evidence combination methods generated spatially distributed measures of belief in hypotheses associated with the presence of individual land cover classes at each location on the grid. These were compared with GlobCover, a global land cover product. The results from the geographically weighted average approach in general had higher correspondence with the reference data, and this correspondence increased with bandwidth. However, for some classes other evidence combination approaches had higher correspondences, possibly because of greater ambiguity over class conceptualisations and/or lower densities of crowdsourced data. The outputs also allowed the beliefs in each class to be mapped. The differences in the soft and the crisp maps are clearly associated with the logics of each evidence combination approach and, of course, the different questions that they ask of the data.
The results show that discordant data can be combined (rather than being removed from analysis) and that data integrated in this way can be parameterised by different measures of belief uncertainty. The discussion highlights a number of critical areas for future research.
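
The distance weighting described above can be illustrated with a minimal sketch of a geographically weighted average at a single grid location. This is not the authors' implementation: the Gaussian kernel, the planar coordinates in kilometres, the function names and the sample observations are all assumptions made for illustration.

```python
import math


def gaussian_weight(distance_km, bandwidth_km):
    """Assumed Gaussian kernel: weight is 1 at the centre and decays with distance."""
    return math.exp(-0.5 * (distance_km / bandwidth_km) ** 2)


def gw_class_scores(centre, observations, bandwidth_km):
    """Geographically weighted share of each class at one kernel centre.

    centre       -- (x_km, y_km) of the grid location (planar coordinates assumed)
    observations -- iterable of (x_km, y_km, class_label) crowdsourced points
    Returns a dict mapping class label to its weighted share (shares sum to 1),
    or an empty dict if no observation carries any weight.
    """
    totals = {}
    for x, y, label in observations:
        distance = math.hypot(x - centre[0], y - centre[1])
        weight = gaussian_weight(distance, bandwidth_km)
        totals[label] = totals.get(label, 0.0) + weight
    weight_sum = sum(totals.values())
    if weight_sum == 0.0:
        return {}
    return {label: w / weight_sum for label, w in totals.items()}


# Hypothetical observations: two nearby "forest" labels, three "cropland"
# labels of which two are far from the kernel centre.
observations = [
    (0.0, 0.0, "forest"),
    (10.0, 0.0, "forest"),
    (0.0, 40.0, "cropland"),
    (120.0, 90.0, "cropland"),
    (150.0, 0.0, "cropland"),
]

scores = gw_class_scores((0.0, 0.0), observations, bandwidth_km=50.0)
best = max(scores, key=scores.get)
print(best, scores)
```

In this made-up example an unweighted majority vote would favour "cropland" (three labels against two), whereas the weighted shares favour "forest" because the nearby points dominate. Increasing `bandwidth_km` flattens the weights and moves the result back towards the simple majority.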

Notes

  1. Everything is related to everything else, but near things are more related to each other.

  2. http://due.esrin.esa.int/page_globcover.php

Acknowledgments

The authors would like to acknowledge the support and contribution of COST Action TD1202 ‘Mapping and the Citizen Sensor’ (http://www.citizensensor-cost.eu) and partial funding from the ERC project CrowdLand (No. 617754). The authors would also like to thank the anonymous reviewers, whose comments helped significantly improve this article.

Author information

Corresponding author

Correspondence to Alexis Comber.

About this article

Cite this article

Comber, A., Fonte, C., Foody, G. et al. Geographically weighted evidence combination approaches for combining discordant and inconsistent volunteered geographical information. Geoinformatica 20, 503–527 (2016). https://doi.org/10.1007/s10707-016-0248-z

Keywords

  • Crowdsourcing
  • Land cover
  • Data quality
  • VGI
  • Data mining