Abstract
In this paper, we discuss the potential and the problems of paid crowd-based geospatial data collection. First, we present a web-based program for the collection of geodata by paid crowdworkers, implemented on the commercial platform microWorkers. We discuss our approach and show on data samples that it is, in principle, possible to produce high-quality geospatial data sets with paid crowdsourcing. However, geodata collected by the crowd can be of limited and inhomogeneous quality. Even when experts collect geodata, incorrect objects may result, as we demonstrate with examples. A possible approach to handle this problem is to collect the data not just once but multiple times and to integrate the multiple representations into one common data set. We analyze how the quality measures of such multiple representations are statistically distributed. Finally, we discuss how individual results as well as multiply collected data can be integrated into one common data set.
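The integration step described in the abstract can be illustrated with a minimal sketch: comparing two crowd digitizations of the same object with the symmetric Hausdorff distance (a common positional quality measure for such multiple representations) and merging them by vertex averaging. This is not the paper's implementation; the function names, the equal-vertex-count assumption, and the sample coordinates are illustrative only.

```python
import math

def directed_hausdorff(a, b):
    """Largest distance from any point of a to its nearest point of b."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def symmetric_hausdorff(a, b):
    """Symmetric Hausdorff distance between two point sequences (map units)."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

def integrate(representations):
    """Merge several digitizations of one object by averaging
    corresponding vertices. Assumes all representations share the
    same vertex count and ordering (a strong simplification; real
    conflation first needs a matching step)."""
    n = len(representations)
    length = len(representations[0])
    return [
        (
            sum(rep[i][0] for rep in representations) / n,
            sum(rep[i][1] for rep in representations) / n,
        )
        for i in range(length)
    ]

# Two workers' digitizations of the same building edge:
w1 = [(0.0, 0.0), (10.0, 0.2)]
w2 = [(0.1, 0.1), (10.2, 0.0)]
print(symmetric_hausdorff(w1, w2))  # positional discrepancy between the two
print(integrate([w1, w2]))          # merged common representation
```

Repeating this for many workers per object yields a sample of quality measures per object, whose statistical distribution can then be analyzed as the abstract describes.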
Cite this article
Walter, V., Sörgel, U. Implementation, Results, and Problems of Paid Crowd-Based Geospatial Data Collection. PFG 86, 187–197 (2018). https://doi.org/10.1007/s41064-018-0058-z