Abstract
As the volume of AI problems involving human knowledge continues to soar, crowdsourcing has become essential to a wide range of Web applications. One of the biggest challenges of crowdsourcing is aggregating the answers collected from the crowd, since workers may have widely varying levels of expertise. Many aggregation techniques have been proposed to tackle this challenge. These techniques, however, have never been compared and analyzed under the same setting, which makes choosing the ‘right’ technique for a particular application very difficult. To address this problem, this paper presents a benchmark that offers a comprehensive empirical study of the performance of these aggregation techniques. Specifically, we integrate several state-of-the-art methods in a comparable manner and measure various performance metrics with our benchmark, including computation time, accuracy, robustness to spammers, and adaptivity to multi-labeling. We then provide an in-depth analysis of the benchmarking results, obtained by simulating the crowdsourcing process with different types of workers. We believe the findings of this benchmark can serve as a practical guideline for crowdsourcing applications.
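To make the simulation-based evaluation described above concrete, the following is a minimal illustrative sketch (not the authors' benchmark code) of how crowd answers from workers of differing reliability, including near-random spammers, can be simulated and aggregated with majority voting, the simplest of the aggregation techniques considered. All function names, parameters, and reliability values here are assumptions chosen for illustration.

```python
# Illustrative sketch: simulate binary labels from workers with different
# reliabilities, aggregate them by majority voting, and measure accuracy.
import random

def simulate_answers(true_labels, reliability):
    """Each worker answers correctly with probability `reliability`."""
    return [l if random.random() < reliability else 1 - l for l in true_labels]

def majority_vote(answer_sets):
    """Aggregate one label per item by taking the most frequent answer (ties go to 1)."""
    aggregated = []
    for answers in zip(*answer_sets):
        aggregated.append(1 if 2 * sum(answers) >= len(answers) else 0)
    return aggregated

def accuracy(predicted, truth):
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)

if __name__ == "__main__":
    random.seed(0)
    truth = [random.randint(0, 1) for _ in range(1000)]
    # Three reliable workers plus two near-random "spammers" (assumed values).
    reliabilities = [0.9, 0.85, 0.8, 0.5, 0.5]
    answers = [simulate_answers(truth, r) for r in reliabilities]
    print("Majority-vote accuracy:", accuracy(majority_vote(answers), truth))
```

More sophisticated aggregation techniques (e.g., EM-based worker-quality estimation) would replace the `majority_vote` step with a weighted or probabilistic combination of the same simulated answers.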