Quality Control for Crowdsourced POI Collection

  • Shunsuke Kajimura
  • Yukino Baba
  • Hiroshi Kajino
  • Hisashi Kashima
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9078)


Crowdsourcing allows human intelligence tasks to be outsourced to a large number of unspecified people at low cost. However, because crowd workers vary in ability and diligence, the quality of their submitted work is also uneven and sometimes quite low. Quality control is therefore one of the central issues in crowdsourcing research. In this paper, we consider the quality control problem for POI (points of interest) collection tasks, in which workers are asked to enumerate the locations of POIs. Because workers do not necessarily provide correct answers, and even answers that indicate the same place are rarely expressed identically, we propose a two-stage quality control method consisting of an answer clustering stage and a reliability estimation stage. The method is implemented with a new constrained exemplar clustering algorithm and a modified HITS algorithm, and its effectiveness is demonstrated against baseline methods on several real crowdsourcing datasets.


Keywords: Quality control · POI · Constrained exemplar clustering · HITS
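The reliability estimation stage described in the abstract is based on a modified HITS algorithm. As an illustrative sketch only (not the authors' implementation, whose details are not given here), a HITS-style mutual-reinforcement iteration over a bipartite worker-answer graph could assign worker reliability the role of hub scores and answer-cluster correctness the role of authority scores. The `contributions` structure and function name below are assumptions made for the example:

```python
# Hypothetical sketch of HITS-style reliability estimation over a
# worker / answer-cluster bipartite graph (illustrative, not the
# paper's actual algorithm).
import math

def estimate_reliability(contributions, n_iter=50):
    """contributions: dict mapping worker id -> set of answer-cluster ids
    that the worker reported."""
    workers = list(contributions)
    clusters = sorted({c for cs in contributions.values() for c in cs})
    reliability = {w: 1.0 for w in workers}   # "hub" scores
    correctness = {c: 1.0 for c in clusters}  # "authority" scores
    for _ in range(n_iter):
        # A cluster is likely correct if reliable workers reported it.
        for c in clusters:
            correctness[c] = sum(reliability[w] for w in workers
                                 if c in contributions[w])
        # A worker is reliable if they reported likely-correct clusters.
        for w in workers:
            reliability[w] = sum(correctness[c] for c in contributions[w])
        # L2-normalize both score vectors so the iteration converges.
        for scores in (reliability, correctness):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for k in scores:
                scores[k] /= norm
    return reliability, correctness
```

For example, if two workers both report clusters `a` and `b` while a third reports only `c`, the iteration assigns higher correctness to the clusters with agreement and higher reliability to the agreeing workers.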





Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Shunsuke Kajimura: The University of Tokyo, Tokyo, Japan
  • Yukino Baba: National Institute of Informatics, Tokyo, Japan; JST, ERATO, Kawarabayashi Large Graph Project, Tokyo, Japan
  • Hiroshi Kajino: The University of Tokyo, Tokyo, Japan
  • Hisashi Kashima: Kyoto University, Kyoto, Japan
