Incremental Quality Inference in Crowdsourcing

  • Jianhong Feng
  • Guoliang Li
  • Henan Wang
  • Jianhua Feng
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8422)


Crowdsourcing has attracted significant attention from the database community in recent years, and several crowdsourced databases have been proposed to incorporate human power into traditional database systems. A major issue in crowdsourcing is achieving high-quality results, because workers may return incorrect answers. A typical solution is to assign each question to multiple workers and combine the workers’ answers to generate the final result. A key challenge in this strategy is inferring each worker’s quality. Existing methods usually assume that each worker has a fixed quality and compute it using qualification tests or historical performance. However, these methods cannot accurately estimate a worker’s quality. To address this problem, we propose a worker model and devise an incremental inference strategy to accurately compute the workers’ quality. We also propose a question model and develop two efficient strategies that combine it with the worker model to compute each question’s result. We implement our method and compare it with existing inference approaches on real crowdsourcing platforms using real-world datasets. The experiments indicate that our method achieves high accuracy and outperforms existing approaches.
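
As a rough illustration of the general setting described above (several workers answer each question, answers are combined, and worker quality is re-estimated as results arrive), the following minimal Python sketch may help. It is not the authors' worker model or question model, which the abstract does not specify; the names majority_vote, WorkerQuality, and weighted_vote, the smoothing prior, and the use of the inferred answer as a stand-in for ground truth are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def majority_vote(answers):
    """Return the most frequent answer among (worker_id, answer) pairs.

    Ties are broken by first occurrence, which is an arbitrary choice.
    """
    counts = Counter(answer for _, answer in answers)
    return counts.most_common(1)[0][0]


class WorkerQuality:
    """Hypothetical running accuracy per worker, updated one question at a time.

    Assumption: every worker starts from a prior of 1 correct out of 2,
    so an unseen worker has quality 0.5.
    """

    def __init__(self, prior_correct=1, prior_total=2):
        self.correct = defaultdict(lambda: prior_correct)
        self.total = defaultdict(lambda: prior_total)

    def quality(self, worker_id):
        return self.correct[worker_id] / self.total[worker_id]

    def update(self, answers, inferred):
        # Treat the inferred result as if it were the true answer (an assumption).
        for worker_id, answer in answers:
            self.total[worker_id] += 1
            if answer == inferred:
                self.correct[worker_id] += 1


def weighted_vote(answers, model):
    """Combine answers, weighting each vote by the worker's current quality."""
    scores = defaultdict(float)
    for worker_id, answer in answers:
        scores[answer] += model.quality(worker_id)
    return max(scores, key=scores.get)


if __name__ == "__main__":
    model = WorkerQuality()
    first = [("w1", "A"), ("w2", "A"), ("w3", "B")]
    result = majority_vote(first)
    model.update(first, result)  # incremental: only this question's answers are touched
    print(result, model.quality("w1"), model.quality("w3"))
    print(weighted_vote([("w1", "A"), ("w3", "B")], model))
```

The update step touches only the workers who answered the latest question, which is one simple way to keep quality estimates current without recomputing over all past answers.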


Keywords: Majority Vote, Confusion Matrix, Work Model, Inference Method, Inference Result
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Jianhong Feng (1)
  • Guoliang Li (1)
  • Henan Wang (1)
  • Jianhua Feng (1)
  1. Department of Computer Science, Tsinghua University, Beijing, China
