Task assignment optimization in knowledge-intensive crowdsourcing

The VLDB Journal


We present SmartCrowd, a framework for optimizing task assignment in knowledge-intensive crowdsourcing (KI-C). SmartCrowd distinguishes itself in three ways: it formulates, for the first time, worker-to-task assignment in KI-C as an optimization problem; it proposes efficient adaptive algorithms to solve that problem; and it accounts for human factors, such as worker expertise, wage requirements, and availability, inside the optimization process. We present rigorous theoretical analyses of the task assignment optimization problem and propose optimal and approximation algorithms with guarantees, which rely on index pre-computation and adaptive maintenance. We perform extensive performance and quality experiments using real and synthetic data to demonstrate that the SmartCrowd approach is necessary to achieve efficient, high-quality task assignments under a guaranteed cost budget.
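
To make the problem setting concrete, the following is a minimal illustrative sketch of assigning workers to a single task under a cost budget and a quality threshold. It is not the SmartCrowd algorithm: the greedy skill-per-wage heuristic, the additive aggregation of worker skill, and all names (`Worker`, `greedy_assign`, `quality_threshold`, `budget`) are assumptions introduced here purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Worker:
    """Hypothetical worker profile; field names are illustrative assumptions."""
    name: str
    skill: float            # expertise on an assumed [0, 1] scale
    wage: float             # wage the worker expects for the task
    available: bool = True  # whether the worker can currently be assigned


def greedy_assign(
    workers: List[Worker], quality_threshold: float, budget: float
) -> Optional[List[Worker]]:
    """Pick available workers by descending skill-per-wage ratio.

    Assumes task quality is the sum of the assigned workers' skills
    (a simplification made for this sketch). Returns the chosen workers,
    or None if the threshold cannot be reached within the budget by
    this heuristic.
    """
    candidates = sorted(
        (w for w in workers if w.available and w.wage > 0),
        key=lambda w: w.skill / w.wage,
        reverse=True,
    )
    chosen: List[Worker] = []
    quality = 0.0
    cost = 0.0
    for w in candidates:
        if quality >= quality_threshold:
            break  # threshold already met; stop spending budget
        if cost + w.wage <= budget:
            chosen.append(w)
            quality += w.skill
            cost += w.wage
    return chosen if quality >= quality_threshold else None


if __name__ == "__main__":
    pool = [
        Worker("a", skill=0.9, wage=3.0),
        Worker("b", skill=0.5, wage=1.0),
        Worker("c", skill=0.4, wage=2.0),
        Worker("d", skill=0.8, wage=1.0, available=False),
    ]
    result = greedy_assign(pool, quality_threshold=1.2, budget=4.0)
    print([w.name for w in result] if result else None)
```

A greedy heuristic of this kind carries no optimality guarantee; it only illustrates how expertise, wage, and availability can enter a single assignment decision together.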


  1. With the availability of historical information, worker profiles (knowledge skills and expected wage) can be learned by the platform. Profile learning is an independent research problem in its own right, orthogonal to this work.

  2. The acceptance ratio of a worker is the probability that she accepts a recommended task.

  3. Non-preemption ensures that a worker cannot be interrupted after she is assigned to a task.

  4. \(Q_{t_j}\) is the threshold for skill \(j\) and \(q_{t_j} \ge Q_{t_j}\).

  5. \(Q_{t_j}\) is the threshold for skill \(j\) and \(q_{t_j} \ge Q_{t_j}\).

  6. If none of the workers in \({\mathcal {A'}}\) contributed to \(t\), then \(v'_t=v_t\).

  7. Amazon Mechanical Turk.








Author information

Corresponding author

Correspondence to Senjuti Basu Roy.

Additional information

The work of Saravanan Thirumuruganathan and Gautam Das is partially supported by NSF Grants 0812601, 0915834, 1018865, a NHARP grant from the Texas Higher Education Coordinating Board, and grants from Microsoft Research and Nokia Research.


Cite this article

Basu Roy, S., Lykourentzou, I., Thirumuruganathan, S. et al. Task assignment optimization in knowledge-intensive crowdsourcing. The VLDB Journal 24, 467–491 (2015).
