Knowledge and Information Systems

Volume 53, Issue 1, pp 1–41

Crowdsourcing for data management

  • Valter Crescenzi
  • Alvaro A. A. Fernandes
  • Paolo Merialdo
  • Norman W. Paton
Survey Paper

Abstract

Crowdsourcing provides access to a pool of human workers who can contribute solutions to tasks that are challenging for computers. Proposals have been made for the use of crowdsourcing in a wide range of data management tasks, including data gathering, query processing, data integration, and data cleaning. We classify the key features of these proposals and survey the results to date, identifying recurring themes and open issues.
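
A pattern that recurs throughout such proposals is redundancy with aggregation: each task is posed to several workers, and their answers are combined, most simply by majority vote, to offset individual worker error. The following is a minimal sketch of that pattern in Python; the task, labels, and helper function are hypothetical illustrations, not the interface of any particular crowdsourcing platform.

```python
from collections import Counter

def majority_vote(answers):
    """Aggregate redundant worker answers for one task by majority vote.

    `answers` is a list of labels submitted by different workers for the
    same task; ties are broken arbitrarily by Counter's internal ordering.
    Returns the winning label and the fraction of workers who agreed.
    """
    counts = Counter(answers)
    label, votes = counts.most_common(1)[0]
    confidence = votes / len(answers)  # agreement ratio, in (0, 1]
    return label, confidence

# Hypothetical example: five workers judge whether two records match
# (a typical entity-resolution microtask).
worker_answers = ["match", "match", "no-match", "match", "match"]
label, confidence = majority_vote(worker_answers)
print(label, confidence)  # match 0.8
```

More sophisticated aggregation schemes weight workers by estimated reliability rather than counting votes equally, but the basic structure of replicating a task and reconciling the answers is the same.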

Keywords

Data management · Crowdsourcing · Data integration · Data cleaning · Data extraction · Entity resolution

Copyright information

© Springer-Verlag London 2017

Authors and Affiliations

  1. Dipartimento di Ingegneria, Università degli Studi Roma Tre, Rome, Italy
  2. School of Computer Science, University of Manchester, Manchester, UK
