Skip to main content

ACRyLIQ: Leveraging DBpedia for Adaptive Crowdsourcing in Linked Data Quality Assessment

  • Conference paper
  • First Online:
Knowledge Engineering and Knowledge Management (EKAW 2016)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 10024))

Included in the following conference series:

Abstract

Crowdsourcing has emerged as a powerful paradigm for quality assessment and improvement of Linked Data. A major challenge of employing crowdsourcing, for quality assessment in Linked Data, is the cold-start problem: how to estimate the reliability of crowd workers and assign the most reliable workers to tasks? We address this challenge by proposing a novel approach for generating test questions from DBpedia based on the topics associated with quality assessment tasks. These test questions are used to estimate the reliability of the new workers. Subsequently, the tasks are dynamically assigned to reliable workers to help improve the accuracy of collected responses. Our proposed approach, ACRyLIQ, is evaluated using workers hired from Amazon Mechanical Turk, on two real-world Linked Data datasets. We validate the proposed approach in terms of accuracy and compare it against the baseline approach of reliability estimate using gold-standard task. The results demonstrate that our proposed approach achieves high accuracy without using gold-standard task.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
EUR 32.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or Ebook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 84.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 109.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

Notes

  1. 1.

    http://lodstats.aksw.org.

  2. 2.

    A DBpedia triple is considered a fact.

  3. 3.

    http://linkedspending.aksw.org/.

  4. 4.

    https://openspending.org/.

  5. 5.

    https://github.com/optimaize/language-detector.

  6. 6.

    http://oaei.ontologymatching.org/.

  7. 7.

    As of 2014 http://wiki.dbpedia.org/about.

  8. 8.

    http://dataevaluation.aksw.org.

References

  1. Acosta, M., Zaveri, A., Simperl, E., Kontokostas, D., Auer, S., Lehmann, J.: Crowdsourcing linked data quality assessment. In: Alani, H., et al. (eds.) ISWC 2013. LNCS, vol. 8219, pp. 260–276. Springer, Heidelberg (2013). doi:10.1007/978-3-642-41338-4_17

    Chapter  Google Scholar 

  2. Difallah, D.E., Demartini, G., Cudrè-Mauroux, P.: Pick-a-crowd: tell me what you like, and i’ll tell you what to do. In: Proceedings of the 22nd International Conference on World Wide Web, pp. 367–374 (2013)

    Google Scholar 

  3. Fan, J., et al.: iCrowd: an adaptive crowdsourcing framework. In: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, pp. 1015–1030. ACM (2015)

    Google Scholar 

  4. Ghazvinian, A., Noy, N.F., Musen, M.A., et al.: Creating mappings for ontologies in biomedicine: simple methods work. In: AMIA (2009)

    Google Scholar 

  5. Ul Hassan, U., O’Riain, S., Curry, E.: Effects of expertise assessment on the quality of task routing in human computation. In: Proceedings of the 2nd International Workshop on Social Media for Crowdsourcing and Human Computation, Paris, France (2013)

    Google Scholar 

  6. Ul Hassan, U., O’Riain, S., Curry, E.: Leveraging matching dependencies for guided user feedback in linked data applications. In: Proceedings of the 9th International Workshop on Information Integration on the Web, pp. 1–6. ACM Press (2012)

    Google Scholar 

  7. Heath, T., Bizer, C.: Linked Data: Evolving the Web Into a Global Data Space, vol. 1. Morgan & Claypool Publishers, San Rafael (2011)

    Google Scholar 

  8. Ho, C.-J., Jabbari, S., Vaughan, J.W.: Adaptive task assignment for crowdsourced classification. In: Proceedings of the 30th International Conference on Machine Learning (ICML-13), pp. 534–542 (2013)

    Google Scholar 

  9. Howe, J.: The rise of crowdsourcing. Wired Mag. 14(6), 1–4 (2006)

    MathSciNet  Google Scholar 

  10. Ipeirotis, P.G.: Analyzing the amazon mechanical turk marketplace. XRDS: Crossroads ACM Mag. Students 17(2), 16–21 (2010)

    Article  Google Scholar 

  11. Ipeirotis, P.G., Provost, F., Wang, J.: Quality management on amazon mechanical turk. In: Proceedings of the ACM SIGKDD Workshop on Human Computation, pp. 64–67. ACM (2010)

    Google Scholar 

  12. Lehmann, J., et al.: DBpedia - a large-scale, multilingual knowledge base extracted from wikipedia. Semant. Web J. 6(2), 167–195 (2015)

    Google Scholar 

  13. Ngonga Ngomo, A.-C., Auer, S.: LIMES - a time-efficient approach for large-scale link discovery on the web of data. In: Proceedings of IJCAI (2011)

    Google Scholar 

  14. Noy, N.F., et al.: Mechanical turk as an ontology engineer?: using microtasks as a component of an ontology-engineering workflow. In: Proceedings of the 5th Annual ACM Web Science Conference, pp. 262–271 (2013)

    Google Scholar 

  15. Oleson, D., et al.: Programmatic gold: targeted and scalable quality assurance in crowdsourcing. In: Human Computation 11.11 (2011)

    Google Scholar 

  16. Sarasua, C., Simperl, E., Noy, N.F.: CrowdMap: crowdsourcing ontology alignment with microtasks. In: Cudré-Mauroux, P., et al. (eds.) ISWC 2012. LNCS, vol. 7649, pp. 525–541. Springer, Heidelberg (2012). doi:10.1007/978-3-642-35176-1_33

    Chapter  Google Scholar 

  17. Shannon, C.E.: A mathematical theory of communication. ACM SIGMOBILE Mob. Comput. Commun. Rev. 5(1), 3–55 (2001)

    Article  MathSciNet  Google Scholar 

  18. Tarasov, A., Delany, S.J., Namee, B.M.: Dynamic estimation of worker reliability in crowdsourcing for regression tasks: making it work. In: Expert Systems with Applications 41.14, pp. 6190–6210 (2014)

    Google Scholar 

  19. Winkler, W.: String comparator metrics and enhanced decision rules in the Fellegi-Sunter model of record linkage. In: Proceedings of the Section on Survey Research Methods (American Statistical Association), pp. 354–359 (1990)

    Google Scholar 

  20. Zaveri, A., et al.: Quality assessment for linked data: a survey. Semant. Web J. 7(1), 63–93 (2016)

    Article  Google Scholar 

  21. Zaveri, A., et al.: User-driven quality evaluation of DBpedia. In: Proceedings of the 9th International Conference on Semantic Systems, pp. 97–104. ACM (2013)

    Google Scholar 

  22. Zhou, Y., Chen, X., Li, J.: Optimal PAC multiple arm identification with applications to crowdsourcing. In: Proceedings of the 31st International Conference on Machine Learning (ICML-14), pp. 217–225 (2014)

    Google Scholar 

Download references

Acknowledgement

This work has been supported in part by the Science Foundation Ireland (SFI) under grant No. SFI/12/RC/2289 and the Seventh EU Framework Programme (FP7) from ICT grant agreement No. 619660 (WATERNOMICS).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Umair ul Hassan .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

ul Hassan, U., Zaveri, A., Marx, E., Curry, E., Lehmann, J. (2016). ACRyLIQ: Leveraging DBpedia for Adaptive Crowdsourcing in Linked Data Quality Assessment. In: Blomqvist, E., Ciancarini, P., Poggi, F., Vitali, F. (eds) Knowledge Engineering and Knowledge Management. EKAW 2016. Lecture Notes in Computer Science(), vol 10024. Springer, Cham. https://doi.org/10.1007/978-3-319-49004-5_44

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-49004-5_44

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-49003-8

  • Online ISBN: 978-3-319-49004-5

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics