Abstract
Crowdsourcing has emerged as a powerful paradigm for the quality assessment and improvement of Linked Data. A major challenge in employing crowdsourcing for Linked Data quality assessment is the cold-start problem: how can the reliability of crowd workers be estimated, and the most reliable workers assigned to tasks? We address this challenge with a novel approach that generates test questions from DBpedia based on the topics associated with quality assessment tasks. These test questions are used to estimate the reliability of new workers; tasks are then dynamically assigned to reliable workers to improve the accuracy of the collected responses. Our approach, ACRyLIQ, is evaluated on two real-world Linked Data datasets with workers hired from Amazon Mechanical Turk. We validate it in terms of accuracy and compare it against a baseline that estimates worker reliability using gold-standard tasks. The results demonstrate that our approach achieves high accuracy without requiring gold-standard tasks.
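To make the workflow concrete, the following is a minimal Python sketch of the two steps the abstract describes: estimating a new worker's reliability from topic-specific test questions derived from DBpedia facts, and routing tasks to the most reliable workers. All names, the question pool, the accuracy-based reliability estimate, and the threshold rule are hypothetical illustrations, not the paper's actual algorithms.

    # Illustrative sketch only: question pool, reliability estimate,
    # and threshold rule are hypothetical stand-ins for the paper's
    # actual question-generation and assignment methods.

    TEST_QUESTIONS = {
        # Topic -> list of (subject, predicate, object, is_correct_fact).
        # Workers judge whether each DBpedia triple is a fact.
        "geography": [
            ("Berlin", "capital", "Germany", True),
            ("Paris", "capital", "Italy", False),
        ],
        "music": [
            ("Thriller", "artist", "Michael_Jackson", True),
            ("Abbey_Road", "artist", "Queen", False),
        ],
    }

    def estimate_reliability(topic, answers):
        """Estimate a new worker's reliability for a topic as the
        fraction of topic-specific test questions answered correctly."""
        questions = TEST_QUESTIONS[topic]
        correct = sum(1 for q, a in zip(questions, answers) if a == q[3])
        return correct / len(questions)

    def assign_task(topic, reliability, threshold=0.8):
        """Assign a quality-assessment task to the most reliable worker
        for its topic, if any worker exceeds the threshold."""
        eligible = {w: scores[topic]
                    for w, scores in reliability.items()
                    if scores.get(topic, 0.0) >= threshold}
        return max(eligible, key=eligible.get) if eligible else None

    # Example: worker "w2" is the stronger judge of music facts.
    reliability = {
        "w1": {"geography": estimate_reliability("geography", [True, False])},
        "w2": {"music": estimate_reliability("music", [True, False])},
    }
    print(assign_task("music", reliability))  # -> w2

In the paper's setting, the test questions would be generated automatically from DBpedia triples matching a task's topics, rather than hand-written as above.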
Notes
2. A DBpedia triple is considered a fact.
7. As of 2014: http://wiki.dbpedia.org/about.
Acknowledgement
This work has been supported in part by Science Foundation Ireland (SFI) under grant No. SFI/12/RC/2289 and by the Seventh EU Framework Programme (FP7) under ICT grant agreement No. 619660 (WATERNOMICS).
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
ul Hassan, U., Zaveri, A., Marx, E., Curry, E., Lehmann, J. (2016). ACRyLIQ: Leveraging DBpedia for Adaptive Crowdsourcing in Linked Data Quality Assessment. In: Blomqvist, E., Ciancarini, P., Poggi, F., Vitali, F. (eds) Knowledge Engineering and Knowledge Management. EKAW 2016. Lecture Notes in Computer Science, vol. 10024. Springer, Cham. https://doi.org/10.1007/978-3-319-49004-5_44
DOI: https://doi.org/10.1007/978-3-319-49004-5_44
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-49003-8
Online ISBN: 978-3-319-49004-5
eBook Packages: Computer Science, Computer Science (R0)