Predicting Quality of Crowdsourced Annotations Using Graph Kernels

  • Archana Nottamkandath
  • Jasper Oosterman
  • Davide Ceolin
  • Gerben Klaas Dirk de Vries
  • Wan Fokkink
Conference paper
Part of the IFIP Advances in Information and Communication Technology book series (IFIPAICT, volume 454)


Cultural Heritage institutions need to automatically assess the quality of the annotations they obtain from the crowd. Machine learning with graph kernels is an effective technique for exploiting structural information in datasets to make predictions. We employ the Weisfeiler-Lehman graph kernel for RDF to predict the quality of crowdsourced annotations in a dataset that is modelled and enriched as RDF. Our results indicate that the quality of crowdsourced annotations can be predicted with an accuracy of 75%. We also employ the kernel to understand which features of the RDF graph are relevant for predicting different categories of quality.
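To make the kernel idea concrete, the following is a minimal sketch of the general Weisfeiler-Lehman subtree kernel on small, undirected, node-labelled graphs. It is not the RDF-specific variant used in the paper, and the function names (`wl_features`, `wl_kernel`) and the graph encoding (an adjacency dict plus a label dict) are assumptions made for illustration only. It also assumes that the initial labels contain no parentheses or commas.

```python
from collections import Counter


def wl_features(adjacency, labels, iterations=2):
    """Count every node label seen during WL relabelling.

    adjacency: dict mapping node -> list of neighbouring nodes
    labels:    dict mapping node -> initial string label
    """
    current = dict(labels)
    features = Counter(current.values())  # iteration 0: the original labels
    for _ in range(iterations):
        refined = {}
        for node, neighbours in adjacency.items():
            # New label = own label plus the sorted multiset of neighbour labels.
            neighbour_labels = sorted(current[n] for n in neighbours)
            refined[node] = current[node] + "(" + ",".join(neighbour_labels) + ")"
        current = refined
        features.update(current.values())
    return features


def wl_kernel(graph_a, graph_b, iterations=2):
    """Dot product of the two label-count vectors (hypothetical helper)."""
    fa = wl_features(*graph_a, iterations=iterations)
    fb = wl_features(*graph_b, iterations=iterations)
    return sum(count * fb[label] for label, count in fa.items())


# Two toy graphs: a path a-b-a and a single edge a-b.
path = ({0: [1], 1: [0, 2], 2: [1]}, {0: "a", 1: "b", 2: "a"})
edge = ({0: [1], 1: [0]}, {0: "a", 1: "b"})
similarity = wl_kernel(path, edge, iterations=1)
```

A kernel value computed this way for every pair of graphs gives a kernel matrix, which a kernel-based classifier such as a support vector machine can take as input.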


Keywords: Trust, Machine learning, Crowdsourcing, RDF graph kernels



This publication is supported by the Dutch national program COMMIT.



Copyright information

© IFIP International Federation for Information Processing 2015

Authors and Affiliations

  • Archana Nottamkandath (1)
  • Jasper Oosterman (2)
  • Davide Ceolin (1)
  • Gerben Klaas Dirk de Vries (3)
  • Wan Fokkink (1)

  1. VU University Amsterdam, Amsterdam, The Netherlands
  2. Delft University of Technology, Delft, The Netherlands
  3. University of Amsterdam, Amsterdam, The Netherlands
