Transitivity, Time Consumption, and Quality of Preference Judgments in Crowdsourcing

  • Conference paper
Advances in Information Retrieval (ECIR 2017)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10193)

Abstract

Preference judgments have been shown to be a better alternative to graded judgments for assessing the relevance of documents relative to queries. Existing work has verified transitivity among preference judgments collected from trained judges, which dramatically reduces the number of judgments required. Moreover, both strict preference judgments and weak preference judgments, where the latter additionally allow judges to state that two documents are equally relevant for a given query, are widely used in the literature. However, it remains unclear whether transitivity still holds when judgments are collected via crowdsourcing, and whether the two kinds of preference judgments behave similarly in that setting. In this work, we collect judgments from multiple judges on a crowdsourcing platform and aggregate them to compare the two kinds of preference judgments in terms of transitivity, time consumption, and quality. That is, we examine whether aggregated judgments are transitive, how long it takes judges to make them, and whether judges agree with each other and with judgments from TREC. Our key finding is that only strict preference judgments are transitive; moreover, the two kinds of preference judgments behave differently in terms of transitivity, time consumption, and judgment quality.
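The setup described in the abstract, aggregating per-pair crowd votes and then testing whether the aggregated preferences form a transitive relation, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the majority-vote aggregation, the labels "a"/"b"/"tie", and the brute-force check over document triples are all assumptions made for the example.

```python
from collections import Counter
from itertools import permutations

# Hypothetical raw crowd votes per document pair for one query.
# "a": first document preferred, "b": second preferred, "tie": equally
# relevant (only allowed under weak preference judgments). All names
# and label conventions here are illustrative assumptions.
raw_votes = {
    ("d1", "d2"): ["a", "a", "b"],
    ("d2", "d3"): ["a", "tie", "a"],
    ("d1", "d3"): ["b", "b", "a"],
}

def aggregate(votes):
    """Majority vote over one pair's judgments; an exact tie in the vote
    counts falls back to 'tie' (one possible convention, not the paper's)."""
    counts = Counter(votes)
    label, freq = counts.most_common(1)[0]
    if list(counts.values()).count(freq) > 1:
        return "tie"
    return label

def preferred(agg, x, y):
    """True if x is strictly preferred to y in the aggregated judgments."""
    if (x, y) in agg:
        return agg[(x, y)] == "a"
    if (y, x) in agg:
        return agg[(y, x)] == "b"
    return False

def is_transitive(agg, docs):
    """Check that x > y and y > z implies x > z for every ordered triple."""
    for x, y, z in permutations(docs, 3):
        if preferred(agg, x, y) and preferred(agg, y, z) and not preferred(agg, x, z):
            return False
    return True

agg = {pair: aggregate(v) for pair, v in raw_votes.items()}
print(agg)
# This toy data forms a cycle (d1 > d2, d2 > d3, but d3 > d1), so the check fails.
print("transitive:", is_transitive(agg, ["d1", "d2", "d3"]))
```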

Notes

  1. http://trec.nist.gov/data/webmain.html.

  2. Queries are available at http://trec.nist.gov/data/webmain.html.

  3. http://lemurproject.org/clueweb12/index.php.

  4. http://people.mpi-inf.mpg.de/~khui/data/ecir17empirical.

  5. http://trec.nist.gov/data/docs_eng.html.

Author information

Corresponding author

Correspondence to Kai Hui.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Hui, K., Berberich, K. (2017). Transitivity, Time Consumption, and Quality of Preference Judgments in Crowdsourcing. In: Jose, J., et al. Advances in Information Retrieval. ECIR 2017. Lecture Notes in Computer Science, vol 10193. Springer, Cham. https://doi.org/10.1007/978-3-319-56608-5_19

  • DOI: https://doi.org/10.1007/978-3-319-56608-5_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-56607-8

  • Online ISBN: 978-3-319-56608-5

  • eBook Packages: Computer Science, Computer Science (R0)
