Training Deep Ranking Model with Weak Relevance Labels

  • Cheng Luo
  • Yukun Zheng
  • Jiaxin Mao
  • Yiqun Liu
  • Min Zhang
  • Shaoping Ma
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10538)

Abstract

Deep neural networks have achieved great success in a number of fields, such as computer vision, natural language processing, and speech recognition. However, comparable advances have not yet been observed in information retrieval (IR) tasks such as ad-hoc retrieval. A potential explanation is that training a document ranker for a particular IR task usually requires a large number of relevance labels, which describe the relationship between queries and documents, and such relevance judgments are usually very expensive to obtain. In this paper, we propose to train deep ranking models with weak relevance labels generated by click models from real users' click behavior. We investigate the effectiveness of weak relevance labels derived from several major click models: DBN, RCM, PSCM, TCM, and UBM. The experimental results indicate that ranking models trained with weak relevance labels are able to exploit large-scale behavior data, and that they achieve performance similar to that of a ranking model trained on relevance labels from external assessors, which are supposed to be more accurate. This preliminary finding encourages us to develop deep ranking models with weakly supervised data.
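To make the notion of a weak relevance label concrete, the following is a minimal sketch of how a click model can turn click logs into per-document scores. It is not the authors' implementation: it uses the simplified form of DBN (continuation probability fixed at 1), the function name `sdbn_weak_labels` and the session tuple layout are hypothetical, and treating a session without clicks as fully examined is an assumption of this sketch.

```python
from collections import defaultdict


def sdbn_weak_labels(sessions):
    """Estimate weak relevance labels with a simplified DBN click model.

    sessions: iterable of (query, docs, clicks), where docs is the ranked
    list of document ids and clicks is a parallel list of 0/1 flags.
    Returns {(query, doc): attractiveness * satisfaction}.
    """
    examined = defaultdict(int)   # impressions at or above the last click
    clicked = defaultdict(int)    # number of clicks on the document
    satisfied = defaultdict(int)  # times the document was the last click

    for query, docs, clicks in sessions:
        clicked_ranks = [i for i, c in enumerate(clicks) if c]
        # Assumption of this sketch: with no click, the whole list was examined.
        last = clicked_ranks[-1] if clicked_ranks else len(docs) - 1
        for rank in range(last + 1):
            key = (query, docs[rank])
            examined[key] += 1
            if clicks[rank]:
                clicked[key] += 1
        if clicked_ranks:
            satisfied[(query, docs[last])] += 1

    labels = {}
    for key, n in examined.items():
        attractiveness = clicked[key] / n
        satisfaction = satisfied[key] / clicked[key] if clicked[key] else 0.0
        labels[key] = attractiveness * satisfaction
    return labels


# Toy click log for a single query (illustrative data only).
toy_sessions = [
    ("q", ["a", "b", "c"], [1, 0, 0]),
    ("q", ["a", "b", "c"], [0, 1, 0]),
    ("q", ["a", "b", "c"], [1, 1, 0]),
]
weak_labels = sdbn_weak_labels(toy_sessions)
```

Scores of this kind could then stand in for assessor judgments as training targets for a ranking model. Documents that never appear at or above a last click, such as "c" in the toy log, receive no label here; a practical system would likely add smoothing for rarely shown documents.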

Keywords

Ranking model; Click model; Deep learning

Notes

Acknowledgement

This work is supported by the Natural Science Foundation of China (Grant Nos. 61622208, 61732008, 61532011) and the National Key Basic Research Program of China (2015CB358700).

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Cheng Luo (1)
  • Yukun Zheng (1)
  • Jiaxin Mao (1)
  • Yiqun Liu (1)
  • Min Zhang (1)
  • Shaoping Ma (1)

  1. Tsinghua National Laboratory for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University, Beijing, China
