Rankboost\(+\): an improvement to Rankboost

  • Harold Connamacher
  • Nikil Pancha
  • Rui Liu
  • Soumya Ray

Abstract

Rankboost is a well-known algorithm that iteratively creates and aggregates a collection of “weak rankers” to build an effective ranking procedure. Initial work on Rankboost proposed two variants. One variant, which we call Rb-d, is designed for the scenario where all weak rankers have the binary range \(\{0,1\}\); it has good theoretical properties but does not perform well in practice. The other, which we call Rb-c, has good empirical behavior and is the recommended variant for this binary weak ranker scenario, but it lacks a theoretical grounding. In this paper, we rectify this situation by proposing an improved Rankboost algorithm for the binary weak ranker scenario, which we call Rankboost\(+\). We prove that this approach is theoretically sound and show empirically that it outperforms both Rankboost variants in practice. Further, the theory behind Rankboost\(+\) helps explain why Rb-d may not perform well in practice and why Rb-c is better behaved in the binary weak ranker scenario, as has been observed in prior work.
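
Since the abstract describes the algorithm only in words, the following is a minimal illustrative sketch, in Python, of the generic Rankboost loop with binary threshold rankers and the Rb-d style step size \(\alpha = \frac{1}{2}\ln\frac{1+r}{1-r}\). The weak learner, data layout, and all names (rankboost, score, pairs) are assumptions made for illustration, not the paper's implementation.

```python
# Illustrative sketch of the generic Rankboost loop with binary weak
# rankers h(x) = 1 if x[f] > theta else 0.  Names and the weak learner
# are assumptions for illustration, not taken from the paper.
import numpy as np

def rankboost(X, pairs, T=50):
    """X: (n, d) feature matrix.  pairs: list of (i, j) "crucial pairs"
    meaning instance j should be ranked above instance i."""
    D = np.full(len(pairs), 1.0 / len(pairs))  # distribution over pairs
    ensemble = []                              # (alpha, feature, threshold)
    for _ in range(T):
        best = None
        # Weak learner: pick the binary ranker maximizing |r|, where
        # r = sum_k D[k] * (h(x_j) - h(x_i)) over the crucial pairs.
        for f in range(X.shape[1]):
            for theta in np.unique(X[:, f]):
                h = (X[:, f] > theta).astype(float)
                r = sum(D[k] * (h[j] - h[i]) for k, (i, j) in enumerate(pairs))
                if best is None or abs(r) > abs(best[0]):
                    best = (r, f, theta)
        r, f, theta = best
        r = float(np.clip(r, -1 + 1e-12, 1 - 1e-12))  # avoid log(0)
        alpha = 0.5 * np.log((1.0 + r) / (1.0 - r))   # Rb-d style step size
        h = (X[:, f] > theta).astype(float)
        # Reweight: pairs the new ranker orders correctly lose weight,
        # mis-ordered pairs gain weight.
        D *= np.array([np.exp(alpha * (h[i] - h[j])) for i, j in pairs])
        D /= D.sum()
        ensemble.append((alpha, f, theta))
    return ensemble

def score(ensemble, x):
    """Final score H(x) = sum_t alpha_t h_t(x); sort descending to rank."""
    return sum(a * float(x[f] > th) for a, f, th in ensemble)
```

Rb-c differs in how \(\alpha\) is computed from the weighted counts of correctly ordered, mis-ordered, and tied pairs, and Rankboost\(+\) modifies the aggregation further; neither variant's exact update is reproduced in the sketch above.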

Keywords

Ranking · Boosting · Ensemble methods · Rankboost

Copyright information

© The Author(s), under exclusive licence to Springer Science+Business Media LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Department of Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, USA
