## Abstract

Rankboost is a well-known algorithm that iteratively creates and aggregates a collection of “weak rankers” to build an effective ranking procedure. Initial work on Rankboost proposed two variants. One variant, which we call Rb-d and which is designed for the scenario where all weak rankers have the binary range \(\{0,1\}\), has good theoretical properties but does not perform well in practice. The other, which we call Rb-c, has good empirical behavior and is the recommended variant for this binary weak ranker scenario, but lacks a theoretical grounding. In this paper, we rectify this situation by proposing an improved Rankboost algorithm for the binary weak ranker scenario, which we call Rankboost\(+\). We prove that this approach is theoretically sound and show empirically that it outperforms both Rankboost variants in practice. Further, the theory behind Rankboost\(+\) helps explain why Rb-d may not perform well in practice, and why Rb-c is better behaved in the binary weak ranker scenario, as has been observed in prior work.
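As background for the binary weak ranker setting discussed above, the following is a minimal sketch of the original RankBoost pairwise boosting loop (Freund et al., 2003): maintain a distribution over misordered pairs, repeatedly pick the binary weak ranker that best separates the weighted pairs, and reweight so that still-misordered pairs gain mass. The one-dimensional toy data and threshold weak rankers here are illustrative assumptions, not the experimental setup of this paper.

```python
import math

# Toy data: 1-D items; ground truth says larger value should rank higher.
items = [0.1, 0.4, 0.35, 0.8, 0.6]
# Crucial pairs (worse, better): index i should rank below index j.
pairs = [(i, j) for i in range(len(items)) for j in range(len(items))
         if items[i] < items[j]]

# Binary weak rankers with range {0, 1}: h(x) = 1 if x > theta else 0.
# (One candidate threshold per item value; a hypothetical choice.)
weak = [lambda x, t=t: 1.0 if x > t else 0.0 for t in sorted(items)]

def rankboost(items, pairs, weak, rounds=10):
    """Sketch of the RankBoost loop over binary weak rankers."""
    D = {p: 1.0 / len(pairs) for p in pairs}  # distribution over crucial pairs
    ensemble = []  # (alpha, h) terms of the final ranking function
    for _ in range(rounds):
        # Pick the weak ranker maximizing |r|, where
        # r = sum_p D(p) * (h(better) - h(worse)).
        best_h, best_r = None, 0.0
        for h in weak:
            r = sum(D[(i, j)] * (h(items[j]) - h(items[i]))
                    for (i, j) in pairs)
            if abs(r) > abs(best_r):
                best_h, best_r = h, r
        if best_h is None or abs(best_r) >= 1.0:
            break  # no useful weak ranker left, or a perfect one
        alpha = 0.5 * math.log((1 + best_r) / (1 - best_r))
        ensemble.append((alpha, best_h))
        # Reweight: pairs the chosen ranker orders correctly shrink,
        # still-misordered pairs grow; then renormalize.
        for (i, j) in pairs:
            D[(i, j)] *= math.exp(alpha * (best_h(items[i]) - best_h(items[j])))
        z = sum(D.values())
        for p in D:
            D[p] /= z
    return lambda x: sum(a * h(x) for a, h in ensemble)

H = rankboost(items, pairs, weak)
scores = [H(x) for x in items]
```

Since every chosen threshold ranker agrees with the ground-truth order here, all weights `alpha` are positive and the learned score `H` is nondecreasing in the item value.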

## Keywords

Ranking · Boosting · Ensemble methods · Rankboost
