Ranking with a P-Norm Push

  • Cynthia Rudin
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4005)


Abstract

We are interested in supervised ranking with the following twist: our goal is to design algorithms that perform especially well near the top of the ranked list, and are only required to perform sufficiently well on the rest of the list. Towards this goal, we provide a general form of convex objective that gives high-scoring examples more importance. This "push" near the top of the list can be chosen to be arbitrarily large or small. We choose ℓp-norms to provide a specific type of push; as p becomes large, the algorithm concentrates harder near the top of the list. We derive a generalization bound based on the p-norm objective. We then derive a corresponding boosting-style algorithm, and illustrate the usefulness of the algorithm through experiments on UCI data. We prove that the minimizer of the objective is unique in a specific sense.
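To make the "push" concrete: each negative example accrues a convex penalty over the positives it threatens to outrank, and that penalty is raised to the p-th power, so the highest-scoring negatives dominate the objective as p grows. Below is a minimal sketch of this idea, assuming an exponential surrogate for the misranking indicator; the function name, the surrogate choice, and the toy scores are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def p_norm_push_objective(scores_pos, scores_neg, p=4):
    """Sketch of an l_p-style 'push' ranking objective (hypothetical form).

    scores_pos: scores f(x_i) for positive examples, shape (n_pos,)
    scores_neg: scores f(x_k) for negative examples, shape (n_neg,)
    """
    # Convex penalty for each negative: an exponential surrogate for the
    # count of positives it outranks (an assumption standing in for the
    # paper's general convex loss).
    per_negative = np.array([
        np.exp(-(scores_pos - s_neg)).sum() for s_neg in scores_neg
    ])
    # Raising each negative's penalty to the p-th power concentrates the
    # objective on the highest-scoring negatives, i.e. those nearest the
    # top of the ranked list; as p grows this approaches the max.
    return np.sum(per_negative ** p)

# Toy illustration: a negative scored above every positive comes to
# dominate the objective as p grows.
pos = np.array([2.0, 1.5, 1.0])
neg = np.array([2.5, -1.0, -2.0])
for p in (1, 4, 16):
    print(f"p={p:2d}  objective={p_norm_push_objective(pos, neg, p):.3g}")
```

A boosting-style algorithm of the kind the paper derives would minimize such an objective over combinations of base ranking features, rather than evaluating it for fixed scores as in this toy usage.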



Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Cynthia Rudin
  1. Center for Neural Science and Courant Institute of Mathematical Sciences, New York University / Howard Hughes Medical Institute, New York
