Lazy Learning for Improving Ranking of Decision Trees

  • Han Liang
  • Yuhong Yan
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4304)


Abstract

Decision tree-based probability estimation has received considerable attention because accurate probability estimates can improve both classification accuracy and probability-based ranking. In this paper, we aim to improve probability-based ranking under the decision tree paradigm, using AUC as the evaluation metric. We deploy a lazy probability estimator at each leaf to avoid assigning the same probability to every sample that falls into that leaf. More importantly, the lazy estimator gives higher weights to leaf samples that are closer to an unlabeled sample, so that the probability estimate for the unlabeled sample reflects its similarity to those leaf samples. The motivation is that ranking is a relative measure over a set of samples; it is therefore reasonable to derive the probability of an unlabeled sample from the degree of its similarity to its neighbors. The resulting decision tree model, LazyTree, outperforms C4.5, its recent improvement C4.4, and their state-of-the-art variants in AUC on a large suite of benchmark data sets.
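To make the idea concrete, the sketch below shows distance-weighted class-probability estimation at a single decision-tree leaf. It is a minimal illustration, not the paper's exact procedure: the inverse-distance kernel, Euclidean distance, Laplace smoothing, and the helper name lazy_leaf_probability are all assumptions made here for clarity.

```python
import numpy as np

def lazy_leaf_probability(x, leaf_X, leaf_y, n_classes, eps=1e-6):
    """Estimate class probabilities for an unlabeled sample x using the
    training samples that fall in the same decision-tree leaf, weighting
    each leaf sample by its similarity (inverse distance) to x rather
    than treating all leaf samples equally."""
    # Euclidean distances from x to every training sample in the leaf.
    dists = np.linalg.norm(leaf_X - x, axis=1)
    # Closer leaf samples get larger weights; eps guards against
    # division by zero when x coincides with a training sample.
    weights = 1.0 / (dists + eps)

    # Accumulate weighted class frequencies, starting from a Laplace
    # prior of 1 per class so no class receives zero probability.
    probs = np.ones(n_classes)
    for label, w in zip(leaf_y, weights):
        probs[label] += w
    return probs / probs.sum()

# Toy leaf: two samples of class 0, two of class 1. A query point near
# the class-1 cluster gets a class-1 probability well above the 0.5
# that a uniform frequency-based leaf estimate would produce.
leaf_X = np.array([[0.0, 0.0], [0.1, 0.1], [1.0, 1.0], [0.9, 1.1]])
leaf_y = np.array([0, 0, 1, 1])
print(lazy_leaf_probability(np.array([0.95, 1.0]), leaf_X, leaf_y, n_classes=2))
```

Because the estimate depends on the query point, two unlabeled samples reaching the same leaf generally receive different probabilities, which is what allows the tree to rank them against each other.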


Keywords: Decision Tree · Probability Estimator · Decision Tree Model · Unlabeled Sample · Improve Ranking





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Han Liang (1)
  • Yuhong Yan (2)

  1. Faculty of Computer Science, University of New Brunswick, Fredericton, Canada
  2. National Research Council of Canada, Fredericton, Canada
