Ordinal Label Proportions

  • Rafael Poyiadzi (email author)
  • Raúl Santos-Rodríguez
  • Tijl De Bie
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11051)


In Machine Learning, it is common to distinguish different degrees of supervision, ranging from fully supervised to completely unsupervised scenarios. Lying between these extremes, the Learning from Label Proportions (LLP) setting [19] assumes that the training data is provided in the form of bags, and that the only supervision comes through the proportion of each class in each bag. In this paper, we present a novel version of the LLP paradigm in which the relationship among the classes is ordinal. While this is a highly relevant scenario (e.g. customer surveys where the results can be divided into various degrees of satisfaction), it is as yet unexplored in the literature. We refer to this setting as Ordinal Label Proportions (OLP). We formally define the scenario and introduce an efficient algorithm to tackle it. We test our algorithm on synthetic and benchmark datasets. Additionally, we present a case study examining a dataset gathered from the Research Excellence Framework, which assesses the quality of research in the United Kingdom.
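To make the bag-level supervision concrete, the following is a minimal sketch of what the training signal in the LLP setting looks like. It is not the algorithm proposed in the paper. The function name `bag_proportions`, the toy survey data, and the encoding of ordinal classes as integers 0 < 1 < 2 are all assumptions made for illustration. In practice the instance labels are hidden and only the proportions are observed; the sketch starts from labels only to show how proportions relate to them.

```python
import numpy as np

def bag_proportions(labels, bag_ids, n_classes):
    """Return an (n_bags, n_classes) array of per-bag class proportions.

    Assumes bag ids are 0..n_bags-1 and that every bag is non-empty.
    """
    labels = np.asarray(labels)
    bag_ids = np.asarray(bag_ids)
    n_bags = bag_ids.max() + 1
    counts = np.zeros((n_bags, n_classes))
    # Count how many instances of each class fall in each bag.
    np.add.at(counts, (bag_ids, labels), 1)
    # Normalise each row so that it sums to one.
    return counts / counts.sum(axis=1, keepdims=True)

# Hypothetical survey with three ordered satisfaction levels: 0 < 1 < 2.
labels = [0, 0, 1, 2, 2, 2, 1, 1]    # hidden from the learner in LLP
bag_ids = [0, 0, 0, 0, 1, 1, 1, 1]   # which bag each instance belongs to
proportions = bag_proportions(labels, bag_ids, n_classes=3)
print(proportions)
```

A learner in this setting would receive the instances, their bag membership, and `proportions`, but not `labels`. In the ordinal variant, the class indices additionally carry an order.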


Keywords: Label Proportions, Ordinal classification, Discriminant learning



The research leading to these results has received funding from the European Research Council under the European Union’s Seventh Framework Programme (FP7/2007-2013)/ERC Grant Agreement no. 615517, from the FWO (project no. G091017N, G0F9816N), and from the European Union’s Horizon 2020 research and innovation programme and the FWO under the Marie Sklodowska-Curie Grant Agreement no. 665501. Additionally, this study was supported by EPSRC and MRC through the SPHERE IRC (EP/K031910/1) and CUBOID (MC/PC/16029) respectively.


  1. Bishop, C.: Pattern Recognition and Machine Learning. Springer, Boston (2006)
  2. Chu, W., Keerthi, S.S.: Support vector ordinal regression. Neural Comput. 19(3), 792–815 (2007)
  3. Cristianini, N., Shawe-Taylor, J.: An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods. Cambridge University Press, Cambridge (2000)
  4. Dietterich, T.G., Lathrop, R.H., Lozano-Pérez, T.: Solving the multiple instance problem with axis-parallel rectangles. Artif. Intell. 89(1–2), 31–71 (1997)
  5. Frank, E., Hall, M.: A simple approach to ordinal classification. In: De Raedt, L., Flach, P. (eds.) ECML 2001. LNCS (LNAI), vol. 2167, pp. 145–156. Springer, Heidelberg (2001)
  6. Gutiérrez, P.A., Perez-Ortiz, M., Sanchez-Monedero, J., Fernandez-Navarro, F., Hervas-Martinez, C.: Ordinal regression methods: survey and experimental study. IEEE Trans. Knowl. Data Eng. 28(1), 127–146 (2016)
  7. Herbrich, R., Graepel, T., Obermayer, K.: Support vector learning for ordinal regression. In: International Conference on Artificial Neural Networks. IET (1999)
  8. Huhn, J.C., Hullermeier, E.: Is an ordinal class structure useful in classifier learning? Int. J. Data Min. Model. Manag. 1(1), 45–67 (2008)
  9. Kuck, H., de Freitas, N.: Learning about individuals from group statistics (2012). arXiv preprint: arXiv:1207.1393
  10. Li, Y.F., Tsang, I.W., Kwok, J., Zhou, Z.H.: Tighter and convex maximum margin clustering. In: Artificial Intelligence and Statistics, pp. 344–351 (2009)
  11. Maron, O., Ratan, A.L.: Multiple-instance learning for natural scene classification. In: Proceedings of the Fifteenth International Conference on Machine Learning, pp. 341–349. Morgan Kaufmann Publishers Inc., San Francisco (1998)
  12. McCullagh, P.: Regression models for ordinal data. J. R. Stat. Soc. Ser. B (Methodol.) 42, 109–142 (1980)
  13. Mika, S., Rätsch, G., Weston, J., Schölkopf, B., Müller, K.R.: Fisher discriminant analysis with kernels. In: Proceedings of the 1999 IEEE Signal Processing Society Workshop, Max-Planck-Gesellschaft, vol. 9, pp. 41–48. IEEE (1999)
  14. Quadrianto, N., Smola, A.J., Caetano, T.S., Le, Q.V.: Estimating labels from label proportions. J. Mach. Learn. Res. 10, 2349–2374 (2009)
  15. Rueping, S.: SVM classifier estimation from group probabilities. In: Proceedings of the 27th International Conference on Machine Learning, pp. 911–918 (2010)
  16. Santos-Rodríguez, R., Guerrero-Curieses, A., Aláiz-Rodríguez, R., Cid-Sueiro, J.: Cost-sensitive learning based on Bregman divergences. Mach. Learn. 76, 14 (2009)
  17. Sun, B.Y., Li, J., Wu, D.D., Zhang, X.M., Li, W.B.: Kernel discriminant learning for ordinal regression. IEEE Trans. Knowl. Data Eng. 22(6), 906–910 (2010)
  18. Xu, L., Neufeld, J., Larson, B., Schuurmans, D.: Maximum margin clustering. In: Advances in Neural Information Processing Systems, pp. 1537–1544 (2005)
  19. Yu, F.X., Choromanski, K., Kumar, S., Jebara, T., Chang, S.F.: On learning from label proportions (2014). arXiv preprint: arXiv:1402.5902
  20. Yu, F.X., Liu, D., Kumar, S., Jebara, T., Chang, S.F.: \(\propto\)SVM for learning with label proportions (2013). arXiv preprint: arXiv:1306.0886

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Rafael Poyiadzi (1, email author)
  • Raúl Santos-Rodríguez (1)
  • Tijl De Bie (2)

  1. Department of Engineering Mathematics, University of Bristol, Bristol, UK
  2. Department of Electronics and Information Systems, IDLab, Ghent University, Ghent, Belgium
