
One-Class Quantification

  • Denis dos Reis
  • André Maletzke
  • Everton Cherman
  • Gustavo Batista
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11051)

Abstract

This paper proposes one-class quantification, a new Machine Learning task. Quantification estimates the class distribution of an unlabeled sample of instances. As in one-class classification, we assume that only a sample of examples of a single class is available for learning, and we are interested in counting the cases of that class in a test set. We formulate, for the first time, one-class quantification methods and assess them in a comprehensive open-set evaluation. In an open-set problem, several “subclasses” represent the negative class, and we cannot assume to have enough observations for all of them at training time. Therefore, new classes may appear after deployment, making this a challenging setup for existing quantification methods. We show that our proposals are simple and more accurate than the state of the art in quantification. Finally, the approaches are very efficient, suiting both batch and data stream applications. Code related to this paper is available at: https://github.com/denismr/One-class-Quantification.
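To make the task setup concrete, the following is a minimal, hypothetical sketch (not the methods proposed in the paper): it fits a one-class SVM (scikit-learn's OneClassSVM) on positive-only examples, then estimates the positive prevalence in an unlabeled test sample by classify-and-count, with a simple correction by the true-positive rate measured on held-out positives. All data, parameters, and the assumption of a negligible false-positive rate are illustrative.

```python
# Minimal one-class quantification baseline sketch (illustrative only;
# not the methods proposed in the paper).
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.RandomState(0)

# Hypothetical data: positives come from one Gaussian; the test sample
# mixes positives with observations from an unseen "negative" subclass.
X_pos = rng.normal(loc=0.0, scale=1.0, size=(600, 2))
X_pos_train, X_pos_val = X_pos[:400], X_pos[400:]
X_test = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(300, 2)),  # positives
    rng.normal(loc=4.0, scale=1.0, size=(700, 2)),  # unseen negatives
])

# Train on positive examples only.
model = OneClassSVM(nu=0.1, gamma="scale").fit(X_pos_train)

# Classify-and-count: fraction of test instances predicted as inliers (+1).
cc_estimate = np.mean(model.predict(X_test) == 1)

# Correct for the model's miss rate using the TPR on held-out positives.
# The false-positive rate is unknown without negatives, so it is assumed
# negligible here -- a strong, stated assumption.
tpr = np.mean(model.predict(X_pos_val) == 1)
adjusted_estimate = min(1.0, cc_estimate / max(tpr, 1e-12))

print(f"classify-and-count: {cc_estimate:.3f}, TPR-adjusted: {adjusted_estimate:.3f}")
```

In this toy mixture, the true positive prevalence is 0.3; the adjusted count compensates for positives that the one-class model rejects, at the cost of ignoring false positives from unseen classes.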

Keywords

One-class quantification · Counting · Open-set recognition

Acknowledgement

The authors thank CAPES (PROEX-6909543/D), CNPq (306631/2016-4) and FAPESP (2016/04986-6). This material is based upon work supported by the United States Agency for International Development under Grant No. AID-OAA-F-16-00072.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Universidade de São Paulo, São Carlos, Brazil
