Why is quantification an interesting learning problem?

González, Pablo; Díez, Jorge; Chawla, Nitesh; del Coz, Juan José

doi:10.1007/s13748-016-0103-3

Regular Paper
Published: 30 September 2016

Progress in Artificial Intelligence, Volume 6, pages 53–58 (2017)

Pablo González¹, Jorge Díez¹, Nitesh Chawla²,³ & Juan José del Coz¹,²

Abstract

Some real applications do not require classifying or making predictions about individual objects, but rather estimating some magnitude about a group of them. Sentiment analysis and opinion mining provide one example. Some applications need to classify opinions as positive or negative, but others, sometimes even more useful, only need an estimate of the proportion of each class during a given period of time: “How many tweets about our new product were positive yesterday?” Practitioners should apply quantification algorithms to tackle this kind of problem, instead of just using off-the-shelf classification methods, because classifiers are suboptimal in the context of quantification tasks. Unfortunately, quantification learning is still a relatively underexplored area in machine learning. The goal of this paper is to show that quantification learning is an interesting open problem. To support its benefits, we present an application to the analysis of Twitter comments in which even the simplest quantification methods outperform classification approaches.
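The gap between classifying and quantifying can be illustrated with a minimal sketch that is not taken from the paper. It compares two simple, well-known quantifiers: Classify and Count, which reports the fraction of items predicted positive, and the Adjusted Count correction discussed by Forman [8], which rescales that fraction using the classifier's true and false positive rates. The function names, the validation data, and the tweet counts below are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
def classify_and_count(predictions):
    """Prevalence estimate: fraction of items predicted positive (1)."""
    return sum(predictions) / len(predictions)


def estimate_rates(y_true, y_pred):
    """Return (tpr, fpr) measured on labelled validation data."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    false_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return true_pos / positives, false_pos / negatives


def adjusted_count(predictions, tpr, fpr):
    """Adjusted Count: correct the Classify and Count estimate with tpr/fpr."""
    cc = classify_and_count(predictions)
    if tpr == fpr:
        # The classifier carries no information; no correction is possible.
        return cc
    corrected = (cc - fpr) / (tpr - fpr)
    # Clip to [0, 1], since the raw correction can leave that range.
    return min(1.0, max(0.0, corrected))


# Hypothetical balanced validation set: 100 positives, then 100 negatives.
# The assumed classifier finds 80 of the positives and mislabels 10 negatives.
val_true = [1] * 100 + [0] * 100
val_pred = [1] * 80 + [0] * 20 + [1] * 10 + [0] * 90
tpr, fpr = estimate_rates(val_true, val_pred)

# Hypothetical test day: 1000 tweets, of which 200 (20%) are truly positive.
# With the same error rates, the classifier outputs 160 + 80 positive labels.
test_pred = [1] * 160 + [0] * 40 + [1] * 80 + [0] * 720

cc_estimate = classify_and_count(test_pred)
ac_estimate = adjusted_count(test_pred, tpr, fpr)

print(f"true prevalence:    {0.2:.3f}")
print(f"Classify and Count: {cc_estimate:.3f}")
print(f"Adjusted Count:     {ac_estimate:.3f}")
```

Under these assumptions the classifier is reasonably accurate on individual tweets, yet counting its positive predictions overestimates the share of positive tweets, because false positives among the many negatives outnumber the missed positives. Adjusted Count recovers the true proportion here only because the test set was built with exactly the error rates measured on validation data; on real data the rates are estimates and the correction is approximate.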

Notes

All the details can be found in [8]

References

1. Barranquero, J., González, P., Díez, J., del Coz, J.J.: On the study of nearest neighbour algorithms for prevalence estimation in binary problems. Pattern Recognit. 46(2), 472–482 (2013)
2. Barranquero, J., Díez, J., del Coz, J.J.: Quantification-oriented learning based on reliable classifiers. Pattern Recognit. 48(2), 591–604 (2015)
3. Beijbom, O., Hoffman, J., Yao, E., Darrell, T., Rodriguez-Ramirez, A., Gonzalez-Rivero, M., Guldberg, O.H.: Quantification in-the-wild: data-sets and baselines. In: NIPS 2015, Workshop on Transfer and Multi-Task Learning. Montreal, CA (2015)
4. Bella, A., Ferri, C., Hernández-Orallo, J., Ramírez-Quintana, M.: Quantification via probability estimators. In: Proc. of the 10th IEEE International Conference on Data Mining, pp. 737–742 (2010)
5. Esuli, A., Sebastiani, F.: Sentiment quantification. IEEE Intell. Syst. 25(4), 72–75 (2010)
6. Esuli, A., Sebastiani, F.: Optimizing text quantifiers for multivariate loss functions. ACM Trans. Knowl. Discov. Data 9(4), 27:1–27:27 (2015)
7. Fawcett, T., Flach, P.: A response to Webb and Ting’s on the application of ROC analysis to predict classification performance under varying class distributions. Mach. Learn. 58(1), 33–38 (2005)
8. Forman, G.: Quantifying counts and costs via classification. Data Mining Knowl. Discov. 17(2), 164–206 (2008)
9. Forman, G., Kirshenbaum, E., Suermondt, J.: Pragmatic text mining: minimizing human effort to quantify many issues in call logs. In: Proceedings of ACM SIGKDD’06, ACM, pp. 852–861 (2006)
10. Garcia, S., Herrera, F.: An extension on statistical comparisons of classifiers over multiple data sets for all pairwise comparisons. J. Mach. Learn. Res. 9, 2677–2694 (2008)
11. Go, A., Bhayani, R., Huang, L.: Twitter sentiment classification using distant supervision. CS224N Project Report, Stanford 1:12 (2009)
12. González-Castro, V., Alaiz-Rodríguez, R., Alegre, E.: Class distribution estimation based on the Hellinger distance. Inf. Sci. 218, 146–164 (2013)
13. Latinne, P., Saerens, M., Decaestecker, C.: Adjusting the outputs of a classifier to new a priori probabilities may significantly improve classification accuracy: Evidence from a multi-class problem in remote sensing. In: Proceedings of ICML’01, M. Kaufmann, pp. 298–305 (2001)
14. Milli, L., Monreale, A., Rossetti, G., Giannotti, F., Pedreschi, D., Sebastiani, F.: Quantification trees. In: IEEE International Conference on Data Mining (ICDM’13), pp. 528–536 (2013)
15. Milli, L., Monreale, A., Rossetti, G., Pedreschi, D., Giannotti, F., Sebastiani, F.: Quantification in social networks. In: IEEE International Conference on Data Science and Advanced Analytics (DSAA 2015), pp. 1–10 (2015)
16. Pérez-Gallego, P., Quevedo, J.R., del Coz, J.J.: Using ensembles for problems with characterizable changes in data distribution: a case study on quantification. Inf. Fusion 34, 87–100 (2017)
17. Rakthanmanon, T., Keogh, E., Lonardi, S., Evans, S.: MDL-based time series clustering. Knowl. Inf. Syst. 33(2), 371–399 (2012)
18. Saif, H., Fernández, M., He, Y., Alani, H.: Evaluation datasets for Twitter sentiment analysis: a survey and a new dataset, the STS-Gold. In: 1st International Workshop on Emotion and Sentiment in Social and Expressive Media: Approaches and Perspectives from AI (ESSEM 2013) (2013)
19. Tasche, D.: Exact fit of simple finite mixture models. J. Risk Financial Manag. 7(4), 150–164 (2014)

Acknowledgments

This research was funded by MINECO (the Spanish Ministerio de Economía y Competitividad) and FEDER (Fondo Europeo de Desarrollo Regional), Grant TIN2015-65069-C2-2-R. Juan José del Coz is also supported by the Fulbright Commission and the Salvador de Madariaga Program, Grant PRX15/00607. This paper was written during the stay of Juan José del Coz at the University of Notre Dame.

Author information

Authors and Affiliations

Artificial Intelligence Center, University of Oviedo at Gijón, Oviedo, Spain
Pablo González, Jorge Díez & Juan José del Coz
Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, 46556, USA
Nitesh Chawla & Juan José del Coz
Interdisciplinary Center for Network Science and Applications, University of Notre Dame, Notre Dame, IN, 46556, USA
Nitesh Chawla

Corresponding author

Correspondence to Juan José del Coz.

About this article

Cite this article

González, P., Díez, J., Chawla, N. et al. Why is quantification an interesting learning problem? Prog Artif Intell 6, 53–58 (2017). https://doi.org/10.1007/s13748-016-0103-3

Received: 11 August 2016
Accepted: 19 September 2016
Published: 30 September 2016
Issue Date: March 2017
DOI: https://doi.org/10.1007/s13748-016-0103-3
