Data Mining and Knowledge Discovery, Volume 30, Issue 2, pp 313–341

Instance-level accuracy versus bag-level accuracy in multi-instance learning

  • Gitte Vanwinckelen
  • Vinicius Tragante do O
  • Daan Fierens
  • Hendrik Blockeel

Abstract

In multi-instance learning, instances are organized into bags, and a bag is labeled positive if it contains at least one positive instance, and negative otherwise; the labels of the individual instances are not given. The task is to learn a classifier from this limited information. While the original task description involved learning an instance classifier, in the literature the task is often interpreted as learning a bag classifier. Depending on which of these two interpretations is used, it is more natural to evaluate classifiers according to how well they predict, respectively, instance labels or bag labels. In the literature, however, the two interpretations are often mixed, or the intended interpretation is left implicit. In this paper, we investigate the difference between bag-level and instance-level accuracy, both analytically and empirically. We show that there is a substantial difference between these two, and better performance on one does not necessarily imply better performance on the other. It is therefore useful to clearly distinguish the two settings, and always use the evaluation criterion most relevant for the task at hand. We show experimentally that the same conclusions hold for area under the ROC curve.
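The gap between the two criteria can be illustrated with a minimal Python sketch. It is not taken from the paper: the toy labels, the two hypothetical classifiers A and B, and the helper names (`bag_label`, `instance_accuracy`, `bag_accuracy`) are all assumptions made for illustration. The sketch further assumes that a bag-level prediction is derived from instance-level predictions with the same rule that defines the bag labels (positive if at least one instance is predicted positive).

```python
from typing import List

Bag = List[int]  # instance labels within one bag: 1 = positive, 0 = negative


def bag_label(instance_labels: Bag) -> int:
    """A bag is positive iff it contains at least one positive instance."""
    return int(any(instance_labels))


def instance_accuracy(true_bags: List[Bag], pred_bags: List[Bag]) -> float:
    """Fraction of individual instances whose label is predicted correctly."""
    pairs = [(t, p)
             for tb, pb in zip(true_bags, pred_bags)
             for t, p in zip(tb, pb)]
    return sum(t == p for t, p in pairs) / len(pairs)


def bag_accuracy(true_bags: List[Bag], pred_bags: List[Bag]) -> float:
    """Fraction of bags whose derived label is predicted correctly."""
    hits = sum(bag_label(tb) == bag_label(pb)
               for tb, pb in zip(true_bags, pred_bags))
    return hits / len(true_bags)


# Illustrative (made-up) ground truth: one positive bag, two negative bags.
true_bags = [[1, 1, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0]]

# Hypothetical classifier A: misses three positive instances in the first
# bag, yet still gets every bag label right.
pred_a = [[1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]

# Hypothetical classifier B: only two instance errors, but each one is a
# false positive that flips a negative bag to positive.
pred_b = [[1, 1, 1, 1], [1, 0, 0, 0], [0, 0, 0, 1]]

for name, pred in [("A", pred_a), ("B", pred_b)]:
    print(name,
          "instance accuracy:", round(instance_accuracy(true_bags, pred), 3),
          "bag accuracy:", round(bag_accuracy(true_bags, pred), 3))
```

On this toy data, classifier B has the higher instance-level accuracy (10 of 12 instances versus 9 of 12), while classifier A has the higher bag-level accuracy (3 of 3 bags versus 1 of 3), so the ranking of the two classifiers depends on which criterion is used.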

Keywords

Classification · Multi-instance learning · Multiple-instance learning · Classifier evaluation · Evaluation · Accuracy

Copyright information

© The Author(s) 2015

Authors and Affiliations

  • Gitte Vanwinckelen (1)
  • Vinicius Tragante do O (2, 3)
  • Daan Fierens (1)
  • Hendrik Blockeel (1)

  1. Department of Computer Science, KU Leuven, Leuven, Belgium
  2. Division of Heart and Lungs, Department of Cardiology, University Medical Center Utrecht, Utrecht, The Netherlands
  3. Division of Biomedical Genetics, Department of Medical Genetics, University Medical Center Utrecht, Utrecht, The Netherlands
