Multiple Instance Learning with Bag-Level Randomized Trees

  • Tomáš Komárek
  • Petr Somol
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11051)


Knowledge discovery in databases with a flexible structure poses a great challenge to the machine learning community. Multiple Instance Learning (MIL) aims at learning from samples (called bags) represented by multiple feature vectors (called instances), as opposed to the single feature vectors characteristic of the traditional data representation. This relaxation turns out to be useful in formulating many machine learning problems, including the classification of molecules, cancer detection from tissue images, and the identification of malicious network communications. However, despite recent progress in this area, the current set of MIL tools still seems to be very application-specific and/or burdened with many tuning parameters or processing steps. In this paper, we propose a simple, yet effective tree-based algorithm for solving MIL classification problems. Empirical evaluation against 28 classifiers on 29 publicly available benchmark datasets shows a high level of performance of the proposed solution even with its default parameter settings. Data and code related to this paper are available online.
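The bag representation described above, and the general idea of lifting an instance-level test to the bag level inside a randomized tree, can be illustrated with a short sketch. This is not the paper's algorithm; it is a minimal toy example (synthetic data, a fraction-above-threshold bag feature, and an illustrative accuracy criterion) showing how a randomized bag-level split could be searched:

```python
import numpy as np

# Toy MIL dataset (hypothetical): each bag is a 2-D array of shape
# (n_instances, n_features); the label is attached to the bag, not
# to individual instances.
rng = np.random.default_rng(0)
bags = [rng.normal(loc=0.0, size=(rng.integers(3, 8), 2)) for _ in range(10)]
bags += [rng.normal(loc=2.0, size=(rng.integers(3, 8), 2)) for _ in range(10)]
labels = np.array([0] * 10 + [1] * 10)

def bag_response(bag, feature, threshold):
    # Bag-level test: fraction of the bag's instances whose value on
    # `feature` exceeds `threshold` -- one simple way to turn an
    # instance-level split into a bag-level one.
    return np.mean(bag[:, feature] > threshold)

def best_random_split(bags, labels, n_trials=50, rng=rng):
    # Extremely-randomized-trees-style search: draw random
    # (feature, threshold) pairs and keep the pair whose bag-level
    # responses best separate the two classes when cut at 0.5
    # (an illustrative criterion, not the one used in the paper).
    best = None
    for _ in range(n_trials):
        f = rng.integers(0, bags[0].shape[1])
        t = rng.normal()
        resp = np.array([bag_response(b, f, t) for b in bags])
        pred = (resp > 0.5).astype(int)
        # Score the split either way around, so accuracy is >= 0.5.
        acc = max((pred == labels).mean(), ((1 - pred) == labels).mean())
        if best is None or acc > best[0]:
            best = (acc, f, t)
    return best

acc, f, t = best_random_split(bags, labels)
```

A full tree would apply such randomized bag-level tests recursively, which is the general family of methods the paper's bag-level randomized trees belong to.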


Keywords: Multiple Instance Learning · Randomized trees · Classification



This research has been supported by the Grant Agency of the Czech Technical University in Prague, grant No. SGS16/235/OHK3/3T/13.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Faculty of Electrical Engineering, Czech Technical University in Prague, Prague 6, Czech Republic
  2. Institute of Information Theory and Automation, Czech Academy of Sciences, Prague 8, Czech Republic
  3. Faculty of Management, University of Economics, Prague, Czech Republic
