Indexing Multiple-Instance Objects

  • Linfei Zhou
  • Wei Ye
  • Zhen Wang
  • Claudia Plant
  • Christian Böhm
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10439)


Multiple-Instance Learning (MIL) is an actively investigated topic in machine learning, with many proposed solutions, both supervised and unsupervised. We introduce an indexing technique that supports efficient queries on Multiple-Instance (MI) objects. The technique has a dynamic structure that supports efficient insertions and deletions, and it is based on an effective similarity measure for MI objects. Some MIL approaches have proposed their own similarity measures for MI objects, but these either do not use all of the available information or are time-consuming to compute. In this paper, we use two joint Gaussian-based measures for MIL: Joint Gaussian Similarity (JGS) and Joint Gaussian Distance (JGD). Both are based on intuitive definitions and take all of the information into account while remaining robust to noise. For JGS, we propose the Instance based Index for querying MI objects. For JGD, metric trees can be used directly as the index because JGD has metric properties. Extensive experimental evaluations on various synthetic and real-world data sets demonstrate the effectiveness and efficiency of the similarity measures and the performance of the corresponding index structures.
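To illustrate why metric properties matter for indexing, the following minimal sketch shows the triangle-inequality pruning on which metric trees rely. It is not the paper's index and does not implement JGD: the Hausdorff distance is used only as a stand-in metric between bags of instances, and the names `hausdorff`, `build_pivot_table`, and `range_query` are hypothetical.

```python
import math


def hausdorff(a, b):
    """Hausdorff distance between two non-empty bags of points.

    Stand-in metric for illustration only; this is NOT JGD.
    """
    def directed(x, y):
        return max(min(math.dist(p, q) for q in y) for p in x)

    return max(directed(a, b), directed(b, a))


def build_pivot_table(objects, pivot, dist):
    """Precompute the distance from every stored object to one pivot."""
    return [dist(obj, pivot) for obj in objects]


def range_query(objects, pivot, pivot_dists, query, radius, dist):
    """Return (hits, evaluated): objects within `radius` of `query`, and
    the number of object distances that actually had to be computed."""
    d_qp = dist(query, pivot)
    hits, evaluated = [], 0
    for obj, d_op in zip(objects, pivot_dists):
        # Triangle inequality: dist(query, obj) >= |d_qp - d_op|.
        # If that lower bound already exceeds the radius, skip the object.
        if abs(d_qp - d_op) > radius:
            continue
        evaluated += 1
        if dist(query, obj) <= radius:
            hits.append(obj)
    return hits, evaluated


# Toy data: each MI object is a bag (list) of 2-D instances.
bags = [
    [(0.0, 0.0), (1.0, 0.0)],
    [(0.0, 1.0), (1.0, 1.0)],
    [(10.0, 10.0), (11.0, 10.0)],
    [(20.0, 20.0), (21.0, 20.0)],
]
pivot = bags[0]
table = build_pivot_table(bags, pivot, hausdorff)
query = [(0.0, 0.5), (1.0, 0.5)]
hits, evaluated = range_query(bags, pivot, table, query, 1.0, hausdorff)
```

The pruning step is only sound when the measure obeys the triangle inequality, which is why a metric such as JGD can be plugged into a metric tree directly, whereas a similarity measure without that property needs a dedicated index.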



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Linfei Zhou (1)
  • Wei Ye (1)
  • Zhen Wang (2)
  • Claudia Plant (3)
  • Christian Böhm (1)
  1. Ludwig-Maximilians-Universität München, Munich, Germany
  2. University of Electronic Science and Technology of China, Chengdu, China
  3. University of Vienna, Vienna, Austria
