Abstract
Multiple instance boosting (MILBoost) is a framework that combines multiple instance learning (MIL) with boosting to address problems involving weakly labeled, inexact data. This paper proposes an enhanced multiple instance boosting framework, evolutionary MILBoost (EMILBoost), which utilizes differential evolution (DE) to optimize the combination of weak classifier (weak estimator) weights in the framework. A standard MIL dataset, MUSK, and a binary classification dataset, Hastie_10_2, are used for evaluation. Results are reported in terms of bag and instance classification errors as well as the confusion matrix on test data.
Introduction
Multiple instance learning (MIL) is a type of weak supervision. It falls under the inexact supervision category, where data are given with labels that are not as exact as desired. This type of data is prevalent in the medical field, where class labels are often not available at the desired granularity [1]. Hence, MIL is particularly well suited for medical data analysis [2].
MILBoost was first proposed by Viola et al. [3], mainly for object detection in images and videos. Since then, MILBoost and many of its variants have been used for various tasks: human action recognition [4], MIL with gradient boosting for face recognition from videos [5], human detection from artificially generated 3D human models [6], multi-class MILBoost for human parts detection [7], logistic MILBoost for pedestrian detection [8], gentle MILBoost for human detection, which uses the Newton update to obtain an optimal weak classifier [9], confidence-rated MILBoost [10], online MILBoost [11], object tracking by incorporating instance significance estimation into online MILBoost [12], and online MILBoost for visual object tracking [13,14,15]. In medical applications, MILBoost has been used for early temporal prediction of Type 2 diabetes risk condition [16], liver cirrhosis classification using ultrasound images [17], and histopathology cancer image classification, segmentation and clustering [18,19,20].

The main concept behind boosting is to sequentially train several weak classifiers (weak estimators) and combine them into a strong classifier through a weighted sum, with each weak classifier assigned a weight. The main task is to find the combination of optimized weights that generates the strongest classifier. MILBoost uses the AnyBoost framework [21], in which the boosting classifier is trained by maximizing the log-likelihood of all bags. There is scope for improving the MILBoost framework by enhancing the weight optimization process through a population-based evolutionary technique instead of a single-point gradient descent technique; this also opens up the possibility of parallelizing the optimization. Evolutionary algorithms, namely the genetic algorithm (GA) [22] and differential evolution (DE) [23], have been used in MIL to formulate pooling functions [24, 25].
The main objective of this work is to formulate a MILBoost framework based on differential evolution (DE), which makes the framework amenable to parallelization of the optimization process.
The rest of the paper is divided into six sections. Section 2 elaborates on MILBoost. Section 3 gives a brief description of DE. Section 4 presents the methodology. Section 5 discusses the experiments done and the subsequent results are discussed in Sect. 6. Finally, Sect. 7 concludes the paper.
Multiple instance boosting (MILBoost)
This section presents the formal representation of MILBoost. Suppose we have binary classification data \((X_{1} ,Y_{1} ),(X_{2} ,Y_{2} ),...,(X_{n} ,Y_{n} )\) where \(X_{i} = \{ x_{i1} ,x_{i2} ,...,x_{im} \}\), i ∈ {1,2,…,n}, n is the number of bags, m is the number of instances in bag Xi and Yi ∈ {0,1}. Yi = 1 indicates that the positive bag Xi contains at least one positive instance xij (j ∈ {1,2,…,m}), while Yi = 0 means that there are no positive instances in the bag Xi. The task is to identify a real-valued function h(xij) to infer the instance label yij corresponding to an instance xij. This function is estimated through a weak classifier. Then, through boosting, weak classifiers are combined to form a strong classifier with low error

$$H(x_{ij} ) = \sum\limits_{k = 1}^{K} {\alpha_{k} h_{k} (x_{ij} )}$$ (1)

where K is the number of weak classifiers and αk are the classifier weights (estimator weights), which signify the relative importance of each weak classifier. In each phase, incorrectly classified instances receive larger weights.
In MILBoost, the probability of an instance being positive is obtained from the strong classifier score through the logistic function

$$p_{ij} = \frac{1}{{1 + \exp ( - H(x_{ij} ))}}$$ (2)
The probability that a bag is positive is modeled with the noisy-OR, since a bag is positive if at least one of its instances is positive

$$p_{i} = 1 - \prod\limits_{j = 1}^{m} {(1 - p_{ij} )}$$ (3)
The log-likelihood of all bags is

$$L = \sum\limits_{i = 1}^{n} {\left( {Y_{i} \log p_{i} + (1 - Y_{i} )\log (1 - p_{i} )} \right)}$$ (4)
The main task is to train the classifier by maximizing this log-likelihood function.
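As an illustrative sketch (not the authors' code), the noisy-OR bag probability and the log-likelihood of all bags can be computed from raw instance scores as follows; the function and variable names are assumptions made for the example:

```python
import numpy as np

def bag_log_likelihood(instance_scores, bag_labels):
    """MILBoost log-likelihood over all bags.

    instance_scores: list of 1-D arrays, one array of real-valued
    strong-classifier outputs H(x_ij) per bag.
    bag_labels: array of 0/1 bag labels Y_i.
    """
    ll = 0.0
    for scores, Y in zip(instance_scores, bag_labels):
        # instance probabilities via the logistic function
        p_ij = 1.0 / (1.0 + np.exp(-scores))
        # noisy-OR bag probability: positive if any instance is positive
        p_i = 1.0 - np.prod(1.0 - p_ij)
        # clip to avoid log(0)
        p_i = np.clip(p_i, 1e-12, 1.0 - 1e-12)
        ll += Y * np.log(p_i) + (1 - Y) * np.log(1.0 - p_i)
    return ll

# toy example: one positive bag with a confident instance, one negative bag
bags = [np.array([3.0, -2.0]), np.array([-4.0, -3.0])]
labels = np.array([1, 0])
print(bag_log_likelihood(bags, labels))
```

A perfect classifier drives this quantity towards zero from below; boosting seeks the weights that maximize it.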
Differential evolution (DE)
Differential evolution (DE) is a population-based evolutionary metaheuristic technique used for solving complex structured optimization problems in many application areas. DE was proposed by Storn and Price [23]. For a deeper treatment, the reader may refer to the survey in [26]. In general, the DE formulation is divided into two phases: initialization and evolution. The initialization phase comprises random population generation, and the evolution phase consists of mutation, crossover and selection for generating the population of the next generation. The flowchart for DE is presented in Fig. 1.
Operations in DE
Initialization
In this step, a set of uniformly distributed random individuals is generated. These represent the initial solution points in the search space

$$x_{ij}^{0} = lb_{j} + r \cdot (ub_{j} - lb_{j} )$$ (5)

where G is the generation index (G = 0 at initialization), NP is the number of individuals in the population, D is the dimension of an individual, lb and ub are the lower and upper bounds respectively, r ∈ [0,1] is a uniform random number, i ∈ {1,2,…,NP} and j ∈ {1,2,…,D}.
Mutation
After population generation, mutation is performed to explore the search space. In the mutation step, a corresponding mutant vector is generated for each target vector. DE has various mutation strategies; in this paper, the DE/rand/1 strategy is used to generate the mutant vector \(V_{i} = (v_{1i} ,v_{2i} ,...,v_{Di} )\)

$$V_{i} = X_{r1} + F \cdot (X_{r2} - X_{r3} )$$ (6)

where Vi is the mutant vector, F ∈ (0,1.2] is the scaling factor, X are individuals in the population and r1, r2, r3 ∈ {1,2,…,NP} are mutually distinct indices with r1 ≠ r2 ≠ r3 ≠ i.
Crossover
Crossover is performed between the target vector and the mutant vector to increase the diversity of the population and to assimilate the best individual. After crossover, trial vectors \(U_{i} = (u_{1i} ,u_{2i} ,...,u_{Di} )\) are generated

$$u_{ji} = \left\{ {\begin{array}{ll} {v_{ji} ,} & {{\text{if }}r \le CR{\text{ or }}j = j_{r} } \\ {x_{ji} ,} & {\text{otherwise}} \\ \end{array} } \right.$$ (7)

where CR ∈ [0,1] is the crossover probability, r ∈ [0,1] is a uniform random number and jr ∈ {1,2,…,D} is a randomly chosen index guaranteeing that at least one component is inherited from the mutant vector.
Selection
Tournament selection is performed between the trial vector and the target vector, and the one with the better fitness value moves on to the next generation

$$X_{i}^{G + 1} = \left\{ {\begin{array}{ll} {U_{i}^{G} ,} & {{\text{if }}f(U_{i}^{G} ) \le f(X_{i}^{G} )} \\ {X_{i}^{G} ,} & {\text{otherwise}} \\ \end{array} } \right.$$ (8)

where f(∙) is the objective function (assuming minimization).
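The four operations above can be combined into a minimal DE/rand/1/bin loop. The sketch below minimizes the sphere function as a stand-in objective; all parameter values and names are illustrative, not the settings used in the experiments:

```python
import numpy as np

def de_minimize(f, lb, ub, NP=20, F=0.8, CR=0.7, G=200, seed=0):
    """Minimal DE/rand/1/bin: minimize f over the box [lb, ub]^D."""
    rng = np.random.default_rng(seed)
    lb, ub = np.asarray(lb, float), np.asarray(ub, float)
    D = len(lb)
    # initialization: uniform random population in the box
    X = lb + rng.random((NP, D)) * (ub - lb)
    fit = np.apply_along_axis(f, 1, X)
    for _ in range(G):
        for i in range(NP):
            # mutation (DE/rand/1): three distinct individuals, all != i
            r1, r2, r3 = rng.choice([k for k in range(NP) if k != i],
                                    size=3, replace=False)
            V = X[r1] + F * (X[r2] - X[r3])
            # binomial crossover with one guaranteed mutant component jr
            jr = rng.integers(D)
            mask = rng.random(D) < CR
            mask[jr] = True
            U = np.clip(np.where(mask, V, X[i]), lb, ub)
            # selection: keep the better of trial and target
            fU = f(U)
            if fU <= fit[i]:
                X[i], fit[i] = U, fU
    best = np.argmin(fit)
    return X[best], fit[best]

# sphere function: global minimum 0 at the origin
x, fx = de_minimize(lambda v: np.sum(v ** 2), lb=[-5] * 3, ub=[5] * 3)
print(x, fx)
```

With these settings the population converges close to the origin, illustrating how the mutation/crossover/selection cycle drives the whole population towards the optimum.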
Methodology
DE is used in the MILBoost framework to optimize the log-likelihood of all bags as defined in Eq. (4); hence, the objective function for DE in this work is the log-likelihood function. A population of the classifier weights αk, as defined in Eq. (1), is randomly initialized. The algorithm for the proposed evolutionary MILBoost (EMILBoost) is presented below, and it is represented pictorially through the flowchart in Fig. 2.
As mentioned earlier, DE paves the way for parallelization of the optimization process. Unlike single-point gradient techniques, DE, being a population-based metaheuristic, approaches the optimum from several directions simultaneously: it generates multiple initial solutions across the search space and then converges towards the optimal point. Since the candidate solutions can be evaluated independently, the optimization process is naturally parallelizable.
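The core idea can be sketched at the instance level as follows. This is an illustrative simplification, not the authors' implementation: it uses SciPy's `differential_evolution` in place of a custom DE loop, trains decision stumps on bootstrap resamples, and optimizes their weights by minimizing the negative log-likelihood; the dataset, stump training scheme and weight bounds are all assumptions made for the example:

```python
import numpy as np
from scipy.optimize import differential_evolution
from sklearn.datasets import make_hastie_10_2
from sklearn.tree import DecisionTreeClassifier

# toy instance-level data standing in for the MIL bags
X, y = make_hastie_10_2(n_samples=2000, random_state=0)
y01 = (y > 0).astype(float)  # map {-1, +1} labels to {0, 1}

# train K decision stumps on bootstrap resamples (illustrative scheme)
K = 10
rng = np.random.default_rng(0)
H = np.zeros((len(y), K))
for k in range(K):
    idx = rng.choice(len(y), size=len(y), replace=True)
    stump = DecisionTreeClassifier(max_depth=1).fit(X[idx], y[idx])
    H[:, k] = stump.predict(X)  # stump outputs in {-1, +1}

def neg_log_likelihood(alpha):
    # strong classifier score: weighted sum of weak classifier outputs
    score = H @ alpha
    p = np.clip(1.0 / (1.0 + np.exp(-score)), 1e-12, 1 - 1e-12)
    return -np.sum(y01 * np.log(p) + (1 - y01) * np.log(1 - p))

# DE searches the weight space instead of single-point gradient descent
result = differential_evolution(neg_log_likelihood, bounds=[(0, 2)] * K,
                                maxiter=100, seed=0, polish=False)
train_acc = np.mean(np.sign(H @ result.x) == y)
print(result.fun, train_acc)
```

In the full EMILBoost framework the objective would be the bag-level log-likelihood of Eq. (4) with bag pooling, rather than this instance-level surrogate.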
Experiments
Data
For this work, two classic MIL benchmark datasets are used, MUSK1 and MUSK2 [27], which are available in the UCI Machine Learning Repository [1]. They correspond to the problem of predicting drug activity: a molecule has the desired drug effect if and only if one or more of its conformations bind to the target binding site. Since molecules can adopt multiple shapes, a bag is made up of the shapes belonging to the same molecule. MUSK1 and MUSK2 contain 476 and 6598 instances, respectively. MUSK2 is used as training data since it contains the greater number of instances, and MUSK1 is used as testing data. Both datasets have 168 attributes in total, of which 166 are features. The attribute information is given in Table 1.
Apart from the aforementioned datasets, Hastie_10_2, a standard binary classification dataset used to test boosting frameworks [28], is also used; it is available in the scikit-learn dataset library [29]. The Hastie_10_2 dataset has 10 attributes \({X}_{1},{X}_{2},\dots ,{X}_{10}\) which are standard independent Gaussian variates. The class is defined as \(Y = 1\) if \(\sum\nolimits_{j = 1}^{10} {X_{j}^{2} } > 9.34\) (the median of a chi-squared distribution with 10 degrees of freedom) and \(Y = - 1\) otherwise.
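The Hastie_10_2 labels follow the rule y = 1 if the sum of squared features exceeds 9.34 and y = −1 otherwise, which can be checked directly against the scikit-learn generator (a quick sketch):

```python
import numpy as np
from sklearn.datasets import make_hastie_10_2

X, y = make_hastie_10_2(n_samples=1000, random_state=42)

# recover the labels from the rule: y = 1 iff sum_j X_j^2 > 9.34
recovered = np.where((X ** 2).sum(axis=1) > 9.34, 1.0, -1.0)
print(X.shape, np.array_equal(recovered, y))  # (1000, 10) True
```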
Experimental setup
Decision tree classifiers with a maximum depth of 1 (decision stumps) are used as the weak classifiers, and log-sum-exp pooling is used for bag pooling. For implementation convenience, the negative of the log-likelihood function is taken and minimized, which is equivalent to maximizing the original function. The number of weak classifiers is varied to check its effect on training, i.e., K ∈ {10, 15, 20, 25, 30, 35, 40}. The DE parameters are: crossover probability = 0.7, mutation strategy = best/1/bin, scaling factor ∈ [0.5, 1], number of generations = 1000 and population size = 20.
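Log-sum-exp pooling acts as a smooth, differentiable approximation to the maximum over the instance scores in a bag. A minimal sketch of one common mean-based form follows; the exact variant and sharpness parameter used in the experiments are assumptions:

```python
import numpy as np

def log_sum_exp_pool(scores, r=10.0):
    """Smooth approximation of max over the instance scores in a bag.

    As the sharpness r grows, the pooled value approaches max(scores);
    shifting by the maximum keeps the exponentials numerically stable.
    """
    scores = np.asarray(scores, float)
    m = scores.max()
    return m + np.log(np.mean(np.exp(r * (scores - m)))) / r

bag = [0.2, 0.9, 0.4]
print(log_sum_exp_pool(bag))  # close to max(bag) = 0.9
```

Unlike a hard max, this pooling lets every instance contribute a gradient, which is why it is a popular choice for bag aggregation in MIL.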
Hardware and software specifications
Experiments have been conducted on Spyder 4.2.0 Integrated Development Environment (IDE) with Python 3.7.9 through Anaconda distribution on an Intel Xeon 2.5 GHz system with 16 GB RAM, Nvidia Quadro 2000 GPU and 64-bit Windows 10 Operating System.
Evaluation metrics
As this is a classification problem, the standard training error, testing error and confusion matrix of the testing data are used as evaluation metrics. In MIL, the model is evaluated on the basis of bag classification accuracy; therefore, bag training error and bag testing error are used here.
Results and discussions
The results of the proposed EMILBoost are compared with two other boosting frameworks, GentleBoost and LogitBoost [30]. Tables 2 and 3 record the bag testing and training errors for the MUSK dataset. Bag testing and training error rate (convergence) curves for the different boosting frameworks are presented in Figs. 3 and 4 for the MUSK and Hastie_10_2 datasets, respectively. The training and testing error rate curves for different numbers of weak classifiers are presented in Figs. 5 and 6 for the MUSK and Hastie_10_2 datasets, respectively, while confusion matrices for different numbers of weak classifiers are presented in Fig. 7 for the MUSK dataset.
From Tables 2 and 3, it is clear that EMILBoost achieves the lowest errors and hence outperforms GentleBoost and LogitBoost. Figures 3 and 4 also establish the superiority of EMILBoost.
From Figs. 5 and 6, it can be inferred that increasing the number of weak classifiers improves the learning process, i.e., yields lower error.
In the confusion matrix, the upper left block signifies the true positives (TP) and the lower right the true negatives (TN), while the lower left signifies the false positives (FP) and the upper right the false negatives (FN). The main aim of a classifier is to maximize TP + TN and minimize FP + FN. From Fig. 7, it can be seen that for the EMILBoost framework, \(TP+TN>FP+FN\); hence, the framework performs as desired.
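The block layout described above can be reproduced in a few lines; the ground-truth and predicted labels here are toy values, illustrative only:

```python
import numpy as np

# toy ground-truth and predicted bag labels (illustrative only)
y_true = np.array([1, 1, 0, 0, 1, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 1, 0, 1, 0])

TP = int(np.sum((y_true == 1) & (y_pred == 1)))  # upper left
TN = int(np.sum((y_true == 0) & (y_pred == 0)))  # lower right
FP = int(np.sum((y_true == 0) & (y_pred == 1)))  # lower left
FN = int(np.sum((y_true == 1) & (y_pred == 0)))  # upper right

# block layout used here: TP upper left, TN lower right
cm = np.array([[TP, FN],
               [FP, TN]])
print(cm)  # [[3 1]
           #  [1 3]]
```

Note that scikit-learn's `confusion_matrix` uses the opposite convention by default, with TN in the upper left, so the two layouts should not be compared cell-by-cell.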
Conclusion
The main aim of this paper was to enhance the MILBoost framework through DE, a population-based evolutionary metaheuristic method by optimizing the weak classifier weights. DE also paves the path to parallelize this optimization process. The results show that the proposed EMILBoost outperforms GentleBoost and LogitBoost. Increasing the number of weak classifiers improves the learning process while on the other hand it increases the learning time. A trade-off between these two is needed through optimizing the number of weak classifiers which is a multi-objective problem. This can be regarded as the future extension of this work.
Availability of data and material
Dua and Graff [1].
Code availability
Not applicable.
References
Dua D, Graff C (2019) UCI Machine Learning Repository. Irvine, CA: University of California, School of Information and Computer Science. http://archive.ics.uci.edu/ml. Accessed 20 Nov 2020
Quellec G, Cazuguel G, Cochener B, Lamard M (2017) Multiple-instance learning for medical image and video analysis. IEEE Rev Biomed Eng 10:213–234. https://doi.org/10.1109/RBME.2017.2651164
Viola P, Platt JC, Zhang C (2005) Multiple Instance Boosting for Object Detection. In: Advances in Neural Information Processing Systems. pp 1417–1424
Zhu S, Song D (2014) Human action recognition based on multiple instance learning. J Appl Sci 14:2276–2284. https://doi.org/10.3923/jas.2014.2276.2284
Wohlhart P, Köstinger M, Roth PM, Bischof H (2011) Multiple instance boosting for face recognition in videos. Lecture notes in computer science (including subseries Lecture notes in artificial intelligence and lecture notes in bioinformatics). Springer, Berlin, Heidelberg, pp 132–141
Yamauchi Y, Fujiyoshi H (2011) Automatic generation of training samples and a learning method based on advanced MILBoost for human detection. In: 1st Asian Conference on Pattern Recognition, ACPR 2011. pp 603–607
Chen YT, Chen CS, Hung YP, Chang KY (2009) Multi-class multi-instance boosting for part-based human detection. In: 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops 2009. pp 1177–1184
Pang J, Huang Q, Jiang S, Gao W (2008) Pedestrian detection via logistic multiple instance boosting. In: Proceedings - International Conference on Image Processing, ICIP. pp 1464–1467
Shen J, Yang W, Sun C (2013) Real-time human detection based on gentle MILBoost with variable granularity HOG-CSLBP. Neural Comput Appl 23:1937–1948. https://doi.org/10.1007/s00521-012-1153-5
Ali K, Saenko K (2014) Confidence-rated multiple instance boosting for object detection. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. IEEE Computer Society, pp 2433–2440
Qi Z, Xu Y, Wang L, Song Y (2011) Online multiple instance boosting for object detection. Neurocomputing 74:1769–1775. https://doi.org/10.1016/j.neucom.2011.02.011
Liu J, Lu Y, Zhou T (2016) Instance significance guided multiple instance boosting for robust visual tracking. In: Proceedings—International Conference on Image Processing, ICIP. IEEE Computer Society, pp 1694–1698
Zhang K, Song H (2013) Real-time visual tracking via online weighted multiple instance learning. Pattern Recognit 46:397–411. https://doi.org/10.1016/j.patcog.2012.07.013
Babenko B, Yang MH, Belongie S (2011) Robust object tracking with online multiple instance learning. IEEE Trans Pattern Anal Mach Intell 33:1619–1632. https://doi.org/10.1109/TPAMI.2010.226
Babenko B, Yang M-H, Belongie S (2010) Visual tracking with online Multiple Instance Learning. In: IEEE Conference on Computer Vision and Pattern Recognition. Institute of Electrical and Electronics Engineers (IEEE), pp 983–990
Bernardini M, Morettini M, Romeo L et al (2020) Early temporal prediction of type 2 diabetes risk condition from a general practitioner electronic health record: a multiple instance boosting approach. Artif Intell Med 105:101847. https://doi.org/10.1016/j.artmed.2020.101847
Fujita Y, Mitani Y, Hamamoto Y, et al (2016) Training ROI selection based on MILBoost for liver cirrhosis classification using ultrasound images. In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Springer, pp 451–459
Xu Y, Zhu JY, Chang E, Tu Z (2012) Multiple clustered instance learning for histopathology cancer image classification, segmentation and clustering. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. pp 964–971
Xu Y, Zhu JY, Chang EIC et al (2014) Weakly supervised histopathology cancer image segmentation and classification. Med Image Anal 18:591–604. https://doi.org/10.1016/j.media.2014.01.010
Li W, Zhang J, McKenna SJ (2015) Multiple instance cancer detection by boosting regularised trees. In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Springer, pp 645–652
Mason L, Bartlett P, Baxter J, Frean M (1999) Boosting algorithms as gradient descent. In: Advances in Neural Information Processing Systems. pp 512–518
Holland JH (1992) Genetic algorithms. Sci Am 267:66–73. https://doi.org/10.2307/24939139
Storn R, Price K (1997) Differential evolution—a simple and efficient heuristic for global optimization over continuous spaces. J Glob Optim 11:341–359. https://doi.org/10.1023/A:1008202821328
Bhattacharjee K, Pant M, Zhang Y-D, Satapathy SC (2020) Multiple instance learning with genetic pooling for medical data analysis. Pattern Recognit Lett. https://doi.org/10.1016/j.patrec.2020.02.025
Bhattacharjee K, Tiwari A, Pant M, Ahn CW (2020) A pooling function based on differential evolution for multiple instance learning. In: Proceedings of 9th International Conference on Smart Media and Applications (SMA 2020). Jeju, South Korea
Bilal, Pant M, Zaheer H et al (2020) Differential evolution: a review of more than two decades of research. Eng Appl Artif Intell 90:103479. https://doi.org/10.1016/j.engappai.2020.103479
Dietterich TG, Lathrop RH, Lozano-Pérez T (1997) Solving the multiple instance problem with axis-parallel rectangles. Artif Intell 89:31–71. https://doi.org/10.1016/s0004-3702(96)00034-3
Hastie T, Tibshirani R, Friedman J (2009) The elements of statistical learning. Springer, New York
Hastie_10_2. https://scikit-learn.org/stable/modules/generated/sklearn.datasets.make_hastie_10_2.html. Accessed 5 Mar 2021
Friedman J, Hastie T, Tibshirani R (2000) Additive logistic regression: a statistical view of boosting. Ann Stat 28:337–407
Funding
Not applicable.
Ethics declarations
Conflict of interest
The authors declare that there is no conflict of interest or competing interests regarding the publication of this article.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Bhattacharjee, K., Pant, M. & Srivastava, S. Evolutionary multiple instance boosting framework for weakly supervised learning. Complex Intell. Syst. 8, 3131–3141 (2022). https://doi.org/10.1007/s40747-021-00469-9