Abstract
To reliably acquire difficult samples for object detection models from massive raw data, we propose a novel difficult-sample mining strategy based on active learning with Weighted Minimum Bounds (WMB). To accurately gauge how difficult a sample is for an object detection model, we introduce the concept of weighted minimum bounds: the difficulty metric combines the classification discrepancy within detection boxes with a weight factor derived from the model's Average Precision (AP) on the validation set. We additionally introduce the Don't Care Area (DCA) to capture the uncertainty of the localization task; the DCA is used only during the data mining and training phases, so no additional cost is incurred at inference time. Furthermore, we propose a periodic, phased active learning framework for mining difficult samples, which progressively identifies challenging samples in unlabeled data and performs iterative optimization. To evaluate the effectiveness of our methods, we collected the VANJEE-Image and VANJEE-PointCloud datasets from real-world scenarios. We empirically demonstrate the superiority of our approach, which outperforms traditional active learning methods on both the image detection and point cloud detection datasets. The code and datasets are available at https://github.com/sharkls/WMBAL-Weighted-Minimum-Bounds-for-Active-Learning.
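The core idea above can be illustrated with a minimal sketch. This is not the authors' implementation; the combination rule (weight = 1 − per-class AP, sample score = maximum per-box score) and all names (`wmb_score`, `select_hard_samples`, the `disc`/`cls` fields) are illustrative assumptions, shown only to make the weighting-by-AP intuition concrete.

```python
import numpy as np

def wmb_score(discrepancy, class_id, class_ap):
    """Weighted Minimum Bounds score for one detection box (sketch).

    discrepancy : classification discrepancy inside the box, e.g. the
                  disagreement between two classifier heads, in [0, 1].
    class_id    : predicted class index for the box.
    class_ap    : per-class Average Precision measured on the val set.

    Classes the detector handles poorly (low AP) receive a larger
    weight, so their uncertain boxes are surfaced first.
    """
    weight = 1.0 - class_ap[class_id]
    return weight * discrepancy

def select_hard_samples(pool, class_ap, budget):
    """One mining round: rank unlabeled samples by their maximum
    per-box WMB score and keep the top `budget` for annotation."""
    def sample_score(sample):
        return max(wmb_score(d, c, class_ap)
                   for d, c in zip(sample["disc"], sample["cls"]))
    ranked = sorted(pool, key=sample_score, reverse=True)
    return ranked[:budget]

# Toy pool: each sample carries per-box discrepancies and class ids.
class_ap = np.array([0.9, 0.5])               # class 1 is the weak class
pool = [
    {"id": "a", "disc": [0.2], "cls": [0]},   # score 0.1 * 0.2 = 0.02
    {"id": "b", "disc": [0.2], "cls": [1]},   # score 0.5 * 0.2 = 0.10
    {"id": "c", "disc": [0.4], "cls": [1]},   # score 0.5 * 0.4 = 0.20
]
picked = select_hard_samples(pool, class_ap, budget=2)
print([s["id"] for s in picked])              # → ['c', 'b']
```

In an active learning loop, a call like this would run once per mining round: the selected samples are annotated, moved into the labeled set, and the detector is retrained before the next round.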
Cite this article
Lu, S., Zheng, J., Li, Z. et al. WMBAL: weighted minimum bounds for active learning. Appl Intell 54, 2551–2563 (2024). https://doi.org/10.1007/s10489-024-05328-x