Abstract
In real-world settings, learning from both labeled and unlabeled data has attracted the interest of many data scientists and machine learning engineers, leading to several works that apply data-augmenting approaches in order to obtain learning behavior that is both robust and sufficiently accurate. The main motivation is the abundance of unlabeled data, which conventional supervised approaches ignore, thereby missing the chance to enrich the final hypothesis. However, most of the proposed methods that operate on both kinds of data exploit only one category of such algorithms, without combining their strategies. Since the most popular categories for the classification task are Active Learning and Semi-supervised Learning, we design a framework that combines the two, aiming to fuse their advantages within the core of the learning process. We conduct an empirical evaluation of this combined approach, under the pool-based scenario, on three problems that stem from different fields but are all tackled using acoustical signals: gender identification, emotion detection and automatic speaker recognition. Within the proposed framework, which operates on training sets of small cardinality, our results show the benefits of adopting such semi-automated approaches, both in the predictive accuracy achieved under reduced resource consumption and in the smoothness of the learning convergence. Several learners have been examined in order to reach more general conclusions, and a variant of the self-training scheme has also been examined.
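To make the idea of combining the two paradigms in a pool-based loop concrete, the following is a minimal sketch and not the framework evaluated in the paper. The abstract does not specify the learners, the query strategy or the self-training variant, so everything here is an assumption chosen for illustration: a nearest-centroid learner, least-confidence querying for the active learning step, a fixed confidence threshold for the self-training step, and synthetic two-dimensional data in place of acoustical features. All function names and parameters (`combined_loop`, `n_self`, `threshold`, and so on) are hypothetical.

```python
import math
import random


def fit_centroids(X, y):
    """Nearest-centroid learner (an assumed stand-in): one mean vector per class."""
    sums, counts = {}, {}
    for xi, yi in zip(X, y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for j, v in enumerate(xi):
            acc[j] += v
        counts[yi] = counts.get(yi, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def predict_proba(centroids, x):
    """Pseudo-probabilities: softmax over negative distances to each centroid."""
    classes = sorted(centroids)
    scores = [-math.dist(x, centroids[c]) for c in classes]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return classes, [e / total for e in exps]


def predict(centroids, x):
    classes, probs = predict_proba(centroids, x)
    return classes[probs.index(max(probs))]


def combined_loop(X_lab, y_lab, X_pool, oracle, n_iter=10, n_self=3, threshold=0.9):
    """Each iteration: query one uncertain instance, self-label a few confident ones."""
    X_lab, y_lab = list(X_lab), list(y_lab)
    pool = list(range(len(X_pool)))
    n_queries = 0
    for _ in range(n_iter):
        if not pool:
            break
        model = fit_centroids(X_lab, y_lab)
        scored = []
        for i in pool:
            classes, probs = predict_proba(model, X_pool[i])
            best = probs.index(max(probs))
            scored.append((probs[best], i, classes[best]))
        scored.sort()  # ascending confidence

        # Active learning step: ask the oracle about the least confident instance.
        _, q, _ = scored[0]
        X_lab.append(X_pool[q])
        y_lab.append(oracle(q))
        n_queries += 1
        used = {q}

        # Self-training step: accept the model's own label for the most
        # confident instances, provided they clear the threshold.
        added = 0
        for conf, i, label in reversed(scored[1:]):
            if added >= n_self or conf < threshold:
                break
            X_lab.append(X_pool[i])
            y_lab.append(label)
            used.add(i)
            added += 1

        pool = [i for i in pool if i not in used]
    return fit_centroids(X_lab, y_lab), n_queries


def run_demo(seed=0):
    """Synthetic two-class problem; returns (test accuracy, oracle queries used)."""
    rng = random.Random(seed)

    def blob(center, n):
        return [(rng.gauss(center, 1.0), rng.gauss(center, 1.0)) for _ in range(n)]

    X_lab, y_lab = blob(0.0, 2) + blob(4.0, 2), [0, 0, 1, 1]
    X_pool, y_pool = blob(0.0, 100) + blob(4.0, 100), [0] * 100 + [1] * 100
    X_test, y_test = blob(0.0, 50) + blob(4.0, 50), [0] * 50 + [1] * 50

    model, n_queries = combined_loop(X_lab, y_lab, X_pool, oracle=lambda i: y_pool[i])
    hits = sum(predict(model, x) == t for x, t in zip(X_test, y_test))
    return hits / len(X_test), n_queries
```

Calling `run_demo()` starts from four labeled instances and spends ten oracle queries, while the self-training step may add up to three machine-labeled instances per iteration at no annotation cost. The split between queried and self-labeled instances per iteration is the main design knob in a sketch like this.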
Ethics declarations
Conflict of interest
The authors declare that there is no conflict of interest regarding the publication of this paper.
About this article
Cite this article
Karlos, S., Aridas, C., Kanas, V.G. et al. Classification of acoustical signals by combining active learning strategies with semi-supervised learning schemes. Neural Comput & Applic 35, 3–20 (2023). https://doi.org/10.1007/s00521-021-05749-6
DOI: https://doi.org/10.1007/s00521-021-05749-6