Abstract
Multi-view learning consistently outperforms traditional single-view learning by leveraging multiple perspectives of the data. However, its effectiveness relies heavily on how the data are partitioned into feature sets. Different datasets may require different partitioning methods to capture their unique characteristics, so a single partitioning method is often insufficient. Finding an optimal feature set partitioning (FSP) for each dataset can be time-consuming, and even an optimal FSP may not be sufficient for all types of datasets. This paper therefore presents a novel approach, ensemble multi-view feature set partitioning (EMvFSP), to improve the performance of multi-view learning, a technique that makes predictions from multiple data sources. EMvFSP combines the views produced by multiple partitioning methods to achieve better classification performance than any single partitioning method alone. Experiments on 15 structured datasets with varying ratios of samples, features, and labels show that EMvFSP effectively improves classification performance. Statistical analyses using Friedman ranking and Holm's procedure support the effectiveness of the proposed method. The approach provides a robust solution for multi-view learning that adapts to different types of datasets and partitioning methods, making it suitable for a wide range of applications.
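The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it describes: views produced by several partitioning methods are pooled, one base classifier is trained per view, and the predictions are combined. The two partitioners (`round_robin_partition`, `contiguous_partition`), the nearest-centroid base learner, the majority vote, and the toy data are all assumptions chosen for illustration; they are not taken from the paper.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence, Tuple

View = List[int]                                  # feature indices forming one view
Partitioner = Callable[[int, int], List[View]]    # (n_features, k) -> k views
Model = Tuple[View, Dict[int, List[float]]]       # a view and its class centroids


def round_robin_partition(n_features: int, k: int) -> List[View]:
    """Hypothetical partitioner: feature j is assigned to view j mod k."""
    return [list(range(v, n_features, k)) for v in range(k)]


def contiguous_partition(n_features: int, k: int) -> List[View]:
    """Hypothetical partitioner: features are split into k contiguous blocks."""
    bounds = [i * n_features // k for i in range(k + 1)]
    return [list(range(bounds[i], bounds[i + 1])) for i in range(k)]


def fit_centroids(X: Sequence[Sequence[float]], y: Sequence[int],
                  view: View) -> Dict[int, List[float]]:
    """Train a nearest-centroid classifier restricted to the features in `view`."""
    counts = Counter(y)
    sums: Dict[int, List[float]] = {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(view))
        for i, j in enumerate(view):
            acc[i] += row[j]
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict_view(centroids: Dict[int, List[float]], view: View,
                 row: Sequence[float]) -> int:
    """Predict the class whose centroid is closest within this view."""
    def sq_dist(center: List[float]) -> float:
        return sum((row[j] - center[i]) ** 2 for i, j in enumerate(view))
    # Iterating over sorted labels makes ties resolve to the smallest label.
    return min(sorted(centroids), key=lambda label: sq_dist(centroids[label]))


def fit_ensemble(X: Sequence[Sequence[float]], y: Sequence[int],
                 partitioners: Sequence[Partitioner], k: int) -> List[Model]:
    """Pool the views from every partitioner and train one model per view."""
    n_features = len(X[0])
    models: List[Model] = []
    for make_partition in partitioners:
        for view in make_partition(n_features, k):
            if view:  # skip empty views (possible when k > n_features)
                models.append((view, fit_centroids(X, y, view)))
    return models


def predict_ensemble(models: Sequence[Model], row: Sequence[float]) -> int:
    """Combine the per-view predictions by unweighted majority vote."""
    votes = Counter(predict_view(c, v, row) for v, c in models)
    return votes.most_common(1)[0][0]


# Toy data (made up): features 0 and 1 separate the classes, 2 and 3 are noise.
X = [[0.0, 0.1, 5.0, 1.0],
     [0.2, 0.0, 1.0, 4.0],
     [1.0, 0.9, 5.0, 1.0],
     [0.9, 1.0, 1.0, 4.0]]
y = [0, 0, 1, 1]

models = fit_ensemble(X, y, [round_robin_partition, contiguous_partition], k=2)
print(predict_ensemble(models, [0.1, 0.1, 3.0, 2.0]))    # -> 0
print(predict_ensemble(models, [0.95, 0.95, 3.0, 2.0]))  # -> 1
```

In this toy setup the contiguous partitioner yields one view made only of noise features, which cannot separate the classes. The views contributed by the other partitioner outvote it, which is the kind of robustness the abstract attributes to combining partitioning methods.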
Data availability
The datasets analyzed during the current study are available in the following repositories: UCI Machine Learning Repository, http://archive.ics.uci.edu/ml/index.php. Kent Ridge Bio-medical Dataset, http://datam.i2r.a-star.edu.sg/datasets/krbd/index.html. UCI Machine Learning Repository: Gisette data set, http://archive.ics.uci.edu/ml/datasets/Gisette?ref=datanews.io. UCI Machine Learning Repository: Arcene data set, http://archive.ics.uci.edu/ml/datasets/Arcene?ref=datanews.io. Central nervous system (ICCR), https://www.iccr-cancer.org/datasets/published-datasets/central-nervous-system/. Colon cancer datasets (BioGPS), http://biogps.org/dataset/tag/colon%20cancer/. Data repository: DLBCL (Stanford), https://leo.ugr.es/elvira/DBCRepository/DLBCL/DLBCL-Stanford.html. Leukemia classification (Kaggle), https://www.kaggle.com/datasets/andrewmvd/leukemia-classification. Air quality-lung cancer data (Harvard Dataverse), https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/HMOEJO. Data repository: lung cancer, https://leo.ugr.es/elvira/DBCRepository/LungCancer/LungCancer-Michigan.html. UCI Machine Learning Repository: Madelon data set, http://archive.ics.uci.edu/ml/datasets/Madelon?ref=datanews.io. Prostate datasets, PLCO (Cancer Data Access System), https://cdas.cancer.gov/datasets/plco/20/. UCI Machine Learning Repository: SECOM data set, https://archive.ics.uci.edu/ml/datasets/SECOM.
Ethics declarations
Conflicts of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Singh, R., Kumar, V. Ensemble multi-view feature set partitioning method for effective multi-view learning. Knowl Inf Syst (2024). https://doi.org/10.1007/s10115-024-02114-6
DOI: https://doi.org/10.1007/s10115-024-02114-6