Ensemble multi-view feature set partitioning method for effective multi-view learning

  • Regular Paper
  • Published:
Knowledge and Information Systems

Abstract

Multi-view learning often outperforms traditional single-view learning by leveraging multiple perspectives of the data. Its effectiveness, however, depends heavily on how the features are partitioned into feature sets: different datasets may require different partitioning methods to capture their characteristics, so no single partitioning method is sufficient. Searching for an optimal feature set partitioning (FSP) for each dataset can be time-consuming, and even the optimum found may not suit all types of datasets. This paper therefore presents ensemble multi-view feature set partitioning (EMvFSP), a novel approach that combines the views produced by multiple partitioning methods to achieve better classification performance than any single partitioning method alone. Experiments on 15 structured datasets with varying ratios of samples, features, and labels show that EMvFSP effectively improves classification performance, and statistical analyses using Friedman ranking and Holm's procedure confirm its effectiveness. The approach provides a robust solution for multi-view learning that adapts to different types of datasets and partitioning methods, making it suitable for a wide range of applications.
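The core idea can be sketched as follows: several partitioning methods each split the feature set into disjoint views, one base classifier is trained per view, and the predictions of all per-view classifiers are combined by majority vote. This is a minimal illustration under stated assumptions, not the paper's algorithm: the function names are hypothetical, random splitting stands in for the multiple FSP methods, and a toy nearest-centroid learner stands in for a real base classifier.

```python
# Illustrative sketch of an ensemble over multiple feature set partitionings.
import random
from collections import Counter

def partition_features(n_features, n_views, rng):
    """Randomly split feature indices into disjoint views (one FSP)."""
    idx = list(range(n_features))
    rng.shuffle(idx)
    return [idx[v::n_views] for v in range(n_views)]

def centroid_classifier(X, y, feats):
    """Tiny stand-in learner: nearest class centroid on the view's features."""
    centroids = {}
    for label in set(y):
        rows = [[x[f] for f in feats] for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    def predict(x):
        v = [x[f] for f in feats]
        return min(centroids,
                   key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])))
    return predict

def emvfsp_ensemble(X, y, n_partitions=3, n_views=2, seed=0):
    """Pool the views from several partitionings into one voting ensemble."""
    rng = random.Random(seed)
    learners = []
    for _ in range(n_partitions):
        for feats in partition_features(len(X[0]), n_views, rng):
            learners.append(centroid_classifier(X, y, feats))
    def predict(x):
        votes = Counter(clf(x) for clf in learners)
        return votes.most_common(1)[0][0]
    return predict

# Usage: two well-separated classes in four dimensions.
X = [[0, 0, 0, 0], [1, 0, 1, 0], [9, 9, 8, 9], [8, 9, 9, 8]]
y = [0, 0, 1, 1]
model = emvfsp_ensemble(X, y)
print([model(x) for x in X])  # -> [0, 0, 1, 1]
```

In practice each random split would be replaced by a distinct FSP method, so that the ensemble hedges against any single partitioning being a poor fit for the dataset at hand.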


Data availability

The datasets analyzed during the current study are available in the following repositories:

  • UCI Machine Learning Repository, http://archive.ics.uci.edu/ml/index.php
  • Kent Ridge Bio-Medical Dataset, http://datam.i2r.a-star.edu.sg/datasets/krbd/index.html
  • UCI Machine Learning Repository: Gisette data set, http://archive.ics.uci.edu/ml/datasets/Gisette?ref=datanews.io
  • UCI Machine Learning Repository: Arcene data set, http://archive.ics.uci.edu/ml/datasets/Arcene?ref=datanews.io
  • Central Nervous System - ICCR, https://www.iccr-cancer.org/datasets/published-datasets/central-nervous-system/
  • Colon cancer datasets - BioGPS, http://biogps.org/dataset/tag/colon%20cancer/
  • Data repository - DLBCL (Stanford), https://leo.ugr.es/elvira/DBCRepository/DLBCL/DLBCL-Stanford.html
  • Leukemia classification - Kaggle, https://www.kaggle.com/datasets/andrewmvd/leukemia-classification
  • Air quality-lung cancer data - Harvard Dataverse, https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/HMOEJO
  • Data repository - lung cancer, https://leo.ugr.es/elvira/DBCRepository/LungCancer/LungCancer-Michigan.html
  • UCI Machine Learning Repository: Madelon data set, http://archive.ics.uci.edu/ml/datasets/Madelon?ref=datanews.io
  • Prostate - PLCO - The Cancer Data Access System, https://cdas.cancer.gov/datasets/plco/20/
  • UCI Machine Learning Repository: SECOM data set, https://archive.ics.uci.edu/ml/datasets/SECOM

Notes

  1. https://www.anaconda.com/.

  2. https://scikit-learn.org/stable/.

References

  1. Zhao J, Xie X, Xu X, Sun S (2017) Multi-view learning overview: recent progress and new challenges. Inf Fus 38:43–54. https://doi.org/10.1016/j.inffus.2017.02.007

  2. Yang Y, Wang H (2018) Multi-view clustering: a survey. Big Data Min Anal 1(2):83–107

  3. Xu C, Tao D, Xu C (2013) A survey on multi-view learning. arXiv preprint arXiv:1304.5634. https://doi.org/10.48550/arXiv.1304.5634

  4. Nan F, Tang Y, Yang P, He Z, Yang Y (2021) A novel sub-kmeans based on co-training approach by transforming single-view into multi-view. Futur Gener Comput Syst 125:831–843. https://doi.org/10.1016/j.future.2021.07.019

  5. Liu J, Liu X, Yang Y, Guo X, Kloft M, He L (2021) Multiview subspace clustering via co-training robust data representation. IEEE Trans Neural Netw Learn Syst. https://doi.org/10.1109/TNNLS.2021.3069424

  6. Zhang X, Zhao DY, Chen LW, Min WH (2009) Batch mode active learning based multi-view text classification. In: Sixth International Conference on Fuzzy Systems and Knowledge Discovery, vol 7, IEEE, pp 472–476. https://doi.org/10.1109/FSKD.2009.495

  7. Foster DP, Kakade SM, Zhang T (2008) Multi-view dimensionality reduction via canonical correlation analysis

  8. Rokach L (2010) Pattern classification using ensemble methods, vol 75. World Scientific, Singapore

  9. Cai W, Zhou H, Xu L (2021) A multi-view co-training clustering algorithm based on global and local structure preserving. IEEE Access 9:29293–29302. https://doi.org/10.1109/ACCESS.2021.3056677

  10. Tao J, Wu Z-G, Su H, Wu Y, Zhang D (2018) Asynchronous and resilient filtering for Markovian jump neural networks subject to extended dissipativity. IEEE Trans Cybern 49(7):2504–2513. https://doi.org/10.1109/TCYB.2018.2824853

  11. Kumar V, Minz S (2016) Multi-view ensemble learning: an optimal feature set partitioning for high-dimensional data classification. Knowl Inf Syst 49(1):1–59. https://doi.org/10.1007/s10115-015-0875-y

  12. Kumar V, Minz S (2014) Feature selection: a literature review. Smart Comput Rev 4(3):211–229. https://doi.org/10.6029/smartcr.2014.03.007

  13. Muslea I, Minton S, Knoblock CA (2002) Active + semi-supervised learning = robust multi-view learning. In: ICML, vol 2, Citeseer, pp 435–442

  14. Ding Z, Shao M, Fu Y (2018) Robust multi-view representation: a unified perspective from multi-view learning to domain adaption, In: IJCAI, https://doi.org/10.24963/ijcai.2018/767

  15. Kumar V, Minz S (2015) Multi-view ensemble learning: a supervised feature set partitioning for high dimensional data classification. In: Proceedings of the Third International Symposium on Women in Computing and Informatics, pp 31–37. https://doi.org/10.1145/2791405.2791443

  16. Wang C, Huang Y, Ding W, Cao Z (2021) Attribute reduction with fuzzy rough self-information measures. Inf Sci 549:68–86. https://doi.org/10.1016/j.ins.2020.11.021

  17. Wang F, Liang J, Dang C (2013) Attribute reduction for dynamic data sets. Appl Soft Comput 13(1):676–689. https://doi.org/10.1016/j.asoc.2012.07.018

  18. Alam MT, Kumar V, Kumar A (2021) A multi-view convolutional neural network approach for image data classification, In: International Conference on Communication information and Computing Technology (ICCICT), IEEE, (pp. 1–6). https://doi.org/10.1109/ICCICT50803.2021.9509943

  19. Ning X, Wang X, Xu S, Cai W, Zhang L, Yu L, Li W (2021) A review of research on co-training. Concurr Comput Pract Experience 32:e6276. https://doi.org/10.1002/cpe.6276

  20. Hussain T, Muhammad K, Ding W, Lloret J, Baik SW, de Albuquerque VHC (2021) A comprehensive survey of multi-view video summarization. Pattern Recogn 109:107567. https://doi.org/10.1016/j.patcog.2020.107567

  21. Woźniak M, Krawczyk B (2012) Combined classifier based on feature space partitioning. Int J Appl Math Comput Sci 22(4):855–866. https://doi.org/10.2478/v10006-012-0063-0

  22. Kim H, Kim H, Moon H, Ahn H (2011) A weight-adjusted voting algorithm for ensembles of classifiers. J Korean Stat Soc 40(4):437–449. https://doi.org/10.1016/j.jkss.2011.03.002

  23. Dasgupta S, Littman M, McAllester D (2001) Pac generalization bounds for co-training, In: Advances in neural information processing systems, vol. 14

  24. Gonçalves CA, Vieira AS, Gonçalves CT, Camacho R, Iglesias EL, Diz LB (2022) A novel multi-view ensemble learning architecture to improve the structured text classification. Information 13(6):283. https://doi.org/10.3390/info13060283

  25. Garcia-Ceja E, Galván-Tejada CE, Brena R (2018) Multi-view stacking for activity recognition with sound and accelerometer data. Inf Fus 40:45–56. https://doi.org/10.1016/j.inffus.2017.06.004

  26. Chang X, Yang Y, Wang H (2018) Multi-view construction for clustering based on feature set partitioning. In: International Joint Conference on Neural Networks (IJCNN), IEEE, pp 1–8. https://doi.org/10.1109/IJCNN.2018.8489615

  27. Pagliaro P, Femminò S, Penna C (2019) Redox aspects of myocardial ischemia/reperfusion injury and cardioprotection. Oxidative stress in heart diseases. Springer, Cham, pp 289–324. https://doi.org/10.1007/978-981-13-8273-4_13

  28. Debie E, Shafi K, Lokan C, Merrick K (2013) Performance analysis of rough set ensemble of learning classifier systems with differential evolution based rule discovery. Evol Intel 6(2):109–126. https://doi.org/10.1007/s12065-013-0093-z

  29. Stańczyk U, Zielosko B (2020) Heuristic-based feature selection for rough set approach. Int J Approx Reason 125:187–202. https://doi.org/10.1016/j.ijar.2020.07.005

  30. Omuya EO, Okeyo GO, Kimwele MW (2021) Feature selection for classification using principal component analysis and information gain. Expert Syst Appl 174:114765. https://doi.org/10.1016/j.eswa.2021.114765

  31. Piao Y, Piao M, Jin CH, Shon HS, Chung J-M, Hwang B, Ryu KH (2015) A new ensemble method with feature space partitioning for high-dimensional data classification. Math Probl Eng 2015. https://doi.org/10.1155/2015/590678

  32. Kumar A, Kumar V, Kumari S (2021) A graph coloring based framework for views construction in multi-view ensemble learning, In: 2021 2nd International Conference on Secure Cyber Computing and Communications (ICSCCC), IEEE, (pp. 84–89). https://doi.org/10.1109/ICSCCC51823.2021.9478138

  33. Kumar V, Aydav PSS, Minz S (2021) Multi-view ensemble learning using multi-objective particle swarm optimization for high dimensional data classification. J King Saud Univ Comput Inf Sci. https://doi.org/10.1016/j.jksuci.2021.08.029

  34. Rokach L (2008) Genetic algorithm-based feature set partitioning for classification problems. Pattern Recogn 41(5):1676–1700. https://doi.org/10.1016/j.patcog.2007.10.013

  35. Amini F, Hu G (2021) A two-layer feature selection method using genetic algorithm and elastic net. Expert Syst Appl 166:114072. https://doi.org/10.1016/j.eswa.2020.114072

  36. Calzavara S, Lucchese C, Marcuzzi F, Orlando S (2021) Feature partitioning for robust tree ensembles and their certification in adversarial scenarios. EURASIP J Inf Secur 2021:1–17

  37. Guggari S, Kadappa V, Umadevi V (2018) Non-sequential partitioning approaches to decision tree classifier. Future Comput Inf J 3(2):275–285. https://doi.org/10.1016/j.fcij.2018.06.003

  38. Nutheti PSD, Hasyagar N, Shettar R, Guggari S, Umadevi V (2020) Ferrer diagram based partitioning technique to decision tree using genetic algorithm. Int J Math Sci Comput 6:25–32. https://doi.org/10.5815/ijmsc.2020.01.03

  39. Guggari S, Kadappa V, Umadevi V, Abraham A (2020) Music rhythm tree based partitioning approach to decision tree classifier. J King Saud Univ Comput Inf Sci. https://doi.org/10.1016/j.jksuci.2020.03.015

  40. Imani V, Sevilla-Salcedo C, Fortino V, Tohka J (2023) Multi-objective genetic algorithm for multi-view feature selection, arXiv preprint arXiv:2305.18352. https://doi.org/10.48550/arXiv.2004.03295

  41. Du X, Zhang W, Alvarez JM (2021) Boosting supervised learning performance with co-training, In: 2021 IEEE Intelligent Vehicles Symposium (IV). IEEE (pp. 540–545). https://doi.org/10.1109/IV48863.2021.9575963

  42. Mohammed AM, Onieva E, Woźniak M (2019) Vertical and horizontal data partitioning for classifier ensemble learning, In: International Conference on Computer Recognition Systems, Springer, (pp. 86–97)

  43. Lopez-Garcia P, Masegosa AD, Osaba E, Onieva E, Perallos A (2019) Ensemble classification for imbalanced data based on feature space partitioning and hybrid metaheuristics. Appl Intell 49(8):2807–2822

  44. Raza K (2019) Improving the prediction accuracy of heart disease with ensemble learning and majority voting rule. U-healthcare monitoring systems. Elsevier, pp 179–196

  45. Liu Y, Jiang C, Zhao H (2018) Using contextual features and multi-view ensemble learning in product defect identification from online discussion forums. Decis Support Syst 105:1–12

  46. Seetha H, Murty MN, Saravanan R (2016) Classification by majority voting in feature partitions. Int J Inf Decis Sci 8(2):109–124

  47. Christoudias C, Urtasun R, Darrell T (2012) Multi-view learning in the presence of view disagreement. arXiv preprint arXiv:1206.3242

  48. Christoudias CM, Urtasun R, Kapoorz A, Darrell T (2009) Co-training with noisy perceptual observations, In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, IEEE, (pp. 2844–2851). https://doi.org/10.1109/CVPR.2009.5206572

  49. Shahzad RK, Lavesson N (2013) Comparative analysis of voting schemes for ensemble-based malware detection. J Wirel Mob Netw Ubiquitous Comput Depend Appl 4(1):98–117

  50. UCI Machine Learning Repository, http://archive.ics.uci.edu/ml/index.php

  51. Kent Ridge Bio-Medical Dataset, http://datam.i2r.a-star.edu.sg/datasets/krbd/index.html

  52. UCI Machine Learning Repository: Arcene data set, http://archive.ics.uci.edu/ml/datasets/Arcene?ref=datanews.io

  53. Central Nervous System - ICCR, https://www.iccr-cancer.org/datasets/published-datasets/central-nervous-system/

  54. Colon cancer datasets - BioGPS, http://biogps.org/dataset/tag/colon%20cancer/

  55. Data repository - DLBCL (Stanford), https://leo.ugr.es/elvira/DBCRepository/DLBCL/DLBCL-Stanford.html

  56. Leukemia classification - Kaggle, https://www.kaggle.com/datasets/andrewmvd/leukemia-classification

  57. Air quality-lung cancer data - Harvard Dataverse, https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/HMOEJO

  58. Data repository - lung cancer, https://leo.ugr.es/elvira/DBCRepository/LungCancer/LungCancer-Michigan.html

  59. Lofters AK, Gatov E, Lu H, Baxter NN, Guilcher SJ, Kopp A, Vahabi M, Datta GD (2021) Lung cancer inequalities in stage of diagnosis in Ontario, Canada. Curr Oncol 28(3):1946–1956

  60. UCI Machine Learning Repository: Madelon data set, http://archive.ics.uci.edu/ml/datasets/Madelon?ref=datanews.io

  61. Prostate - PLCO - The Cancer Data Access System, https://cdas.cancer.gov/datasets/plco/20/

  62. UCI Machine Learning Repository: SECOM data set, https://archive.ics.uci.edu/ml/datasets/SECOM

  63. UCI Machine Learning Repository: Gisette data set, http://archive.ics.uci.edu/ml/datasets/Gisette?ref=datanews.io

  64. Ho TK (1998) The random subspace method for constructing decision forests. IEEE Trans Pattern Anal Mach Intell 20(8):832–844

  65. Bryll R, Gutierrez-Osuna R, Quek F (2003) Attribute bagging: improving accuracy of classifier ensembles by using random feature subsets. Pattern Recogn 36(6):1291–1302

  66. Janusz A, Slezak D (2014) Rough set methods for attribute clustering and selection. Appl Artif Intel 28(3):220–242

  67. Comtet L (2012) Advanced combinatorics: the art of finite and infinite expansions. Springer Science & Business Media

  68. Tichenor T (2016) Bounds on graph compositions and the connection to the bell triangle. Discret Math 339(4):1419–1423

  69. Garcia S, Herrera F (2008) An extension on "statistical comparisons of classifiers over multiple data sets" for all pairwise comparisons. J Mach Learn Res 9(12):2677–2694

  70. García S, Fernández A, Luengo J, Herrera F (2010) Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: Experimental analysis of power. Inf Sci 180(10):2044–2064

  71. Luengo J, García S, Herrera F (2009) A study on the use of statistical tests for experimentation with neural networks: Analysis of parametric test conditions and non-parametric tests. Expert Syst Appl 36(4):7798–7808

  72. Derrac J, García S, Molina D, Herrera F (2011) A practical tutorial on the use of nonparametric statistical tests as a methodology for comparing evolutionary and swarm intelligence algorithms. Swarm Evol Comput 1(1):3–18

Author information

Corresponding author

Correspondence to Vipin Kumar.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Singh, R., Kumar, V. Ensemble multi-view feature set partitioning method for effective multi-view learning. Knowl Inf Syst (2024). https://doi.org/10.1007/s10115-024-02114-6

  • DOI: https://doi.org/10.1007/s10115-024-02114-6
