Abstract
Data transformation aims to map the original feature space of a dataset into another space with better properties. It is typically combined with dimensionality reduction, so that the transformed space has fewer dimensions than the original one. A widely used method for both purposes is Principal Component Analysis (PCA), which finds a subspace that explains most of the variance in the data. Although the PCA feature space has interesting properties, such as removing linear correlations, PCA is an unsupervised method, so there is no guarantee that its feature space will be the most appropriate for supervised tasks such as classification or regression. Three-layer Multi-Layer Perceptrons (MLPs), by contrast, are supervised methods that can be understood as a data transformation carried out by the hidden layer, followed by a classification or regression operation performed by the output layer. Because the hidden layer is obtained through a supervised training process, it can be regarded as performing a supervised data transformation, and, when the number of hidden neurons is smaller than the number of inputs, a dimensionality reduction as well. Although this kind of transformation is widely available (any neural network package that gives access to the hidden-layer weights can be used), no extensive experimentation on the quality of the 3-layer MLP data transformation has been carried out. The aim of this article is to carry out this research for classification problems. Results show that, overall, this transformation offers better results than the unsupervised PCA transformation.
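The two transformations contrasted in the abstract can be sketched in a few lines of numpy. This is a minimal illustration, not the authors' actual experimental setup (which uses the nntrf R package): the toy XOR-style dataset, the network size, and all hyperparameters below are illustrative assumptions. A small 3-layer MLP is trained by backpropagation, and the trained hidden layer is then reused as a supervised feature map; PCA is computed via SVD for comparison.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: 2-D inputs with an XOR-like class structure (illustrative only).
X = rng.uniform(-1, 1, size=(200, 2))
y = ((X[:, 0] > 0) ^ (X[:, 1] > 0)).astype(float).reshape(-1, 1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# 3-layer MLP: 2 inputs -> h hidden units -> 1 output, trained by backprop.
h = 3
W1 = rng.normal(0.0, 1.0, (2, h)); b1 = np.zeros(h)
W2 = rng.normal(0.0, 1.0, (h, 1)); b2 = np.zeros(1)

lr = 0.5
for _ in range(2000):
    A1 = sigmoid(X @ W1 + b1)            # hidden-layer activations
    out = sigmoid(A1 @ W2 + b2)          # network prediction
    d_out = (out - y) / len(X)           # gradient of mean cross-entropy wrt output pre-activation
    d_A1 = (d_out @ W2.T) * A1 * (1 - A1)
    W2 -= lr * (A1.T @ d_out); b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * (X.T @ d_A1);   b1 -= lr * d_A1.sum(axis=0)

# Supervised transformation: project data through the trained hidden layer.
def transform(X_new):
    return sigmoid(X_new @ W1 + b1)

Z_mlp = transform(X)                     # (200, h) features learned with label information

# Unsupervised baseline: PCA of the centred data via SVD.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
Z_pca = Xc @ Vt.T                        # (200, 2) principal-component scores
```

The key point is that `W1` and `b1` are fitted using the labels `y`, so `transform` defines a label-aware feature space, whereas `Z_pca` depends only on the input variance.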
Notes
Please note that the doughnut dataset available in the package is smaller than the one used for this article, because of space constraints at the CRAN repository.
Acknowledgements
This work has been supported by the Agencia Estatal de Investigación (PID2019-107455RB-C22/AEI/10.13039/501100011033) and by the Spanish Ministry of Science and Education under grant TIN2017-85727-C4-3-P (DeepBio).
Cite this article
Valls, J.M., Aler, R., Galván, I.M. et al. Supervised data transformation and dimensionality reduction with a 3-layer multi-layer perceptron for classification problems. J Ambient Intell Human Comput 12, 10515–10527 (2021). https://doi.org/10.1007/s12652-020-02841-y