Abstract
Object classification from images taken at different views is commonly performed by extracting features from each image separately, without fusing them. This paper instead proposes a multivariate two dimensional singular spectrum analysis (M2DSSA) based approach that fuses the features of the different images before classification. First, a four channel two dimensional signal is formed from four images taken at four different views. Second, the M2DSSA is applied to this four channel signal. Third, the histogram of oriented gradients (HOG) is computed on each channel of each M2DSSA component. Fourth, the M2DSSA components are selected according to the correlation coefficients among these HOGs, and the images are fused via the M2DSSA by reconstructing them from the selected components. Finally, the HOG of each reconstructed image is recomputed, and these HOGs serve as the features for a support vector machine that performs the object classification. The proposed method achieves classification accuracies of 92.5925% and 97.8723% on the first and second datasets, respectively. Computer numerical simulation results show that these accuracies are higher than those of the baseline method without fusion and those of the other fusion methods, which is attributed to fusing the information of the objects across the different images.
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China under grant numbers U1701266, 61671163 and 62071128; the Team Project of the Education Ministry of the Guangdong Province under grant number 2017KCXTD011; the Guangdong Higher Education Engineering Technology Research Center for Big Data on Manufacturing Knowledge Patent under grant number 501130144; and the Hong Kong Innovation and Technology Commission, Enterprise Support Scheme, under grant number S/E/070/17.
Availability of data and materials
The datasets generated and analyzed during the current study are available in the public domain.
Appendix
The algorithm of our proposed method is shown below.
1. The image with \(M\) views is represented as an \(M\)-variate two dimensional signal \(\begin{bmatrix} x^1 \\ x^2 \\ \vdots \\ x^M \end{bmatrix}\).
2. The M2DSSA is applied to this signal to obtain the \(N\) M2DSSA components, which satisfy \(\begin{bmatrix} x^1 \\ x^2 \\ \vdots \\ x^M \end{bmatrix} = \sum_{n=1}^{N} \begin{bmatrix} \tilde{x}_n^1 \\ \tilde{x}_n^2 \\ \vdots \\ \tilde{x}_n^M \end{bmatrix}\).
3. The HOG features are extracted from each view of each M2DSSA component.
4. The value of \(T_n\) is computed using (16), and the M2DSSA components with \(T_n \ge \tau\) are selected. Let \(\mathcal{T}\) be the index set of the selected M2DSSA components. The multi-view image is reconstructed by summing the M2DSSA components in this index set, that is, \(\begin{bmatrix} \hat{x}^1 \\ \hat{x}^2 \\ \vdots \\ \hat{x}^M \end{bmatrix} = \sum_{n \in \mathcal{T}} \begin{bmatrix} \tilde{x}_n^1 \\ \tilde{x}_n^2 \\ \vdots \\ \tilde{x}_n^M \end{bmatrix}\).
5. The HOG features of the reconstructed \(M\)-variate two dimensional signal \(\begin{bmatrix} \hat{x}^1 \\ \hat{x}^2 \\ \vdots \\ \hat{x}^M \end{bmatrix}\) are recomputed.
6. The support vector machine is applied to the recomputed HOG features to perform the object classification.
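Steps 1 to 5 can be illustrated with a minimal NumPy sketch. It is not the authors' implementation and rests on several assumptions that are not taken from this paper: the per-view trajectory matrices are stacked horizontally (one common multivariate SSA convention), a single global orientation histogram stands in for the full HOG descriptor, and the mean pairwise correlation coefficient of the views' descriptors stands in for the statistic \(T_n\) of (16), which is not reproduced here. All function names, the window sizes `Lx` and `Ly`, and the threshold value are hypothetical.

```python
import numpy as np

def embed_2d(img, Lx, Ly):
    """Trajectory matrix of one view: each column is a vectorised Lx-by-Ly window."""
    windows = np.lib.stride_tricks.sliding_window_view(img, (Lx, Ly))
    Kx, Ky = windows.shape[:2]
    return windows.reshape(Kx * Ky, Lx * Ly).T  # shape (Lx*Ly, Kx*Ky)

def unembed_2d(traj, shape, Lx, Ly):
    """Map a trajectory-shaped matrix back to an image by averaging overlapping windows."""
    H, W = shape
    Kx, Ky = H - Lx + 1, W - Ly + 1
    patches = traj.T.reshape(Kx, Ky, Lx, Ly)
    total, count = np.zeros(shape), np.zeros(shape)
    for i in range(Lx):
        for j in range(Ly):
            total[i:i + Kx, j:j + Ky] += patches[:, :, i, j]
            count[i:i + Kx, j:j + Ky] += 1
    return total / count

def m2dssa(views, Lx, Ly):
    """Decompose M views (array M x H x W) into N components (array N x M x H x W)."""
    M, H, W = views.shape
    K = (H - Lx + 1) * (W - Ly + 1)
    # Assumption: horizontal stacking of the M trajectory matrices.
    X = np.hstack([embed_2d(v, Lx, Ly) for v in views])
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    comps = []
    for n in range(len(s)):
        Xn = s[n] * np.outer(U[:, n], Vt[n])  # rank-one elementary matrix
        comps.append([unembed_2d(Xn[:, m * K:(m + 1) * K], (H, W), Lx, Ly)
                      for m in range(M)])
    return np.array(comps)

def orientation_histogram(img, bins=9):
    """Simplified stand-in for HOG: one magnitude-weighted orientation histogram."""
    gy, gx = np.gradient(img)
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)  # unsigned orientation
    hist, _ = np.histogram(ang, bins=bins, range=(0, np.pi), weights=mag)
    return hist / (np.linalg.norm(hist) + 1e-12)

def selection_score(component):
    """Hypothetical stand-in for T_n: mean pairwise correlation across the views."""
    feats = np.array([orientation_histogram(v) for v in component])
    corr = np.corrcoef(feats)
    return corr[np.triu_indices(len(feats), k=1)].mean()

def fuse(views, Lx, Ly, tau):
    """Sum the components whose score is at least tau (step 4)."""
    comps = m2dssa(views, Lx, Ly)
    scores = np.array([selection_score(c) for c in comps])
    return comps[scores >= tau].sum(axis=0), scores

# Synthetic four-view example; real images would replace the random arrays.
rng = np.random.default_rng(0)
views = rng.random((4, 16, 16))
fused, scores = fuse(views, Lx=4, Ly=4, tau=0.5)
# Step 5: recompute the descriptors on the reconstructed views and concatenate them.
features = np.concatenate([orientation_histogram(v) for v in fused])
```

The resulting `features` vector would then be passed to a support vector machine for step 6, for example scikit-learn's `sklearn.svm.SVC`. Because the window averaging in `unembed_2d` is linear and inverts `embed_2d`, summing all components recovers the original views exactly, which mirrors the decomposition identity in step 2.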
About this article
Cite this article
Lin, Y., Ling, B.WK., Li, C. et al. Multivariate two dimensional singular spectrum analysis based fusion method for four view image based object classification. Multimed Tools Appl 82, 46403–46421 (2023). https://doi.org/10.1007/s11042-023-15712-3
DOI: https://doi.org/10.1007/s11042-023-15712-3