Abstract
Imbalanced big data classification based on rough sets with virtual reality technology in cloud computing currently suffers from several problems: redundant big data is not cleaned thoroughly, denoising and feature extraction perform poorly, and classification precision is low. To address these problems, this paper proposes an imbalanced big data classification method based on Hubness and K-nearest neighbor. First, the SNM algorithm is used to clean redundant big data efficiently. Then, a wavelet threshold denoising algorithm is applied to improve the denoising effect. Meanwhile, features of the big data are extracted based on the Lyapunov theorem. Finally, the Hubness and K-nearest neighbor algorithms are combined to classify imbalanced big data with high precision. Experiments verify that the proposed method improves on current methods for cleaning and denoising redundant imbalanced big data, and increases the accuracy of feature extraction and classification.
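The abstract does not state how Hubness and K-nearest neighbor are combined, so the following is only a minimal sketch of one common hubness-aware variant, not the authors' method. It assumes that each training point's vote is down-weighted by its standardized "bad hubness", that is, the number of times it appears in the k-nearest-neighbor lists of points from a different class. The function names (`knn_indices`, `bad_hubness_weights`, `predict`) and the exponential weighting are illustrative choices.

```python
import numpy as np

def knn_indices(X, k):
    """Indices of the k nearest neighbours of each row of X (self excluded)."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)  # a point is never its own neighbour
    return np.argsort(d, axis=1)[:, :k]

def bad_hubness_weights(X, y, k):
    """Vote weight per training point: exp(-standardized bad hubness).

    Bad hubness of point j counts how often j occurs in the k-NN list
    of a training point whose label differs from y[j].
    """
    nn = knn_indices(X, k)
    bad = np.zeros(len(X))
    for i, neighbours in enumerate(nn):
        for j in neighbours:
            if y[j] != y[i]:
                bad[j] += 1
    std = bad.std()
    z = (bad - bad.mean()) / std if std > 0 else np.zeros_like(bad)
    return np.exp(-z)

def predict(X_train, y_train, weights, x, k):
    """Classify x by a weighted vote among its k nearest training points."""
    d = np.linalg.norm(X_train - x, axis=1)
    votes = {}
    for j in np.argsort(d)[:k]:
        votes[y_train[j]] = votes.get(y_train[j], 0.0) + weights[j]
    return max(votes, key=votes.get)
```

In this sketch, a majority-class point that frequently intrudes into the neighborhoods of minority-class points receives a small weight, which is one way hubness information can reduce the bias of plain K-nearest neighbor toward the majority class.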
Acknowledgments
This work is supported by the Natural Science Foundation of Inner Mongolia [No. 2018MS6010]; the Foundation Science Research Start-up Fund of Inner Mongolia Agriculture University [JC2016005]; and the Scientific Research Foundation for Doctors of Inner Mongolia Agriculture University [NDYB2016-11].
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Xie, Wd., Cheng, X. Imbalanced big data classification based on virtual reality in cloud computing. Multimed Tools Appl 79, 16403–16420 (2020). https://doi.org/10.1007/s11042-019-7317-x
DOI: https://doi.org/10.1007/s11042-019-7317-x