Abstract
Unsupervised learning, an important branch of machine learning, has developed rapidly and attracted growing attention because it can automatically classify data according to their attributes. However, most current studies of unsupervised learning focus on specific techniques and application scenarios, while few systematically summarize its development and typical algorithms. This paper provides a comprehensive summary of unsupervised learning methods. According to how they process data, unsupervised learning methods can be divided into dimensionality reduction, clustering, and deep learning-based methods. Dimensionality reduction methods focus on reducing the complexity of the data and removing redundant features while preserving the original data structure as much as possible. Clustering methods automatically group data according to their features, which is useful for data analysis. Deep learning-based methods train deep neural networks on the data to achieve higher data processing performance. For each category of unsupervised learning methods, the typical algorithms and their applications are explained and recent research is summarized. Finally, directions for the future development of unsupervised learning are discussed.
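To make the first two categories concrete, the following is a minimal illustrative sketch (not taken from the paper): PCA-style dimensionality reduction via the singular value decomposition, followed by a plain k-means (Lloyd's algorithm) clustering of the projected points. The synthetic data, variable names, and deterministic seeding are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
# Two well-separated 3-D point clouds, 50 points each
X = np.vstack([rng.normal(0, 0.5, (50, 3)),
               rng.normal(4, 0.5, (50, 3))])

# --- Dimensionality reduction: PCA via SVD ---
Xc = X - X.mean(axis=0)                    # center the data
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:2].T                          # project onto top-2 principal axes

# --- Clustering: plain k-means (Lloyd's algorithm), k = 2 ---
k = 2
centers = Z[[0, -1]].copy()                # seed one center in each cloud
for _ in range(20):
    # assign each point to its nearest center, then recompute centers
    labels = np.argmin(((Z[:, None] - centers) ** 2).sum(-1), axis=1)
    centers = np.array([Z[labels == j].mean(axis=0) for j in range(k)])

# Each original cloud lands in its own cluster
print(np.bincount(labels))                 # -> [50 50]
```

The same pipeline structure (reduce dimensionality first, then cluster in the lower-dimensional space) underlies many of the combined approaches the survey covers.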
Acknowledgment
This work was partially supported by the National Natural Science Foundation of China under Grants U1913201 and 61973296.
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Wu, X., Liu, X., Zhou, Y. (2022). Review of Unsupervised Learning Techniques. In: Jia, Y., Zhang, W., Fu, Y., Yu, Z., Zheng, S. (eds) Proceedings of 2021 Chinese Intelligent Systems Conference. Lecture Notes in Electrical Engineering, vol 804. Springer, Singapore. https://doi.org/10.1007/978-981-16-6324-6_59
Print ISBN: 978-981-16-6323-9
Online ISBN: 978-981-16-6324-6
eBook Packages: Intelligent Technologies and Robotics