An Approach to Feature Space Construction from Clustering Feature Tree

Dudarin, Pavel; Samokhvalov, Mikhail; Yarushkina, Nadezhda

doi:10.1007/978-3-030-00617-4_17

Part of the book series: Communications in Computer and Information Science (CCIS, volume 934)

Included in the following conference series:

Russian Conference on Artificial Intelligence

Abstract

Generally, a clustering feature tree consists of nodes given as vectors. When the nodes are not vectors, they must first be transformed into feature vectors. The feature extraction algorithm determines the volume and quality of the information captured in the features and, in turn, the quality of clustering. This transformation is therefore an important part of the clustering procedure. This paper proposes an approach to constructing a clustering feature space from a clustering feature tree. The approach preserves hierarchy information and reduces the dimension of the feature space. Its efficiency is demonstrated in the experimental part with different clustering algorithms, and an analysis of the results is provided at the end of the paper.
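The abstract does not describe the construction itself, so the following is only a minimal sketch of the general idea of turning tree nodes into feature vectors that keep hierarchy information. It is not the authors' algorithm. The ancestor-indicator encoding, the function name `ancestor_features`, and the toy hierarchy are all assumptions introduced for illustration.

```python
from typing import Dict, List, Optional


def ancestor_features(
    parent: Dict[str, Optional[str]], leaves: List[str]
) -> Dict[str, List[int]]:
    """Encode each leaf as a 0/1 vector with one slot per internal node.

    Assumed encoding (not taken from the paper): a slot is 1 when the
    corresponding internal node lies on the path from the leaf to the root.
    """
    # Internal nodes are exactly the nodes that appear as someone's parent.
    internal = sorted({p for p in parent.values() if p is not None})
    index = {node: i for i, node in enumerate(internal)}

    features: Dict[str, List[int]] = {}
    for leaf in leaves:
        vec = [0] * len(internal)
        node = parent[leaf]
        # Walk up to the root, marking every ancestor on the way.
        while node is not None:
            vec[index[node]] = 1
            node = parent[node]
        features[leaf] = vec
    return features


# Hypothetical toy hierarchy: root -> {A, B}, A -> {a1, a2}, B -> {b1}.
parent = {"root": None, "A": "root", "B": "root",
          "a1": "A", "a2": "A", "b1": "B"}
features = ancestor_features(parent, ["a1", "a2", "b1"])
```

In this toy case the vectors have one component per internal node (`A`, `B`, `root`). Leaves under the same parent receive identical vectors, while leaves in different branches differ, so any vector-based clustering algorithm applied to these features sees the hierarchy as distance.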

Acknowledgment

This study was supported by the Ministry of Education and Science of Russia within the framework of project No. 2.1182.2017/4.6 and by the Russian Foundation for Basic Research within the framework of project No. 16-47-732120 r_ofi_m.

Author information

Authors and Affiliations

Ulyanovsk State Technical University, Ulyanovsk, 432027, Russian Federation
Pavel Dudarin, Mikhail Samokhvalov & Nadezhda Yarushkina

Corresponding author

Correspondence to Pavel Dudarin.

Editor information

Editors and Affiliations

Department of Data Analysis and Artificial Intelligence, National Research University Higher School of Economics, Moscow, Russia
Sergei O. Kuznetsov
Federal Research Center Computer Science and Control, Institute of Informatics Problems, Moscow, Russia
Gennady S. Osipov
Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, Russia
Vadim L. Stefanuk

About this paper

Cite this paper

Dudarin, P., Samokhvalov, M., Yarushkina, N. (2018). An Approach to Feature Space Construction from Clustering Feature Tree. In: Kuznetsov, S., Osipov, G., Stefanuk, V. (eds) Artificial Intelligence. RCAI 2018. Communications in Computer and Information Science, vol 934. Springer, Cham. https://doi.org/10.1007/978-3-030-00617-4_17

DOI: https://doi.org/10.1007/978-3-030-00617-4_17
Published: 08 September 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00616-7
Online ISBN: 978-3-030-00617-4
eBook Packages: Computer Science, Computer Science (R0)
