Development of new agglomerative and performance evaluation models for classification

Vijaya Prabhagar, M.; Punniyamoorthy, M.

doi:10.1007/s00521-019-04297-4

Original Article
Published: 27 June 2019

Volume 32, pages 2589–2600, (2020)
Neural Computing and Applications

M. Vijaya Prabhagar¹ &
M. Punniyamoorthy¹

Abstract

This study proposes two new hierarchical clustering methods, the weighted method and the neighbourhood method, to overcome shortcomings of existing hierarchical clustering methods: low accuracy, an inability to separate clusters properly and the formation of too many clusters. We also propose three new criteria for assessing the performance of clustering methods: (1) overall effectiveness, the product of the overall efficiency and the accuracy of the clusters, which evaluates hierarchical clustering methods on datasets with class labels; (2) a modified structure strength S(c), which addresses the difficulty of determining the number of clusters with hierarchical clustering methods on datasets without class labels; and (3) the R-value, the ratio of the determinant of the between-cluster sum of squares and cross products matrix to the determinant of the within-cluster sum of squares and cross products matrix, which validates the performance of hierarchical clustering methods on datasets without class labels. The new algorithms achieved high accuracy, separated the clusters properly and produced fewer clusters. Compared with existing algorithms on the newly developed performance criteria, the new algorithms performed better. The experiments used twelve datasets with class labels and six datasets without class labels.
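The R-value described in the abstract can be illustrated with a short sketch. The abstract gives only the verbal definition (determinant of the between-cluster sum of squares and cross products matrix divided by the determinant of the within-cluster one), so the code below assumes the standard multivariate decomposition of these matrices; the function names sscp_matrices and r_value and the toy data are hypothetical and are not taken from the paper. Note that the between-cluster matrix has rank at most (number of clusters - 1), so its determinant is zero unless there are more clusters than features.

```python
import numpy as np


def sscp_matrices(X, labels):
    """Return (B, W), the between- and within-cluster SSCP matrices.

    Assumes the usual decomposition T = B + W of the total sum of
    squares and cross products matrix (an assumption, not the paper's
    stated formula).
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    grand_mean = X.mean(axis=0)
    p = X.shape[1]
    B = np.zeros((p, p))
    W = np.zeros((p, p))
    for c in np.unique(labels):
        Xc = X[labels == c]
        mean_c = Xc.mean(axis=0)
        # Between-cluster part: cluster size times outer product of
        # (cluster mean - grand mean) with itself.
        d = (mean_c - grand_mean)[:, None]
        B += len(Xc) * (d @ d.T)
        # Within-cluster part: scatter of points around their cluster mean.
        resid = Xc - mean_c
        W += resid.T @ resid
    return B, W


def r_value(X, labels):
    """Ratio det(B) / det(W); larger values suggest better separation."""
    B, W = sscp_matrices(X, labels)
    return np.linalg.det(B) / np.linalg.det(W)


# Hypothetical toy data: three compact, well-separated clusters in 2-D.
X = np.array([
    [0, 0], [1, 0], [0, 1],
    [10, 0], [11, 0], [10, 1],
    [0, 10], [1, 10], [0, 11],
], dtype=float)
good_labels = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])
bad_labels = np.array([0, 1, 2, 0, 1, 2, 0, 1, 2])

print("R-value, natural grouping:", r_value(X, good_labels))
print("R-value, scrambled grouping:", r_value(X, bad_labels))
```

Under these assumptions the natural grouping yields a far larger ratio than the scrambled one, which is the behaviour one would expect from a criterion meant to validate clusterings of datasets without class labels.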

Author information

Authors and Affiliations

National Institute of Technology, Tiruchirappalli, India
M. Vijaya Prabhagar & M. Punniyamoorthy

Corresponding author

Correspondence to M. Punniyamoorthy.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

About this article

Cite this article

Vijaya Prabhagar, M., Punniyamoorthy, M. Development of new agglomerative and performance evaluation models for classification. Neural Comput & Applic 32, 2589–2600 (2020). https://doi.org/10.1007/s00521-019-04297-4

Received: 31 August 2017
Accepted: 17 June 2019
Published: 27 June 2019
Issue Date: April 2020
DOI: https://doi.org/10.1007/s00521-019-04297-4
