Performance Analysis of Clustering Algorithm in Data Mining in R Language

Jayaram Reddy, Avulapalli; Tripathy, Balakrushna; Nimje, Seema; Sree Ganga, Gopalam; Varnasree, Kamireddy

doi:10.1007/978-981-13-1936-5_39

Part of the book series: Communications in Computer and Information Science (CCIS, volume 837)

Included in the following conference series:

International Conference on Soft Computing Systems

Abstract

Data mining is the extraction of interesting (constructive, relevant, previously unexplored, and considerably valuable) patterns or information from very large datasets. In other words, it is the exploration of the associations, links, and overall patterns that exist in large datasets but are hidden or unknown. To analyse the performance of different clustering techniques, we used the R language, a tool that lets the user examine data from several perspectives in order to obtain sound experimental results and derive meaningful relationships. In this paper, we study, analyse, and compare various algorithms and techniques for cluster analysis using R. Our aim is to compare five clustering algorithms and to validate them with internal and external validation measures such as the silhouette plot, the Dunn index, and connectivity, among others. Finally, on the basis of the results obtained, we analyse, compare, and validate the efficiency of the algorithms with respect to one another.
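To make two of the internal validation measures named in the abstract concrete, the following is a minimal sketch of the silhouette width and the Dunn index. It is written in plain Python rather than R, it is not the authors' code, and the function names (`silhouette_widths`, `dunn_index`), the six toy points, and both labelings are assumptions introduced purely for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def silhouette_widths(points, labels):
    """Silhouette width s(i) = (b - a) / max(a, b) for every point.

    a is the mean distance to the other members of the point's own cluster,
    b is the smallest mean distance to the members of any other cluster.
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    widths = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            widths.append(0.0)  # common convention for singleton clusters
            continue
        a = sum(euclidean(points[i], points[j]) for j in own) / len(own)
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        widths.append((b - a) / max(a, b))
    return widths


def dunn_index(points, labels):
    """Smallest between-cluster distance divided by the largest cluster diameter.

    Assumes at least two clusters and at least one cluster with two
    distinct points, so that the denominator is non-zero.
    """
    min_between = math.inf
    max_diameter = 0.0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = euclidean(points[i], points[j])
            if labels[i] == labels[j]:
                max_diameter = max(max_diameter, d)
            else:
                min_between = min(min_between, d)
    return min_between / max_diameter


# Hypothetical toy data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
good_labels = [0, 0, 0, 1, 1, 1]  # matches the visible grouping
bad_labels = [0, 1, 0, 1, 0, 1]   # deliberately mixes the two groups

for name, labels in (("good", good_labels), ("bad", bad_labels)):
    widths = silhouette_widths(points, labels)
    mean_width = sum(widths) / len(widths)
    print(name, "mean silhouette:", round(mean_width, 3),
          "Dunn index:", round(dunn_index(points, labels), 3))
```

For both measures a larger value indicates more compact and better separated clusters, so the labeling that matches the visible grouping should score higher than the mixed one. Connectivity, the third measure mentioned, is not covered by this sketch.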


Author information

Authors and Affiliations

School of Information Technology and Engineering, VIT, Vellore, India
Avulapalli Jayaram Reddy, Seema Nimje, Gopalam Sree Ganga & Kamireddy Varnasree
School of Computer Science and Engineering, VIT, Vellore, India
Balakrushna Tripathy

Corresponding author

Correspondence to Avulapalli Jayaram Reddy.

Editor information

Editors and Affiliations

Department of Computer Science, Faculty of Electrical Engineering and Computer Science VŠB-TUO, Ostrava-Poruba, Czech Republic
Ivan Zelinka
Faculty of Applied Informatics, Tomas Bata University in Zlín, Zlín, Czech Republic
Roman Senkerik
School of Electrical Sciences, Indian Institute of Technology Bhubaneswar, Bhubaneswar, Odisha, India
Ganapati Panda
Baselios Mathews II College of Engineering, Kerala, India
Padma Suresh Lekshmi Kanthan

About this paper

Cite this paper

Jayaram Reddy, A., Tripathy, B., Nimje, S., Sree Ganga, G., Varnasree, K. (2018). Performance Analysis of Clustering Algorithm in Data Mining in R Language. In: Zelinka, I., Senkerik, R., Panda, G., Lekshmi Kanthan, P. (eds) Soft Computing Systems. ICSCS 2018. Communications in Computer and Information Science, vol 837. Springer, Singapore. https://doi.org/10.1007/978-981-13-1936-5_39

DOI: https://doi.org/10.1007/978-981-13-1936-5_39
Published: 25 September 2018
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-1935-8
Online ISBN: 978-981-13-1936-5
eBook Packages: Computer Science, Computer Science (R0)
