Abstract
A clustering algorithm for big data was designed, based on the definition of a similarity measure. The traditional similarity measure for overlapped data was illustrated and then applied to non-overlapped data. A similarity measure for high-dimensional data was obtained by drawing on information from neighboring data. The usefulness of the measure was proved and then verified by calculating the similarity for an artificial data example.
Copyright information
© 2013 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Lee, S., Sun, Y. (2013). Similarity Measure Design on Big Data. In: Jung, HK., Kim, J., Sahama, T., Yang, CH. (eds) Future Information Communication Technology and Applications. Lecture Notes in Electrical Engineering, vol 235. Springer, Dordrecht. https://doi.org/10.1007/978-94-007-6516-0_90
DOI: https://doi.org/10.1007/978-94-007-6516-0_90
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-007-6515-3
Online ISBN: 978-94-007-6516-0
eBook Packages: Engineering, Engineering (R0)