Similarity Measure Design on Big Data

Lee, Sanghyuk; Sun, Yan

doi:10.1007/978-94-007-6516-0_90

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 235)

Abstract

A clustering algorithm for big data is designed, based on the definition of a similarity measure. The traditional similarity measure for overlapped data is illustrated and then applied to non-overlapped data. A similarity measure for high-dimensional data is obtained by using information from neighboring data. Its usefulness is proved and then verified by computing the similarity for an artificial data example.
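
The abstract does not state the measure itself. As a generic illustration only, and not the chapter's definition, a distance-based similarity between two equal-length vectors with values in [0, 1] can be sketched as one minus the mean absolute difference. The function name `hamming_similarity` and the sample vectors are assumptions made for this sketch.

```python
from typing import Sequence


def hamming_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 minus the mean absolute difference of a and b.

    Assumes both inputs have the same length and values in [0, 1].
    Hypothetical helper for illustration, not the authors' measure.
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    distance = sum(abs(x - y) for x, y in zip(a, b)) / len(a)
    return 1.0 - distance


# Identical (fully overlapped) vectors give the maximum similarity.
overlapped = hamming_similarity([0.2, 0.8, 0.5], [0.2, 0.8, 0.5])

# Complementary (non-overlapped) vectors give the minimum similarity.
disjoint = hamming_similarity([1.0, 0.0], [0.0, 1.0])

# A partially overlapping pair falls in between.
partial = hamming_similarity([0.5, 0.5], [0.0, 1.0])
```

Under this sketch, every pair of fully non-overlapped vectors scores the same minimum value, which may be one reason a measure for such data would need additional information such as that from neighboring data.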

Author information

Authors and Affiliations

Department of Electrical and Electronic Engineering, Xi’an Jiaotong-Liverpool University, Suzhou, 215123, China
Sanghyuk Lee
School of Business Economic and Management, Xi’an Jiaotong-Liverpool University, Suzhou, 215123, China
Yan Sun

Corresponding author

Correspondence to Sanghyuk Lee.

Editor information

Editors and Affiliations

Computer Engineering, Paichai University, Baejae-ro, Seo-gu 155-40, Daejeon, 302-735, Korea, Republic of (South Korea)
Hoe-Kyung Jung
Electronic Engineering, Mokwon University, Doanbookro, Seo-Ku 88, Daejeon, 302-729, Korea, Republic of (South Korea)
Jung Tae Kim
PO Box 2434, Electrical Engineering, QUT Gardens Point, George Street 2, Brisbane, 4001, Queensland, Australia
Tony Sahama
Graduate Institute of Information, National Kaohsiung Normal University, No.116, Heping 1st Rd., Lingya District, Kaohsiung City, 80201, Taiwan R.O.C.
Chung-Huang Yang

About this chapter

Cite this chapter

Lee, S., Sun, Y. (2013). Similarity Measure Design on Big Data. In: Jung, HK., Kim, J., Sahama, T., Yang, CH. (eds) Future Information Communication Technology and Applications. Lecture Notes in Electrical Engineering, vol 235. Springer, Dordrecht. https://doi.org/10.1007/978-94-007-6516-0_90

DOI: https://doi.org/10.1007/978-94-007-6516-0_90
Published: 25 May 2013
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-007-6515-3
Online ISBN: 978-94-007-6516-0
eBook Packages: Engineering, Engineering (R0)
