Soft Computing, Volume 23, Issue 10, pp 3347–3364

Semantic distance between vague concepts in a framework of modeling with words

  • Weifeng Zhang
  • Hua Hu
  • Haiyang Hu
  • Jinglong Fang
Methodologies and Application


Effectively measuring the similarity or dissimilarity of two vague concepts is a key step in reasoning and computing with vague concepts. In this paper, we define semantic distances between data instances and vague concepts by modeling vagueness in a framework called label semantics. We also propose two clustering methods based on these semantic distances, which can cluster data instances and vague concepts simultaneously. To evaluate our approach, we conduct several experimental studies on three datasets: Corel images and labels, Reuters-21578, and TDT2. The results illustrate that the proposed distances can effectively evaluate semantic similarities between data instances and vague concepts.


Vague concepts, Label semantics, Semantic distance, Clustering



This work is supported by the Natural Science Foundation of China (Grant Nos. 61572162 and 61272188) and the Zhejiang Provincial Key Science and Technology Project Foundation (No. 2017C01010).

Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.



Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, China
  2. Science and Technology on Communication Information Security Control Laboratory, Jiangnan Electronic Communication Institute, Jiaxing, China
