Text Clustering Using Reference Centered Similarity Measure
Most clustering techniques must assume some cluster relationship among the objects in the data set. Similarity between items is usually defined either explicitly or implicitly. In this paper, we introduce a novel multiple-reference-centered similarity measure and two related clustering approaches. The main difference between a traditional dissimilarity/similarity measure and ours is that the former uses a single viewpoint, the origin, whereas ours uses a number of reference points. Using several reference points, a more informative assessment of similarity can be achieved. Two criterion functions for document clustering are proposed based on this novel measure. We compare them with clustering based on the well-known cosine similarity and show the improvement. A performance analysis is conducted and the results are compared.
Keywords: Document Clustering, Similarity Measure, Cosine Similarity, Multi View Point Similarity Measure
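As an illustration only, the sketch below contrasts cosine similarity, which views two documents from a single point (the origin), with a similarity averaged over several reference points. The abstract does not state the formula, so the form used here, the mean of (u - r) . (v - r) over the reference points r, is an assumption, and the function names and toy term vectors are hypothetical.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def cosine_similarity(u, v):
    """Similarity seen from a single viewpoint, the origin."""
    return dot(normalize(u), normalize(v))


def multi_reference_similarity(u, v, references):
    """Mean of (u - r) . (v - r) over all reference points r.

    Assumed form, not taken from the abstract. The caller is expected
    to pass unit-length document vectors.
    """
    total = 0.0
    for r in references:
        du = [a - b for a, b in zip(u, r)]
        dv = [a - b for a, b in zip(v, r)]
        total += dot(du, dv)
    return total / len(references)


# Toy term-frequency vectors (hypothetical data).
doc_a = normalize([2, 1, 0, 0])
doc_b = normalize([1, 2, 0, 0])
# Reference points: documents assumed to lie outside the cluster of doc_a and doc_b.
references = [normalize([0, 0, 1, 2]), normalize([0, 0, 2, 1])]

print(cosine_similarity(doc_a, doc_b))
print(multi_reference_similarity(doc_a, doc_b, references))
# With the origin as the only reference point, the measure reduces to the
# cosine similarity of the unit vectors.
print(multi_reference_similarity(doc_a, doc_b, [[0, 0, 0, 0]]))
```

Under this assumed form, two documents that both lie far from the reference points in a similar direction score higher than cosine similarity alone would indicate, which is one way to read the claim that several reference points give a more informative assessment.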