Clustering based on words distances
To find the relevance of keywords in hot topics effectively, we propose a clustering method based on word distances. We first calculate the distances between words, then calculate the local (sectional) density of each word. Words that have a high local density and lie far from other high-density words are taken as the center points of the clustering. After finding the center points, we perform the clustering. The method uses a decision diagram to estimate the number of clusters. Finally, we evaluate the results using accuracy rate and recall rate.
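The steps described in the abstract (pairwise distances, a density per word, centers chosen by density and separation, then assignment) can be sketched as follows. This is only a minimal illustration in the style of density-peak clustering, not the authors' implementation: the function name `density_peaks`, the `cutoff` parameter, the neighbor-count definition of density, and the toy one-dimensional "word positions" are all assumptions made for the example. The number of clusters is passed in by hand here, whereas the paper estimates it from a decision diagram.

```python
def density_peaks(dist, cutoff, n_clusters):
    """Cluster items given a symmetric pairwise distance matrix (list of lists).

    Assumed definitions (not taken from the paper):
      rho[i]   = number of other items closer than `cutoff`
      delta[i] = distance to the nearest item that ranks higher in density
    """
    n = len(dist)
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < cutoff)
           for i in range(n)]

    # Rank items from densest to sparsest; ties are broken by index.
    order = sorted(range(n), key=lambda i: (-rho[i], i))

    delta = [0.0] * n
    parent = [-1] * n  # nearest higher-ranked item, used for assignment
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest item has no higher-ranked neighbor;
            # by convention it gets its largest distance.
            delta[i] = max(dist[i])
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        parent[i] = j

    # Items with both high density and high separation become centers.
    # Plotting rho against delta gives the decision diagram.
    centers = sorted(order, key=lambda i: -rho[i] * delta[i])[:n_clusters]

    labels = [-1] * n
    for cluster_id, c in enumerate(centers):
        labels[c] = cluster_id
    # Visit items from densest to sparsest so each parent is labeled first.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return rho, delta, centers, labels


# Toy data: hypothetical 1-D positions standing in for word distances.
positions = [0, 1, 2, 50, 51, 52]
dist = [[abs(a - b) for b in positions] for a in positions]
rho, delta, centers, labels = density_peaks(dist, cutoff=1.5, n_clusters=2)
print(centers, labels)
```

On this toy input the two middle items of each group are selected as centers, and every other item inherits the label of its nearest denser neighbor.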
Keywords: Hot topic, Clustering, Accuracy rate, Recall rate
This research is supported by the following funding programs: the National Natural Science Foundation of China (61402309), the Fundamental Research Funds for the Central Universities (No. XDJK2014B012), the National Social Science Foundation of China (13CGL146), the National Social Science Foundation of China (15BGL2729), and the Study on the Key Common Characteristics of Network Transaction Fraud (14SKF01).