Applications of Soft Computing for the Web, pp 167-179

# An Improved Clustering Method for Text Documents Using Neutrosophic Logic

## Abstract

Clustering, a core task in data mining, automatically groups similar documents into the same cluster. Clusters keep similar text documents together in one place, and they allow a new, unseen document to be assigned later to one of the known clusters on the basis of a similarity measure. Automatic clustering of text is usually based on the words a document contains. In this work, we use two approaches to clustering based on Neutrosophic logic. Fuzzy logic takes into account only two values, the degree of truth and the degree of falsity, whereas Neutrosophic logic adds a third factor, indeterminacy. Indeterminacy describes the situation in which it is uncertain which cluster a particular document belongs to. The first approach adds the indeterminacy factor of Neutrosophic logic to the Fuzzy C Means clustering method and modifies the formula that calculates the cluster centers and the truth membership of documents toward clusters. The second approach has three phases. First, generate the dataset from the relative frequency of words in each document. Second, choose seed documents for the clusters using the Euclidean distance between documents. Third, calculate the *T*, *I*, and *F* values of every document with respect to every cluster, and then assign each document to a cluster on the basis of these values.
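The first two phases of the second approach can be illustrated with a minimal Python sketch. It represents each document by the relative frequency of its words and computes pairwise Euclidean distances. The abstract does not state how seed documents are selected from those distances, so the farthest-pair rule used here is purely an assumption for illustration, as are the toy documents and the function names `relative_frequencies` and `euclidean`. The *T*, *I*, and *F* calculation is not sketched because the abstract does not give its formulas.

```python
from collections import Counter
from itertools import combinations
import math


def relative_frequencies(doc, vocabulary):
    """Return word counts divided by document length, ordered by vocabulary."""
    words = doc.lower().split()
    counts = Counter(words)
    total = len(words)
    return [counts[w] / total if total else 0.0 for w in vocabulary]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Toy corpus (hypothetical, not from the chapter).
docs = [
    "fuzzy logic fuzzy sets",
    "fuzzy sets logic",
    "football match football goal",
]

# Phase 1: build the dataset from relative word frequencies.
vocabulary = sorted({w for d in docs for w in d.lower().split()})
vectors = [relative_frequencies(d, vocabulary) for d in docs]

# Phase 2: Euclidean distance between every pair of documents.
distances = {
    (i, j): euclidean(vectors[i], vectors[j])
    for i, j in combinations(range(len(docs)), 2)
}

# ASSUMPTION: take the two farthest-apart documents as seeds for two clusters.
# The source only says seeds are decided "with the help of" Euclidean distance.
seeds = max(distances, key=distances.get)

for pair, dist in sorted(distances.items()):
    print(pair, round(dist, 3))
print("seed documents:", seeds)
```

In this toy corpus the two documents about fuzzy logic lie close together and the football document lies far from both, so the assumed rule picks one document from each topic as a seed.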

## Keywords

Neutrosophic logic, Fuzzy logic, Clustering methods, Fuzzy C Means clustering, Neutrosophic clustering, Text mining