Comparison of Clustering Methods in Cotton Textile Industry
Abstract
Clustering is the task of partitioning data objects into groups so that the objects within a cluster are similar to one another and dissimilar to the objects in other clusters. An efficient random algorithm for choosing a good k is used to estimate the optimal number of clusters. In this research, two important clustering algorithms, the centroid-based k-means and the representative-object-based fuzzy c-means, are compared on the original real-world U.S. cotton textile and apparel imports data set. This data set is rarely analyzed; it is shaped by the business, economic, and political environments, and its behaviour is not well known. Analyzing several different real-world economic and industrial data sets of one country makes it possible to predict that country's economic development.
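To make the contrast between the two algorithms concrete, the following is a minimal sketch, not the authors' implementation. It assumes NumPy, uses synthetic two-dimensional data in place of the cotton textile and apparel imports data set, and fixes the number of clusters at 2 rather than estimating it. The function names `kmeans` and `fuzzy_cmeans` and all parameter values are illustrative assumptions. k-means assigns each point to exactly one cluster, whereas fuzzy c-means returns a membership degree for every point in every cluster.

```python
import numpy as np


def kmeans(X, k, n_iter=100, seed=0):
    """Lloyd-style k-means: each point belongs to its nearest centroid."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every centroid.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Recompute each centroid; keep the old one if its cluster is empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels


def fuzzy_cmeans(X, c, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Fuzzy c-means: each point has a membership degree in every cluster."""
    rng = np.random.default_rng(seed)
    # Random initial memberships, normalised so each row sums to 1.
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # Cluster centres are means weighted by membership**m.
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Distances to centres, floored to avoid division by zero.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)
        # Membership update: u_ij proportional to d_ij ** (-2 / (m - 1)).
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


# Synthetic stand-in data: two well-separated groups of 50 points each.
rng = np.random.default_rng(42)
X = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(50, 2)),
    rng.normal(loc=5.0, scale=0.5, size=(50, 2)),
])

km_centers, km_labels = kmeans(X, k=2)
fcm_centers, U = fuzzy_cmeans(X, c=2)
fcm_labels = U.argmax(axis=1)  # harden fuzzy memberships for comparison

# Cluster indices are arbitrary, so compare up to a swap of the two labels.
agreement = max(np.mean(km_labels == fcm_labels),
                np.mean(km_labels != fcm_labels))
print("k-means centres:\n", km_centers)
print("fuzzy c-means centres:\n", fcm_centers)
print("label agreement:", agreement)
```

On real data the two methods can disagree near cluster boundaries, and the rows of `U` show how uncertain such assignments are.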
Keywords
Data clustering, Number of clusters, k-means algorithm, Fuzzy c-means, Random algorithm
Notes
Acknowledgments
The authors acknowledge the support for research project TR 36030, funded by the Ministry of Science and Technological Development of Serbia.