A novel rough value set categorical clustering technique for supplier base management


Successful Supplier Base Management (SBM) has significant business implications and is required for effective handling of supply-side exceptions. Clustering keeps the number of suppliers manageable by grouping them on the basis of similar characteristics, which reduces the number of variables affecting operations. Several existing categorical clustering techniques for such grouping perform better than their predecessors; however, accuracy, uncertainty, entropy and computation remain key measures that need improvement. In particular, existing clustering techniques cluster randomly when the data are of an independent and insignificant type. The aim of this research is to introduce a novel rough value set based categorical clustering technique named Maximum Value Attribute (MVA). The proposed MVA technique overcomes the issues of existing techniques by combining the concept of Number of Automated Clusters (NoACs) with the rough value set, which makes it a novel and significant clustering idea. A few relevant and necessary propositions are illustrated to prove the effectiveness of NoACs. The proposed technique is compared with existing rough set based and classical categorical clustering techniques in terms of purity, entropy, accuracy, rough accuracy, time and iterations. Experimental results on an SBM data set and fifteen (15) benchmark data sets reveal better performance of MVA. The results show a significant overall percentage improvement of the proposed MVA technique over existing rough set based techniques for iterations (99.7%), time (99.4%), number of obtained clusters (84%), purity (32%), entropy (32%), accuracy (20%), and rough accuracy (13%).
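
The abstract names purity, entropy and rough accuracy as evaluation measures but does not define them here. The following is a minimal Python sketch of how such measures are commonly computed, assuming the textbook definitions (majority-class purity, size-weighted class entropy, and Pawlak's ratio of lower to upper approximation); the exact formulas used in the paper may differ. The function names and the small supplier table are hypothetical and only serve as illustration.

```python
from collections import Counter, defaultdict
from math import log2


def group_by_cluster(clusters, classes):
    """Map each cluster id to the list of true class labels of its members."""
    groups = defaultdict(list)
    for cluster_id, class_label in zip(clusters, classes):
        groups[cluster_id].append(class_label)
    return groups


def purity(clusters, classes):
    """Fraction of objects that fall in the majority class of their cluster."""
    groups = group_by_cluster(clusters, classes)
    majority_total = sum(max(Counter(g).values()) for g in groups.values())
    return majority_total / len(classes)


def entropy(clusters, classes):
    """Size-weighted average of the class entropy (in bits) within each cluster."""
    groups = group_by_cluster(clusters, classes)
    total = 0.0
    for members in groups.values():
        size = len(members)
        h = -sum((c / size) * log2(c / size) for c in Counter(members).values())
        total += (size / len(classes)) * h
    return total


def rough_accuracy(objects, attribute, target):
    """Pawlak accuracy |lower| / |upper| of `target` with respect to one attribute.

    Objects sharing the same value of `attribute` are indiscernible and
    form one equivalence block.
    """
    blocks = defaultdict(set)
    for name, row in objects.items():
        blocks[row[attribute]].add(name)
    lower = set().union(*(b for b in blocks.values() if b <= target))
    upper = set().union(*(b for b in blocks.values() if b & target))
    return len(lower) / len(upper) if upper else 1.0


# Hypothetical example data (not taken from the paper).
cluster_ids = [0, 0, 0, 1, 1, 1]
true_classes = ["a", "a", "b", "b", "b", "b"]
suppliers = {
    "s1": {"quality": "high"},
    "s2": {"quality": "high"},
    "s3": {"quality": "low"},
    "s4": {"quality": "low"},
}
preferred = {"s1", "s2", "s3"}

print(purity(cluster_ids, true_classes))
print(entropy(cluster_ids, true_classes))
print(rough_accuracy(suppliers, "quality", preferred))
```

Higher purity and lower entropy indicate clusters that agree more closely with the known classes. A rough accuracy of 1 means the target set is exactly describable by the chosen attribute.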




Author information



Corresponding author

Correspondence to Jamal Uddin.


About this article


Cite this article

Uddin, J., Ghazali, R., Deris, M.M. et al. A novel rough value set categorical clustering technique for supplier base management. Computing (2021). https://doi.org/10.1007/s00607-021-00950-w

Keywords


  • Clustering
  • Indiscernibility relation
  • Rough sets
  • Supplier base management
  • Value set

Mathematics Subject Classification

  • 03Exx
  • 03E99
  • 00-01
  • 00-02
  • 00Axx
  • 00Bxx