Multi-valued attribute and multi-labeled data decision tree algorithm

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

This paper analyzes existing decision tree algorithms for handling multi-valued and multi-labeled data. These algorithms have the following shortcomings: choosing the splitting attribute is difficult, and the similarity calculation is not precise enough. To address these deficiencies, this paper proposes a new decision tree algorithm for multi-valued and multi-labeled data (AMDT). First, a new formula, sim5, is proposed for calculating the similarity between two label-sets in the child nodes. It considers both the elements that appear in both label-sets and the elements that appear in neither, and adjusts their relative contribution with a coefficient α, so that the label-set similarity calculation is more comprehensive and accurate. Second, new conditions are proposed for deciding when a node stops splitting. Finally, a prediction method is given. Comparison experiments with existing algorithms (MMC, SSC and SCC_SP_1) show that AMDT achieves higher predictive accuracy.
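
The exact definition of sim5 appears only in the full text; purely as an illustration of the idea sketched above, the following Python fragment computes a label-set similarity that weights the labels present in both sets against the labels absent from both, with a coefficient alpha controlling the trade-off. The function name, signature and default value of alpha are assumptions for this sketch, not the paper's formula.

    # Illustrative sketch only -- not the paper's sim5 formula.
    def label_set_similarity(set_a, set_b, label_universe, alpha=0.5):
        a, b = set(set_a), set(set_b)
        n = len(label_universe)
        if n == 0:
            return 0.0
        both_present = len(a & b)                         # labels appearing in both label-sets
        both_absent = len(set(label_universe) - (a | b))  # labels appearing in neither label-set
        # alpha trades off shared presence against shared absence
        return alpha * both_present / n + (1 - alpha) * both_absent / n

    # Example: a universe of five labels, two records sharing one label and both lacking two
    print(label_set_similarity({"L1", "L2"}, {"L2", "L3"},
                               {"L1", "L2", "L3", "L4", "L5"}, alpha=0.6))  # 0.28

Under this reading, a larger alpha emphasizes agreement on labels that are present, while a smaller alpha emphasizes agreement on labels that are absent; the abstract states only that sim5 accounts for both cases, with α adjusting their proportion.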


References

  1. Quinlan JR (1986) Induction of decision trees. Mach Learn 1:81–106

  2. Quinlan JR (1987) Simplifying decision trees. Int J Man Mach Stud 27(3):221–234

  3. Chandra B, Kothari R, Paul P (2010) A new node splitting measure for decision tree construction. Pattern Recognit 43:2725–2731

  4. Chang PC, Fan CY, Dzan WY (2010) A CBR-based fuzzy decision tree approach for database classification. Expert Syst Appl 37:214–225

  5. Nandgaonkar S, Vahida Z, Pradip K (2009) Efficient decision tree construction for classifying numerical data. In: International conference on advances in recent technologies in communication and computing, pp 761–765

  6. Wang XZ, Yang CX (2007) Merging-branches impact on decision tree induction. Chin J Comput 30(8):1251–1258

  7. Wei JM et al (2006) Rough set based approach for inducing decision trees. In: RSKT 2006, LNAI, vol 4062, pp 421–429

  8. Pang HL, Gao ZW et al (2008) Study on constructing method of classifying decision tree based on variable precision rough set. Syst Eng Electron 30(11):2160–2163 (in Chinese)

  9. Miao DQ, Wang J (1997) Rough sets based approach for multivariate decision tree construction. J Softw 8(6):425–431 (in Chinese)

  10. Liang DL, Huang GX et al (2008) A new multivariate decision tree algorithm. Comput Sci 5(1):211–212 (in Chinese)

  11. Chen Y, Hsu C (2003) Constructing a multi-valued and multi-labeled decision tree. Expert Syst Appl 25(2):199–209

  12. Chou S, Hsu C (2005) MMDT: a multi-valued and multi-labeled decision tree classifier for data mining. Expert Syst Appl 28(2):799–812

  13. Zhao R, Li H (2007) Algorithm of multi-valued attribute and multi-labeled data decision tree. Comput Eng 33(13):87–89 (in Chinese)

  14. Li H, Chen SQ et al (2007) A multi-valued attribute and multi-labeled data decision tree algorithm. PR & AI 20(6):815–820 (in Chinese)

  15. Shafer JC, Agrawal R, Mehta M (1996) SPRINT: a scalable parallel classifier for data mining. In: Proceedings of the 22nd international conference on very large databases, pp 544–555

  16. Agrawal R, Ghosh S, Imielinski T et al (1992) An interval classifier for database mining applications. In: Proceedings of the 18th international conference on very large databases, pp 560–573

  17. Wang H, Zaniolo C (2000) CMP: a fast decision tree classifier using multivariate predictions. In: Proceedings of the 16th international conference on data engineering, pp 449–460

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grant Nos. 61073133, 60773084 and 60603023, and by the National Research Foundation for the Doctoral Program of Higher Education of China under Grant No. 20070151009.

Author information

Corresponding author

Correspondence to Weiguo Yi.

Cite this article

Yi, W., Lu, M. & Liu, Z. Multi-valued attribute and multi-labeled data decision tree algorithm. Int. J. Mach. Learn. & Cyber. 2, 67–74 (2011). https://doi.org/10.1007/s13042-011-0015-2
