Abstract
Many real-world applications require the simultaneous prediction of multiple target attributes. The techniques currently available for these problems either employ a global model that predicts all target attributes at once or aggregate individual models, each predicting one target. This paper introduces an alternative: an iterative classification strategy that exploits the relationships among the target attributes to achieve higher accuracy. The computation scheme is developed as a wrapper into which many standard single-target classification algorithms can simply be plugged in order to predict multiple targets simultaneously. An empirical evaluation on eight data sets shows that the proposed method outperforms (1) an approach that constructs an independent classifier for each target, (2) a multitask neural network method, and (3) ensembles of multi-objective decision trees, as measured by how often all target attributes are predicted correctly at the same time.
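To make the wrapper idea concrete, the following is a minimal sketch and not the authors' algorithm, whose details the abstract does not give. It assumes one plausible reading of "iterative classification": each target gets a model trained on the input features plus the other targets, and at prediction time the estimates of the other targets are refined over repeated passes until they stop changing. The names `OneNN`, `IterativeMultiTarget`, `make_base`, and `max_iter` are hypothetical, and the toy nearest-neighbour learner stands in for whatever single-target classifier would be plugged in.

```python
from typing import Callable, List, Sequence


class OneNN:
    """Toy single-target learner (1-nearest neighbour, Hamming distance).

    Hypothetical stand-in for any classifier exposing fit/predict.
    """

    def fit(self, X: Sequence[Sequence[int]], y: Sequence[int]) -> "OneNN":
        self.X = [tuple(row) for row in X]
        self.y = list(y)
        return self

    def predict(self, X: Sequence[Sequence[int]]) -> List[int]:
        def nearest(row):
            dists = [sum(a != b for a, b in zip(row, ref)) for ref in self.X]
            return self.y[dists.index(min(dists))]

        return [nearest(tuple(row)) for row in X]


def augment(X, Y, t):
    """Append every target except target t to each feature row."""
    return [
        list(x) + [y[j] for j in range(len(y)) if j != t]
        for x, y in zip(X, Y)
    ]


class IterativeMultiTarget:
    """Assumed wrapper: refine each target using current guesses of the others."""

    def __init__(self, make_base: Callable[[], object] = OneNN, max_iter: int = 10):
        self.make_base = make_base  # factory for the plugged-in learner
        self.max_iter = max_iter    # cap on refinement passes

    def fit(self, X, Y):
        n_targets = len(Y[0])
        column = lambda t: [y[t] for y in Y]
        # Independent models (features only) supply the initial guesses.
        self.init_ = [self.make_base().fit(X, column(t)) for t in range(n_targets)]
        # Augmented models see the features plus the true other targets.
        self.aug_ = [
            self.make_base().fit(augment(X, Y, t), column(t))
            for t in range(n_targets)
        ]
        return self

    def predict(self, X):
        n_targets = len(self.init_)
        cols = [model.predict(X) for model in self.init_]
        Y = [[cols[t][i] for t in range(n_targets)] for i in range(len(X))]
        for _ in range(self.max_iter):
            changed = False
            for t in range(n_targets):
                # Re-predict target t from features + current other targets.
                new = self.aug_[t].predict(augment(X, Y, t))
                for i, value in enumerate(new):
                    if Y[i][t] != value:
                        Y[i][t] = value
                        changed = True
            if not changed:  # estimates are stable, so stop early
                break
        return Y
```

Under these assumptions, any object with `fit` and `predict` methods could be passed through `make_base`, which is the sense in which the base learner is pluggable; the stopping rule and the way initial guesses are produced are design choices of this sketch rather than facts from the paper.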
Cite this article
Guo, H., Létourneau, S. Iterative classification for multiple target attributes. J Intell Inf Syst 40, 283–305 (2013). https://doi.org/10.1007/s10844-012-0224-5
DOI: https://doi.org/10.1007/s10844-012-0224-5