Iterative classification for multiple target attributes

Guo, Hongyu; Létourneau, Sylvain

doi:10.1007/s10844-012-0224-5

Published: 13 October 2012

Volume 40, pages 283–305, (2013)

Journal of Intelligent Information Systems

Hongyu Guo¹ &
Sylvain Létourneau¹

Abstract

Many real-world applications require the simultaneous prediction of multiple target attributes. The techniques currently available for these problems either employ a global model that simultaneously predicts all target attributes or rely on the aggregation of individual models, each predicting one target. This paper introduces a novel solution. Our approach employs an iterative classification strategy to exploit the relationships among multiple target attributes to achieve higher accuracy. The computation scheme is developed as a wrapper in which many standard single-target classification algorithms can be simply “plugged-in” to simultaneously predict multiple targets. An empirical evaluation using eight data sets shows that the proposed method outperforms (1) an approach that constructs independent classifiers for each target, (2) a multitask neural network method, and (3) ensembles of multi-objective decision trees in terms of simultaneously predicting all target attributes correctly.
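The wrapper idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details the abstract does not give; it is one plausible reading, assuming that (a) each target first gets an independent "bootstrap" prediction from the input features alone, and (b) each target is then repeatedly re-predicted by a second model that also sees the current predictions for the other targets, until the predictions stop changing. The names `OneNN` and `IterativeMultiTarget`, the 1-nearest-neighbour base learner, and the toy data are all hypothetical stand-ins chosen so the example runs with the standard library only.

```python
from typing import Callable, List, Sequence, Tuple


class OneNN:
    """Tiny 1-nearest-neighbour classifier (Hamming distance).

    Hypothetical stand-in for any single-target learner that could be plugged in.
    """

    def fit(self, X: Sequence[Sequence[int]], y: Sequence[int]) -> "OneNN":
        self.X = [tuple(row) for row in X]
        self.y = list(y)
        return self

    def predict_one(self, x: Sequence[int]) -> int:
        # Distance = number of positions where the two rows differ.
        dists = [sum(a != b for a, b in zip(x, row)) for row in self.X]
        return self.y[dists.index(min(dists))]  # ties: first closest row wins


class IterativeMultiTarget:
    """Assumed iterative wrapper: bootstrap predictions, then refine them."""

    def __init__(self, make_base: Callable[[], OneNN] = OneNN, max_iter: int = 10):
        self.make_base = make_base
        self.max_iter = max_iter

    def fit(self, X: Sequence[Sequence[int]], Y: Sequence[Sequence[int]]):
        self.n_targets = len(Y[0])
        # One model per target trained on the input features only.
        self.bootstrap = [
            self.make_base().fit(X, [y[t] for y in Y]) for t in range(self.n_targets)
        ]
        # One model per target trained on features plus the OTHER targets' true values.
        self.augmented = []
        for t in range(self.n_targets):
            X_aug = [tuple(x) + tuple(y[:t]) + tuple(y[t + 1:]) for x, y in zip(X, Y)]
            self.augmented.append(self.make_base().fit(X_aug, [y[t] for y in Y]))
        return self

    def predict_one(self, x: Sequence[int]) -> Tuple[int, ...]:
        current: List[int] = [m.predict_one(x) for m in self.bootstrap]
        for _ in range(self.max_iter):
            previous = list(current)
            for t in range(self.n_targets):
                others = current[:t] + current[t + 1:]
                current[t] = self.augmented[t].predict_one(tuple(x) + tuple(others))
            if current == previous:  # predictions are stable, stop iterating
                break
        return tuple(current)


# Toy data (made up for illustration): target 0 copies x[0], target 1 is x[0] XOR x[1].
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
Y = [(0, 0), (0, 1), (1, 1), (1, 0)]
model = IterativeMultiTarget().fit(X, Y)
predictions = [model.predict_one(x) for x in X]
```

Because the base learner is passed in as a factory (`make_base`), swapping in a different single-target classifier only requires an object with the same `fit` and `predict_one` methods, which mirrors the "plug-in" property the abstract claims for the wrapper.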

Notes

http://store.sae.org/caesar/

Author information

Authors and Affiliations

National Research Council Canada, 1200 Montreal Road, Ottawa, ON, Canada
Hongyu Guo & Sylvain Létourneau

Corresponding author

Correspondence to Hongyu Guo.

About this article

Cite this article

Guo, H., Létourneau, S. Iterative classification for multiple target attributes. J Intell Inf Syst 40, 283–305 (2013). https://doi.org/10.1007/s10844-012-0224-5

Received: 10 May 2012
Revised: 13 August 2012
Accepted: 19 September 2012
Published: 13 October 2012
Issue Date: April 2013
DOI: https://doi.org/10.1007/s10844-012-0224-5
