Fast block-wise partitioning for extreme multi-label classification

Liang, Yuefeng; Hsieh, Cho-Jui; Lee, Thomas C. M.

doi:10.1007/s10618-023-00945-5

Published: 26 July 2023

Data Mining and Knowledge Discovery, Volume 37, pages 2192–2215 (2023)

Abstract

Extreme multi-label classification aims to learn a classifier that annotates an instance with a relevant subset of labels from an extremely large label set. Many existing solutions embed the label matrix into a low-dimensional linear subspace, or examine the relevance of a test instance to every label via a linear scan. In practice, however, these approaches can be computationally exorbitant. To alleviate this drawback, we propose a Block-wise Partitioning (BP) pretreatment that divides all instances into disjoint clusters and attaches to each cluster the subset of labels most frequently tagged within it. A separate multi-label classifier is trained on each pair of instance cluster and label cluster, and the label set of a test instance is predicted by first routing it to the most appropriate instance cluster. Experiments on benchmark multi-label data sets show that BP pretreatment significantly reduces prediction time while retaining almost the same level of prediction accuracy.
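
The pipeline described in the abstract can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' implementation: the abstract does not say how the instance clusters are formed or which base classifier is used, so k-means with farthest-point initialization and a ridge-regression scorer are assumptions chosen for brevity, and the names `fit_blocks`, `predict_topk` and `labels_per_block` are hypothetical.

```python
import numpy as np


def farthest_point_init(X, k):
    """Deterministic seeding: start at X[0], then repeatedly add the point
    farthest from all centroids chosen so far."""
    idx = [0]
    d = ((X - X[0]) ** 2).sum(axis=1)
    for _ in range(1, k):
        nxt = int(d.argmax())
        idx.append(nxt)
        d = np.minimum(d, ((X - X[nxt]) ** 2).sum(axis=1))
    return X[idx].astype(float)


def nearest_centroid(X, centroids):
    """Index of the closest centroid (squared Euclidean) for each row of X."""
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)


def kmeans(X, k, n_iter=20):
    """Plain Lloyd iterations (assumed clustering step, not from the paper)."""
    centroids = farthest_point_init(X, k)
    for _ in range(n_iter):
        assign = nearest_centroid(X, centroids)
        for c in range(k):
            members = X[assign == c]
            if len(members):
                centroids[c] = members.mean(axis=0)
    return centroids, nearest_centroid(X, centroids)


def fit_blocks(X, Y, k, labels_per_block, ridge=1.0):
    """Partition instances, attach the most frequent labels to each cluster,
    and fit one scorer per (instance cluster, label subset) block."""
    centroids, assign = kmeans(X, k)
    blocks = []
    for c in range(k):
        rows = np.flatnonzero(assign == c)
        # Labels tagged most often inside this instance cluster.
        freq = Y[rows].sum(axis=0)
        label_ids = np.argsort(-freq, kind="stable")[:labels_per_block]
        Xc, Yc = X[rows], Y[rows][:, label_ids]
        # Ridge-regression scorer standing in for any multi-label classifier.
        A = Xc.T @ Xc + ridge * np.eye(X.shape[1])
        W = np.linalg.solve(A, Xc.T @ Yc)
        blocks.append((label_ids, W))
    return centroids, blocks


def predict_topk(x, centroids, blocks, top_k=1):
    """Route x to its nearest instance cluster, then rank only that
    cluster's label subset."""
    c = int(((centroids - x) ** 2).sum(axis=1).argmin())
    label_ids, W = blocks[c]
    scores = x @ W
    order = np.argsort(-scores, kind="stable")[:top_k]
    return label_ids[order]
```

In this sketch only one cluster's scorer is evaluated per test instance, so the prediction cost depends on the size of that cluster's label subset rather than on the full label set, which is the kind of saving the abstract attributes to BP pretreatment.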

Data availability

All data are publicly available with references provided in the paper.

Code availability

The code can be obtained from the authors. It will also be made publicly available on GitHub once the paper is accepted for publication.

Notes

These choices are adopted from the Extreme Classification Repository (Bhatia et al. 2016).

References

Agrawal R, Gupta A, Prabhu Y, Varma M (2013) Multi-label learning with millions of labels: recommending advertiser bid phrases for web pages. In: Proceedings of the 22nd international conference on World Wide Web, ACM, pp 13–24
Babbar R, Schölkopf B (2017) Dismec: distributed sparse machines for extreme multi-label classification. In: Proceedings of the tenth ACM international conference on web search and data mining, ACM, pp 721–729
Babbar R, Schölkopf B (2019) Data scarcity, robustness and extreme multi-label classification. Mach Learn, 1–23
Bhatia K, Dahiya K, Jain H, Kar P, Mittal A, Prabhu Y, Varma M (2016) The extreme classification repository: multi-label datasets and code. URL http://manikvarma.org/downloads/XC/XMLRepository.html
Bhatia K, Jain H, Kar P, Varma M, Jain P (2015) Sparse local embeddings for extreme multi-label classification. Adv Neural Inf Process Syst, 730–738
Chang W-C, Jiang D, Yu H-F, Teo CH, Zhang J, Zhong K, Kolluri K, Hu Q, Shandilya N, Ievgrafov V et al (2021) Extreme multi-label learning for semantic matching in product search. In: Proceedings of the 27th ACM SIGKDD conference on knowledge discovery & data mining, 2643–2651
Chang W-C, Yu H-F, Zhong K, Yang Y, Dhillon IS (2020) Taming pretrained transformers for extreme multi-label text classification. In: Proceedings of the 26th ACM SIGKDD international conference on knowledge discovery & data mining, pp 3163–3171
Crammer K, Singer Y (2001) On the algorithmic implementation of multiclass kernel-based vector machines. J Mach Learn Res 2:265–292
Dahiya K, Agarwal A, Saini D, Gururaj K, Jiao J, Singh A, Agarwal S, Kar P, Varma M (2021a) Siamesexml: siamese networks meet extreme classifiers with 100m labels. In: International conference on machine learning, PMLR, pp 2330–2340
Dahiya K, Saini D, Mittal A, Shaw A, Dave K, Soni A, Jain H, Agarwal S, Varma M (2021b) Deepxml: A deep extreme multi-label learning framework applied to short text documents. In: Proceedings of the 14th ACM international conference on web search and data mining, pp 31–39
Day WH, Edelsbrunner H (1984) Efficient algorithms for agglomerative hierarchical clustering methods. J Classif 1:7–24
Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the em algorithm. J R Stat Soc Ser B (Methodological) 39:1–22
Deng J, Dong W, Socher R, Li L-J, Li K, Fei-Fei L (2009) Imagenet: A large-scale hierarchical image database. In: Computer vision and pattern recognition, 2009. CVPR 2009. IEEE Conference on, IEEE, pp 248–255
Evron I, Moroshko E, Crammer K (2018) Efficient loss-based decoding on graphs for extreme classification. Adv Neural Inf Process Syst, 31
Fan R-E, Chang K-W, Hsieh C-J, Wang X-R, Lin C-J (2008) Liblinear: a library for large linear classification. J Mach Learn Res 9:1871–1874
Gupta V, Wadbude R, Natarajan N, Karnick H, Jain P, Rai P (2019) Distributional semantics meets multi-label learning. Proc AAAI Conf Artif Intell 33:3747–3754
Hsu DJ, Kakade SM, Langford J, Zhang T (2009) Multi-label prediction via compressed sensing. In: Advances in neural information processing systems, pp 772–780
Jain AK, Dubes RC (1988) Algorithms for clustering data. Prentice-Hall Inc
Jain H, Balasubramanian V, Chunduri B, Varma M (2019) Slice: scalable linear extreme classifiers trained on 100 million labels for related searches. In: Proceedings of the twelfth ACM international conference on web search and data mining, pp 528–536
Jain H, Prabhu Y, Varma M (2016) Extreme multi-label loss functions for recommendation, tagging, ranking & other missing label applications. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, ACM, pp 935–944
Jalan A, Kar P (2019) Accelerating extreme classification via adaptive feature agglomeration. In: Proceedings of the 28th international joint conference on artificial intelligence, pp 2600–2606
Jasinska K, Dembczynski K, Busa-Fekete R, Pfannschmidt K, Klerx T, Hullermeier E (2016) Extreme f-measure maximization using sparse probability estimates. In: International conference on machine learning, pp 1435–1444
Jiang T, Wang D, Sun L, Yang H, Zhao Z, Zhuang F (2021) Lightxml: transformer with dynamic negative sampling for high-performance extreme multi-label text classification. In: Proceedings of the AAAI conference on artificial intelligence, vol 35, pp 7987–7994
Khandagale S, Xiao H, Babbar R (2019) Bonsai-diverse and shallow trees for extreme multi-label classification. arXiv preprint arXiv:1904.08249
Khandagale S, Xiao H, Babbar R (2020) Bonsai: diverse and shallow trees for extreme multi-label classification. Mach Learn 109:2099–2119
Liu J, Chang W-C, Wu Y, Yang Y (2017) Deep learning for extreme multi-label text classification. In: Proceedings of the 40th international ACM SIGIR conference on research and development in information retrieval, ACM, pp 115–124
McAuley J, Leskovec J (2013) Hidden factors and hidden topics: understanding rating dimensions with review text. In: Proceedings of the 7th ACM conference on recommender systems, ACM, pp 165–172
Mittal A, Dahiya K, Agrawal S, Saini D, Agarwal S, Kar P, Varma M (2021) Decaf: deep extreme classification with label features. In: Proceedings of the 14th ACM international conference on web search and data mining, pp 49–57
Mittal A, Dahiya K, Malani S, Ramaswamy J, Kuruvilla S, Ajmera J, Chang K-h, Agarwal S, Kar P, Varma M (2022) Multi-modal extreme classification. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 12393–12402
Nasierding G, Tsoumakas G, Kouzani AZ (2009) Clustering based multi-label classification for image annotation and retrieval. In: 2009 IEEE international conference on systems, man and cybernetics SMC, IEEE, pp 4514–4519
Niculescu-Mizil A, Abbasnejad E (2017) Label filters for large scale multilabel classification. In: Artificial intelligence and statistics, pp 1448–1457
Panos A, Dellaportas P, Titsias MK (2021) Large scale multi-label learning using gaussian processes. Mach Learn 110:965–987
Partalas I, Kosmopoulos A, Baskiotis N, Artieres T, Paliouras G, Gaussier E, Androutsopoulos I, Amini M-R, Galinari P (2015) Lshtc: A benchmark for large-scale text classification. arXiv preprint arXiv:1503.08581
Prabhu Y, Kag A, Harsola S, Agrawal R, Varma M (2018) Parabel: partitioned label trees for extreme classification with application to dynamic search advertising. In: Proceedings of the 2018 world wide web conference, International world wide web conferences steering committee, pp 993–1002
Prabhu Y, Varma M (2014) Fastxml: a fast, accurate and stable tree-classifier for extreme multi-label learning. In: Proceedings of the 20th ACM SIGKDD international conference on knowledge discovery and data mining, ACM, pp 263–272
Qaraei M, Schultheis E, Gupta P, Babbar R (2021) Convex surrogates for unbiased loss functions in extreme classification with missing labels. In: Proceedings of the web conference, vol 2021, pp 3711–3720
Si S, Zhang H, Keerthi SS, Mahajan D, Dhillon IS, Hsieh C-J (2017) Gradient boosted decision trees for high dimensional sparse output. In: International conference on machine learning, pp 3182–3190
Siblini W, Kuntz P, Meyer F (2018) Craftml, an efficient clustering-based random forest for extreme multi-label learning
Snoek CG, Worring M, Van Gemert JC, Geusebroek J-M, Smeulders AW (2006) The challenge problem for automated detection of 101 semantic concepts in multimedia. In: Proceedings of the 14th ACM international conference on multimedia, ACM, pp 421–430
Tagami Y (2017) Annexml: Approximate nearest neighbor search for extreme multi-label classification. In: Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining, ACM, pp 455–464
Wei T, Tu W-W, Li Y-F, Yang G-P (2021) Towards robust prediction on tail labels. In: Proceedings of the 27th ACM SIGKDD conference on knowledge discovery & data mining, pp 1812–1820
Weston J, Makadia A, Yee H (2013) Label partitioning for sublinear ranking. In: International conference on machine learning, pp 181–189
Wetzker R, Zimmermann C, Bauckhage C (2008) Analyzing social bookmarking systems: a del.icio.us cookbook. In: Proceedings of the ECAI 2008 mining social data workshop, pp 26–30
Wydmuch M, Jasinska K, Kuznetsov M, Busa-Fekete R, Dembczynski K (2018) A no-regret generalization of hierarchical softmax to extreme multi-label classification. In: Advances in neural information processing systems, pp 6355–6366
Yen IE, Huang X, Dai W, Ravikumar P, Dhillon I, Xing E (2017) Ppdsparse: a parallel primal-dual sparse method for extreme classification. In: Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining, ACM, pp 545–553
Yen I E-H, Huang X, Ravikumar P, Zhong K, Dhillon I (2016) Pd-sparse: a primal and dual sparse approach to extreme multiclass and multilabel classification. In: International conference on machine learning, pp 3069–3077
You R, Dai S, Zhang Z, Mamitsuka H, Zhu S (2018) Attentionxml: extreme multi-label text classification with multi-label attention based recurrent neural networks. arXiv preprint arXiv:1811.01727
Yu H-F, Jain P, Kar P, Dhillon I (2014) Large-scale multi-label learning with missing labels. In: International conference on machine learning, pp 593–601
Zubiaga A (2012) Enhancing navigation on wikipedia with social tags. arXiv preprint arXiv:1202.5469

Funding

Liang and Lee were supported by the National Science Foundation under Grants CCF-1934568, DMS-1916125 and DMS-2113605. Hsieh was supported by the National Science Foundation under Grants CCF-1934568, IIS-1901527 and IIS-2008173.

Author information

Authors and Affiliations

Department of Statistics, University of California at Davis, Davis, CA, 95616, USA
Yuefeng Liang & Thomas C. M. Lee
Department of Computer Science, University of California at Los Angeles, Los Angeles, CA, 90095, USA
Cho-Jui Hsieh

Contributions

All authors contributed to the development of the proposed method and the writing of the manuscript. YL carried out most of the numerical experiments. All authors reviewed the manuscript.

Corresponding author

Correspondence to Thomas C. M. Lee.

Ethics declarations

Conflict of interest

The authors do not have any conflicts of interest/competing interests to declare.

Additional information

Responsible editor: Dragi Kocev.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liang, Y., Hsieh, CJ. & Lee, T.C.M. Fast block-wise partitioning for extreme multi-label classification. Data Min Knowl Disc 37, 2192–2215 (2023). https://doi.org/10.1007/s10618-023-00945-5

Received: 22 June 2022
Accepted: 06 April 2023
Published: 26 July 2023
Issue Date: November 2023
DOI: https://doi.org/10.1007/s10618-023-00945-5
