Abstract
Current state-of-the-art methods for text classification rely on large deep neural networks. For use cases such as product cataloging, their computational demands and lack of explainability can be problematic: not every online shop can afford the IT infrastructure needed to guarantee low latency in its web applications, and some shops require that sensitive categories never be confused. This motivates alternative methods that perform close to such models while remaining explainable and less resource-demanding. In this work, we evaluate an explainable framework consisting of a representation learning model for article descriptions and a similarity-based classifier. We contrast its results with those of DistilBERT, a strong low-resource baseline among deep learning models, on two retail article categorization datasets. Finally, we discuss the suitability of the presented models for deployment, considering not only their classification performance but also their resource costs and explainability.
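The abstract's framework pairs a representation model with a similarity-based classifier. A minimal sketch of that idea, assuming a toy bag-of-words embedding as a stand-in for the learned representation model and plain nearest-neighbor assignment as the similarity-based classifier (the paper's actual components may differ):

```python
# Sketch: classify an article description by the category of its most
# similar training description. The bag-of-words "embedding" below is a
# placeholder for a learned representation model; the similarity-based
# decision is transparent, since the supporting training example can be
# shown as an explanation.
from collections import Counter
from math import sqrt

def embed(text):
    # Stand-in for a representation model: sparse token-count vector.
    return Counter(text.lower().split())

def cosine(u, v):
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = sqrt(sum(c * c for c in u.values())) * sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0

def nearest_label(query, catalog):
    # catalog: list of (description, category) training pairs.
    q = embed(query)
    return max(catalog, key=lambda pair: cosine(q, embed(pair[0])))[1]

catalog = [
    ("organic whole milk 1 liter", "dairy"),
    ("sliced sourdough bread loaf", "bakery"),
    ("fresh whole wheat bread rolls", "bakery"),
]
print(nearest_label("whole milk bottle", catalog))  # prints "dairy"
```

Because the prediction is traced back to a concrete neighboring article, the explanation comes for free, and inference needs no GPU: this is the explainability/resource trade-off the paper weighs against DistilBERT.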
Keywords
- Multi-label text classification
- Representation learning
- Resource awareness
- Explainability
Acknowledgements
This work was funded by the German Federal Ministry of Education and Research within the ML2R project (grant no. 01S18038B).
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Brito, E., Gupta, V., Hahn, E., Giesselbach, S. (2022). Assessing the Performance Gain on Retail Article Categorization at the Expense of Explainability and Resource Efficiency. In: Bergmann, R., Malburg, L., Rodermund, S.C., Timm, I.J. (eds.) KI 2022: Advances in Artificial Intelligence. Lecture Notes in Computer Science, vol. 13404. Springer, Cham. https://doi.org/10.1007/978-3-031-15791-2_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-15790-5
Online ISBN: 978-3-031-15791-2