Extreme Multi-label Classification with Hierarchical Multi-task for Product Attribute Identification

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2022)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13282)

Abstract

Identifying product attributes (product type, brand, color, gender, etc.) from a query is critically important for e-commerce search systems, particularly the identification of brand intent. Recently, Named Entity Recognition (NER) methods have been used to address this problem. However, NER can only identify brand intent that is expressed by terms in the query, and it does not work properly when brand terms are not given explicitly. To overcome this limitation, we propose a novel Extreme Multi-label based hierarchical Multi-tAsk (EMMA) framework, in which we treat brand identification as an extreme multi-label classification problem and develop a deep learning model that jointly learns a query’s product intent and brand intent in a coarse-to-fine manner. Results from both an online A/B test and an offline experiment on a real industrial dataset demonstrate the effectiveness of the proposed framework. The framework could also potentially be extended from e-commerce to other search scenarios.
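The coarse-to-fine idea in the abstract can be illustrated with a toy forward pass. This is a minimal sketch under stated assumptions, not the EMMA architecture itself, which is not described on this page. The hashed bag-of-words encoder, the layer sizes, the random weights, the `brand_weight` parameter, and every function name below are hypothetical choices made for illustration. The sketch shows only the general pattern: a shared query representation feeds a coarse product-type head, and the product-type distribution is passed on to a fine-grained multi-label brand head, with both tasks combined in one loss.

```python
import math
import random

random.seed(0)

# All sizes are toy values chosen for illustration (assumptions).
VOCAB_BUCKETS = 32       # hashed vocabulary size
HIDDEN = 8               # shared representation size
NUM_PRODUCT_TYPES = 4    # coarse label space
NUM_BRANDS = 10          # fine label space; extreme settings have far more labels


def rand_matrix(rows, cols):
    """Small random weights standing in for trained parameters."""
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def bucket(token):
    """Deterministic toy hash of a token into a vocabulary bucket."""
    return sum(ord(ch) for ch in token) % VOCAB_BUCKETS


def encode(query, embedding):
    """Shared encoder: average the embeddings of the query's hashed tokens."""
    tokens = query.lower().split()
    vec = [0.0] * HIDDEN
    for token in tokens:
        row = embedding[bucket(token)]
        vec = [v + r for v, r in zip(vec, row)]
    return [v / max(len(tokens), 1) for v in vec]


def forward(query, params):
    shared = encode(query, params["embedding"])
    # Coarse task: one product type per query (softmax).
    type_probs = softmax(matvec(params["type_head"], shared))
    # Fine task: the brand head sees the shared vector plus the coarse prediction.
    brand_input = shared + type_probs  # list concatenation
    # Multi-label output: an independent sigmoid per brand.
    brand_probs = [sigmoid(s) for s in matvec(params["brand_head"], brand_input)]
    return type_probs, brand_probs


def joint_loss(type_probs, brand_probs, true_type, true_brands, brand_weight=1.0):
    """Cross-entropy on the coarse task plus mean binary cross-entropy on brands."""
    coarse = -math.log(type_probs[true_type])
    fine = -sum(
        math.log(p) if j in true_brands else math.log(1.0 - p)
        for j, p in enumerate(brand_probs)
    ) / len(brand_probs)
    return coarse + brand_weight * fine


params = {
    "embedding": rand_matrix(VOCAB_BUCKETS, HIDDEN),
    "type_head": rand_matrix(NUM_PRODUCT_TYPES, HIDDEN),
    "brand_head": rand_matrix(NUM_BRANDS, HIDDEN + NUM_PRODUCT_TYPES),
}

# The query names no brand; the brand head still scores every brand.
type_probs, brand_probs = forward("running shoes for men", params)
loss = joint_loss(type_probs, brand_probs, true_type=1, true_brands={2, 7})
```

Because each brand gets its own sigmoid, a query can be associated with several brands at once even when no brand term appears in it, which is the case the abstract says NER cannot handle. With untrained weights the scores here are meaningless; only the data flow is the point.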

Author information

Corresponding author

Correspondence to Jun Zhang.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Zhang, J. et al. (2022). Extreme Multi-label Classification with Hierarchical Multi-task for Product Attribute Identification. In: Gama, J., Li, T., Yu, Y., Chen, E., Zheng, Y., Teng, F. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2022. Lecture Notes in Computer Science, vol 13282. Springer, Cham. https://doi.org/10.1007/978-3-031-05981-0_20

  • DOI: https://doi.org/10.1007/978-3-031-05981-0_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-05980-3

  • Online ISBN: 978-3-031-05981-0

  • eBook Packages: Computer Science, Computer Science (R0)