Abstract
With the aim of facilitating internal processes as well as search applications, patent offices categorize documents into taxonomies such as the Cooperative Patent Classification (CPC). This task corresponds to a multi-label hierarchical text classification problem. Recent approaches based on pre-trained neural language models have shown promising performance by focusing on leaf-level label prediction. Prior works using intrinsically hierarchical algorithms, which learn a separate classifier for each node in the hierarchy, have also demonstrated their effectiveness despite being based on symbolic feature inventories. However, training one transformer-based classifier per node is computationally infeasible due to memory constraints. In this work, we propose a Transformer-based Multi-task Model (TMM) overcoming this limitation. Using a multi-task setup and sharing a single underlying language model, we train one classifier per node. To the best of our knowledge, our work constitutes the first approach to patent classification combining transformers and hierarchical algorithms. We outperform several non-neural and neural baselines on the WIPO-alpha dataset as well as on a new dataset of 70k patents, which we publish along with this work. Our analysis reveals that our approach achieves much higher recall while keeping precision high. Strong increases in macro-average scores demonstrate that our model also performs much better for infrequent labels. An extended version of the model with additional connections reflecting the label taxonomy results in a further increase of recall, especially at the lower levels of the hierarchy.
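The core idea of the multi-task setup can be sketched as follows: a single shared document encoder feeds one lightweight binary classifier per node of the label taxonomy, so only the small per-node heads scale with the hierarchy size. This is an illustrative sketch, not the paper's implementation: the class name, `node_ids`, and the random-projection "encoder" standing in for the shared transformer are all assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class MultiTaskHierarchyModel:
    """Sketch: one shared encoder plus one binary head per taxonomy node.

    The shared transformer of the actual TMM is replaced here by a random
    linear projection so the example stays self-contained.
    """

    def __init__(self, input_dim, hidden_dim, node_ids):
        # shared "encoder": a single linear map standing in for the language model
        self.W_enc = rng.normal(size=(input_dim, hidden_dim))
        # one lightweight binary classifier (weight vector + bias) per node
        self.heads = {n: (rng.normal(size=hidden_dim), 0.0) for n in node_ids}

    def forward(self, x):
        # every head consumes the same shared representation
        h = np.tanh(x @ self.W_enc)
        return {n: sigmoid(h @ w + b) for n, (w, b) in self.heads.items()}

# toy taxonomy path (CPC-style section / class / subclass codes, chosen for illustration)
model = MultiTaskHierarchyModel(16, 32, ["A", "A01", "A01B"])
probs = model.forward(rng.normal(size=(4, 16)))  # one probability per node per document
```

Because all heads backpropagate into the same encoder during training, memory grows only with the tiny per-node weight vectors rather than with full per-node transformers, which is what makes one classifier per node feasible.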
Notes
- 4. See WIPO-alpha readme and personal correspondence with authors.
- 9. We double-checked the surprisingly low macro-scores of HARNN-orig and decided to present results of HARNN tuned for macro-performance as well.
Acknowledgements
We thank Mark Giereth and Jona Ruthardt for the fruitful discussions. We are grateful to Patrick Fievet for his support with the WIPO-alpha dataset. We also thank Alexander Müller for sharing his ideas on patent classification, and Lukas Lange, Trung-Kien Tran and the anonymous reviewers for their comments on this paper.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Pujari, S.C., Friedrich, A., Strötgen, J. (2021). A Multi-task Approach to Neural Multi-label Hierarchical Patent Classification Using Transformers. In: Hiemstra, D., Moens, MF., Mothe, J., Perego, R., Potthast, M., Sebastiani, F. (eds) Advances in Information Retrieval. ECIR 2021. Lecture Notes in Computer Science(), vol 12656. Springer, Cham. https://doi.org/10.1007/978-3-030-72113-8_34
Print ISBN: 978-3-030-72112-1
Online ISBN: 978-3-030-72113-8