Optimizing Neural Networks for Patent Classification

Abdelgawad, Louay; Kluegl, Peter; Genc, Erdan; Falkner, Stefan; Hutter, Frank

doi:10.1007/978-3-030-46133-1_41

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11908)

Included in the following conference series:

Joint European Conference on Machine Learning and Knowledge Discovery in Databases

Abstract

A great number of patents are filed every day at patent offices worldwide. Each of these patents has to be labeled by domain experts with one or more of thousands of categories. This process is not only extremely expensive but also overwhelming for the experts, owing to the considerable increase in filed patents over the years and the growing complexity of the hierarchical categorization structure. It is therefore critical to automate the manual classification process with a classification model. In this paper, the task is automated using recent advances in deep learning for NLP, and the results are compared to customized approaches. Moreover, an extensive optimization analysis provides insights into hyperparameter importance. Our optimized convolutional neural network achieves a new state-of-the-art performance of 55.02% accuracy on the public Wipo-Alpha dataset.

Notes

1. https://www.epo.org/applying/basics.html (accessed June 20, 2019).
2. https://www.uspto.gov/web/offices/ac/ido/oeip/taf/us_stat.htm (accessed June 20, 2019).
3. https://keras.io/preprocessing/text/ (accessed June 20, 2019).
4. https://www.wipo.int/classifications/ipc/en/ITsupport/Categorization/dataset/index.html (accessed May 4, 2019).
5. http://www.ifs.tuwien.ac.at/~clef-ip/download/2011/index.shtml (accessed May 4, 2019).
6. Specifications of the machine used: OS: CentOS Linux 7.5, RAM: 32 GB Kingston HyperX Fury DDR4, CPU: Intel Core i7-7700, GPU: MSI GeForce GTX 1080 Ti Gaming X 11G.
7. https://github.com/google-research/bert#sentence-and-sentence-pair-classification-tasks (accessed June 24, 2019).
8. https://github.com/lo2aayy/patent-classification.

Author information

Authors and Affiliations

Averbis GmbH, Freiburg, Germany
Louay Abdelgawad, Peter Kluegl & Erdan Genc
Machine Learning Institute, Albert-Ludwigs University of Freiburg, Freiburg im Breisgau, Germany
Stefan Falkner & Frank Hutter

Corresponding author

Correspondence to Louay Abdelgawad.

Editor information

Editors and Affiliations

Leuphana University, Lüneburg, Germany
Ulf Brefeld
IRISA/Inria, Rennes, France
Elisa Fromont
University of Würzburg, Würzburg, Germany
Andreas Hotho
Leiden University, Leiden, The Netherlands
Arno Knobbe
ETH Zurich, Zurich, Switzerland
Marloes Maathuis
Institut National des Sciences Appliquées, Villeurbanne, France
Céline Robardet

About this paper

Cite this paper

Abdelgawad, L., Kluegl, P., Genc, E., Falkner, S., Hutter, F. (2020). Optimizing Neural Networks for Patent Classification. In: Brefeld, U., Fromont, E., Hotho, A., Knobbe, A., Maathuis, M., Robardet, C. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2019. Lecture Notes in Computer Science, vol 11908. Springer, Cham. https://doi.org/10.1007/978-3-030-46133-1_41

DOI: https://doi.org/10.1007/978-3-030-46133-1_41
Published: 30 April 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-46132-4
Online ISBN: 978-3-030-46133-1
eBook Packages: Computer Science; Computer Science (R0)
