Unveiling Black-Boxes: Explainable Deep Learning Models for Patent Classification

  • Conference paper
Explainable Artificial Intelligence (xAI 2023)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1902)

Abstract

Recent technological advancements have led to a large number of patents across a diverse range of domains, making them challenging for human experts to analyze and manage. State-of-the-art methods for multi-label patent classification rely on deep neural networks (DNNs), which are complex and often considered black boxes because of their opaque decision-making processes. In this paper, we propose a novel deep explainable patent classification framework that introduces layer-wise relevance propagation (LRP) to provide human-understandable explanations for predictions. We train several DNN models, including Bi-LSTM, CNN, and CNN-BiLSTM, and propagate each prediction backward from the output layer to the input layer of the model to identify the relevance of individual words to that prediction. Using the relevance scores, we then generate explanations by visualizing the relevant words for the predicted patent class. Experimental results on two datasets comprising two million patent texts demonstrate high performance across various evaluation measures. The explanations generated for each prediction highlight relevant words that align with the predicted class, making the prediction more understandable. Explainable systems have the potential to facilitate the adoption of complex AI-enabled methods for patent classification in real-world applications.
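To make the backward relevance pass described in the abstract concrete, the following is a minimal sketch of the LRP epsilon rule on a toy two-layer feed-forward classifier. It is not the authors' implementation: the token list, random embeddings, layer sizes, random weights, and the helper name `lrp_epsilon` are all assumptions chosen for illustration, and the paper's actual models (Bi-LSTM, CNN, CNN-BiLSTM) need additional propagation rules that are not shown here.

```python
import numpy as np

def lrp_epsilon(x, w, b, relevance_out, eps=1e-6):
    """Redistribute relevance from a linear layer's outputs to its inputs
    using the LRP epsilon rule: R_i = sum_j x_i * w_ij * R_j / (z_j + eps*sign(z_j))."""
    z = x @ w + b                                  # pre-activations, shape (n_out,)
    z = z + eps * np.where(z >= 0, 1.0, -1.0)      # stabilizer avoids division by zero
    s = relevance_out / z                          # relevance per unit of pre-activation
    return x * (w @ s)                             # shape (n_in,)

# Hypothetical toy setup: random values stand in for trained embeddings and weights.
rng = np.random.default_rng(0)
tokens = ["battery", "electrode", "the", "cell"]
emb_dim, hidden_dim, n_classes = 8, 16, 3

embeddings = rng.normal(size=(len(tokens), emb_dim))
w1 = rng.normal(size=(len(tokens) * emb_dim, hidden_dim))
b1 = np.zeros(hidden_dim)
w2 = rng.normal(size=(hidden_dim, n_classes))
b2 = np.zeros(n_classes)

# Forward pass: flatten word embeddings, one ReLU hidden layer, linear output.
x = embeddings.reshape(-1)
h = np.maximum(0.0, x @ w1 + b1)
logits = h @ w2 + b2
pred = int(np.argmax(logits))

# Backward relevance pass: start with the predicted class score only.
relevance_out = np.zeros(n_classes)
relevance_out[pred] = logits[pred]
relevance_hidden = lrp_epsilon(h, w2, b2, relevance_out)
# Relevance passes through the ReLU unchanged; inactive units already carry zero.
relevance_input = lrp_epsilon(x, w1, b1, relevance_hidden)

# Sum relevance over embedding dimensions to get one score per word.
word_relevance = relevance_input.reshape(len(tokens), emb_dim).sum(axis=1)

for token, score in sorted(zip(tokens, word_relevance), key=lambda p: -p[1]):
    print(f"{token:>10s}  {score:+.4f}")
```

With zero biases, the word-level scores sum approximately to the predicted class score, which is the conservation property that lets the scores be read as each word's share of the prediction. Because the weights here are random, the resulting ranking of words carries no meaning.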

Notes

  1. Dataset: https://huggingface.co/AI-Growth-Lab [42].

  2. Dataset: https://huggingface.co/datasets/ccdv/patent-classification/tree/main.

  3. https://fasttext.cc/docs/en/crawl-vectors.html.

  4. https://github.com/ArrasL/LRP_for_LSTM.

References

  1. Kucer, M., Oyen, D., Castorena, J., Wu, J.: Deeppatent: large scale patent drawing recognition and retrieval. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 2309–2318 (2022)

  2. Li, S., Jie, H., Cui, Y., Jianjun, H.: Deeppatent: patent classification with convolutional neural networks and word embedding. Scientometrics 117, 721–744 (2018)

  3. Lee, J.-S., Hsiang, J.: Patent classification by fine-tuning Bert language model. World Patent Inf. 61, 101965 (2020)

  4. D’hondt, E., Verberne, S., Koster, C., Boves, L.: Text representations for patent classification. Comput. Linguist. 39(3), 755–775 (2013)

  5. Luo, M., Shi, X., Ji, Q., Shang, M., He, X., Tao, W.: A deep self-learning classification framework for incomplete medical patents with multi-label. In: Liu, Y., Wang, L., Zhao, L., Yu, Z. (eds.) ICNC-FSKD 2019. AISC, vol. 1075, pp. 566–573. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-32591-6_61

  6. Jiang, S., Hu, J., Magee, C.L., Luo, J.: Deep learning for technical document classification. IEEE Trans. Eng. Manag. (2022)

  7. Chen, L., Shuo, X., Zhu, L., Zhang, J., Lei, X., Yang, G.: A deep learning based method for extracting semantic information from patent documents. Scientometrics 125, 289–312 (2020)

  8. Fang, L., Zhang, L., Han, W., Tong, X., Zhou, D., Chen, E.: Patent2vec: multi-view representation learning on patent-graphs for patent classification. World Wide Web 24(5), 1791–1812 (2021)

  9. Haghighian Roudsari, A., Afshar, J., Lee, W., Lee, S.: Patentnet: multi-label classification of patent documents using deep learning based language understanding. Scientometrics, pp. 1–25 (2022)

  10. Shrikumar, A., Greenside, P., Kundaje, A.: Learning important features through propagating activation differences. In: International Conference on Machine Learning, pp. 3145–3153. PMLR (2017)

  11. Lundberg, S.M., Lee, S.-I.: A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017)

  12. Bach, S., Binder, A., Montavon, G., Klauschen, F., Müller, K.-R., Samek, W.: On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PLoS ONE 10(7), e0130140 (2015)

  13. Kute, D.V., Pradhan, B., Shukla, N., Alamri, A.: Deep learning and explainable artificial intelligence techniques applied for detecting money laundering-a critical review. IEEE Access 9, 82300–82317 (2021)

  14. Shajalal, M., Boden, A., Stevens, G.: Explainable product backorder prediction exploiting CNN: introducing explainable models in businesses. Electron. Mark. 32, 2107–2122 (2022)

  15. Yang, G., Ye, Q., Xia, J.: Unbox the black-box for the medical explainable AI via multi-modal and multi-centre data fusion: a mini-review, two showcases and beyond. Inf. Fusion 77, 29–52 (2022)

  16. Adadi, A., Berrada, M.: Explainable AI for healthcare: from black box to interpretable models. In: Bhateja, V., Satapathy, S.C., Satori, H. (eds.) Embedded Systems and Artificial Intelligence. AISC, vol. 1076, pp. 327–337. Springer, Singapore (2020). https://doi.org/10.1007/978-981-15-0947-6_31

  17. Ribeiro, M.T., Singh, S., Guestrin, C.: Why should I trust you? Explaining the predictions of any classifier. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1135–1144 (2016)

  18. Binder, A., Montavon, G., Lapuschkin, S., Müller, K.-R., Samek, W.: Layer-wise relevance propagation for neural networks with local renormalization layers. In: Villa, A.E.P., Masulli, P., Pons Rivero, A.J. (eds.) ICANN 2016. LNCS, vol. 9887, pp. 63–71. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-44781-0_8

  19. Shalaby, M., Stutzki, J., Schubert, M., Günnemann, S.: An LSTM approach to patent classification based on fixed hierarchy vectors. In: Proceedings of the 2018 SIAM International Conference on Data Mining, pp. 495–503. SIAM (2018)

  20. Roudsari, A.H., Afshar, J., Lee, C.C., Lee, W.: Multi-label patent classification using attention-aware deep learning model. In: 2020 IEEE International Conference on Big Data and Smart Computing (BigComp), pp. 558–559. IEEE (2020)

  21. Le, Q., Mikolov, T.: Distributed representations of sentences and documents. In: International Conference on Machine Learning, pp. 1188–1196. PMLR (2014)

  22. Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Adv. Neural Inf. Process. Syst. 26, 3111–3119 (2013)

  23. Pennington, J., Socher, R., Manning, C.D.: Glove: global vectors for word representation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1532–1543 (2014)

  24. Bojanowski, P., Grave, E., Joulin, A., Mikolov, T.: Enriching word vectors with subword information. Trans. Assoc. Comput. Linguist. 5, 135–146 (2017)

  25. Shajalal, M., Aono, M.: Sentence-level semantic textual similarity using word-level semantics. In: 2018 10th International Conference on Electrical and Computer Engineering (ICECE), pp. 113–116. IEEE (2018)

  26. Shajalal, Md., Aono, M.: Semantic textual similarity between sentences using bilingual word semantics. Prog. Artif. Intell. 8, 263–272 (2019)

  27. Shajalal, Md., Aono, M.: Coverage-based query subtopic diversification leveraging semantic relevance. Knowl. Inf. Syst. 62, 2873–2891 (2020)

  28. Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)

  29. Liu, Y., et al.: Roberta: a robustly optimized Bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019)

  30. Sanh, V., Debut, L., Chaumond, J., Wolf, T.: DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019)

  31. Yang, Z., Dai, Z., Yang, Y., Carbonell, J., Salakhutdinov, R.R., Le, Q.V.: XLNet: generalized autoregressive pretraining for language understanding. Adv. Neural Inf. Process. Syst. 32, 5753–5763 (2019)

  32. Clark, K., Luong, M.T., Le, Q.V., Manning, C.D.: Electra: pre-training text encoders as discriminators rather than generators. arXiv preprint arXiv:2003.10555 (2020)

  33. Kang, M., Lee, S., Lee, W.: Prior art search using multi-modal embedding of patent documents. In: 2020 IEEE International Conference on Big Data and Smart Computing (BigComp), pp. 548–550. IEEE (2020)

  34. Pujari, S.C., Friedrich, A., Strötgen, J.: A multi-task approach to neural multi-label hierarchical patent classification using transformers. In: Hiemstra, D., Moens, M.-F., Mothe, J., Perego, R., Potthast, M., Sebastiani, F. (eds.) ECIR 2021. LNCS, vol. 12656, pp. 513–528. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-72113-8_34

  35. Aroyehun, S.T., Angel, J., Majumder, N., Gelbukh, A., Hussain, A.: Leveraging label hierarchy using transfer and multi-task learning: a case study on patent classification. Neurocomputing 464, 421–431 (2021)

  36. Roudsari, A.H., Afshar, J., Lee, S., Lee, W.: Comparison and analysis of embedding methods for patent documents. In: 2021 IEEE International Conference on Big Data and Smart Computing (BigComp), pp. 152–155. IEEE (2021)

  37. Li, H., Li, S., Jiang, Y., Zhao, G.: CoPatE: a novel contrastive learning framework for patent embeddings. In: Proceedings of the 31st ACM International Conference on Information & Knowledge Management, pp. 1104–1113 (2022)

  38. Kamateri, E., Stamatis, V., Diamantaras, K., Salampasis, M.: Automated single-label patent classification using ensemble classifiers. In: 2022 14th International Conference on Machine Learning and Computing (ICMLC), pp. 324–330 (2022)

  39. Arras, L., Horn, F., Montavon, G., Müller, K.-R., Samek, W.: What is relevant in a text document?: an interpretable machine learning approach. PLoS ONE 12(8), e0181142 (2017)

  40. Arras, L., Montavon, G., Müller, K.R., Samek, W.: Explaining recurrent neural network predictions in sentiment analysis. arXiv preprint arXiv:1706.07206 (2017)

  41. Karim, M.R., et al.: Deephateexplainer: explainable hate speech detection in under-resourced Bengali language. In: 2021 IEEE 8th International Conference on Data Science and Advanced Analytics (DSAA), pp. 1–10. IEEE (2021)

  42. Bekamiri, H., Hain, D.S., Jurowetzki, R.: Patentsberta: a deep NLP based hybrid model for patent distance and classification using augmented SBERT. arXiv preprint arXiv:2103.11933 (2021)

  43. Sharma, E., Li, C., Wang, L.: BIGPATENT: a large-scale dataset for abstractive and coherent summarization. arXiv preprint arXiv:1906.03741 (2019)

Acknowledgment

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 955422.

Corresponding author

Correspondence to Md Shajalal.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Shajalal, M., Denef, S., Karim, M.R., Boden, A., Stevens, G. (2023). Unveiling Black-Boxes: Explainable Deep Learning Models for Patent Classification. In: Longo, L. (eds) Explainable Artificial Intelligence. xAI 2023. Communications in Computer and Information Science, vol 1902. Springer, Cham. https://doi.org/10.1007/978-3-031-44067-0_24

  • DOI: https://doi.org/10.1007/978-3-031-44067-0_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-44066-3

  • Online ISBN: 978-3-031-44067-0

  • eBook Packages: Computer Science, Computer Science (R0)
