Abstract
Recent technological advancements have produced a large number of patents across a diverse range of domains, making them challenging for human experts to analyze and manage. State-of-the-art methods for multi-label patent classification rely on deep neural networks (DNNs), which are complex and often regarded as black boxes because their decision-making processes are opaque. In this paper, we propose a novel deep explainable patent classification framework that introduces layer-wise relevance propagation (LRP) to provide human-understandable explanations for predictions. We train several DNN models, including Bi-LSTM, CNN, and CNN-BiLSTM, and propagate each prediction backward from the output layer to the input layer of the model to identify the relevance of individual words to that prediction. Using the resulting relevance scores, we then generate explanations by visualizing the relevant words for the predicted patent class. Experimental results on two datasets comprising two million patent texts demonstrate high performance across various evaluation measures. The explanations generated for each prediction highlight relevant words that align with the predicted class, making the prediction more understandable. Explainable systems have the potential to facilitate the adoption of complex AI-enabled methods for patent classification in real-world applications.
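The backward relevance propagation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a toy feed-forward classifier over summed word embeddings rather than a Bi-LSTM or CNN, uses the epsilon-stabilized LRP rule, and all names (lrp_word_relevance, stabilize, the weight matrices, the example words) are hypothetical.

```python
import numpy as np


def stabilize(z, eps=1e-6):
    """Push values away from zero so the LRP division is safe."""
    return z + eps * np.where(z >= 0, 1.0, -1.0)


def lrp_word_relevance(E, W1, b1, W2, b2, target=None):
    """Epsilon-rule LRP for a toy model (an assumption, not the paper's DNNs).

    E:  (T, d) word embeddings of one document
    W1: (d, h), b1: (h,)  hidden layer applied to the summed embeddings
    W2: (h, c), b2: (c,)  output layer producing class logits
    Returns the explained class, one relevance score per word, and the logits.
    """
    # Forward pass: sum-pool the words, one ReLU hidden layer, linear output.
    z1 = E.sum(axis=0) @ W1 + b1
    a1 = np.maximum(z1, 0.0)
    z2 = a1 @ W2 + b2
    if target is None:
        target = int(np.argmax(z2))

    # Start from the logit of the class being explained.
    R2 = np.zeros_like(z2)
    R2[target] = z2[target]

    # Output layer -> hidden layer.
    R1 = a1 * (W2 @ (R2 / stabilize(z2)))

    # Hidden layer -> every embedding dimension of every word.
    R0 = E * (W1 @ (R1 / stabilize(z1)))

    # Collapse embedding dimensions to obtain one score per word.
    return target, R0.sum(axis=1), z2


# Hypothetical example: random weights stand in for a trained model.
rng = np.random.default_rng(0)
words = ["battery", "electrode", "the", "lithium", "of"]
E = rng.normal(size=(len(words), 8))
W1, b1 = rng.normal(size=(8, 16)), np.zeros(16)
W2, b2 = rng.normal(size=(16, 3)), np.zeros(3)

cls, scores, logits = lrp_word_relevance(E, W1, b1, W2, b2)
for word, score in sorted(zip(words, scores), key=lambda p: -abs(p[1])):
    print(f"{word:10s} {score:+.3f}")
```

With zero biases, the word scores sum to approximately the logit of the explained class, which is the conservation property LRP is built around. Words with large positive scores are the ones a visualization would highlight for the predicted class.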
Acknowledgment
This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 955422.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Shajalal, M., Denef, S., Karim, M.R., Boden, A., Stevens, G. (2023). Unveiling Black-Boxes: Explainable Deep Learning Models for Patent Classification. In: Longo, L. (eds) Explainable Artificial Intelligence. xAI 2023. Communications in Computer and Information Science, vol 1902. Springer, Cham. https://doi.org/10.1007/978-3-031-44067-0_24
DOI: https://doi.org/10.1007/978-3-031-44067-0_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-44066-3
Online ISBN: 978-3-031-44067-0
eBook Packages: Computer Science, Computer Science (R0)