Abstract
This research presents a new algorithm for extracting key information from shopping receipts. Specifically, we learn semantic, visual, and structural features with three separate deep learning methods and formulate rule features according to the characteristics of shopping receipts. We then propose a multi-class text classification algorithm based on these multi-modal features using Bayesian deep learning. Post-processing the output of the classification algorithm yields the key information we seek. Our algorithm was trained on a self-labeled Chinese shopping receipt dataset and compared with several baseline methods. Extensive experimental results demonstrate that the proposed method achieves the best results among the compared methods on our Chinese receipt dataset.
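The pipeline described in the abstract (per-modality features, fusion, Bayesian multi-class classification) can be illustrated with a minimal sketch. This is not the authors' architecture: the abstract does not specify the fusion scheme, the network, or the label set. The sketch assumes concatenation as the fusion step, a single linear layer with a factorized Gaussian distribution over its weights, and Monte Carlo sampling for prediction. The names `fuse_features`, `sample_weights`, `predict`, and `LABELS`, along with all numeric values, are hypothetical.

```python
import math
import random

# Hypothetical label set; the abstract does not list the actual classes.
LABELS = ["store_name", "date", "total", "other"]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_features(semantic, visual, structural, rule):
    """Concatenate per-modality feature vectors (an assumed fusion scheme)."""
    return list(semantic) + list(visual) + list(structural) + list(rule)


def sample_weights(mu, sigma, rng):
    """Draw one weight matrix from a factorized Gaussian over the weights."""
    return [[rng.gauss(m, s) for m, s in zip(mu_row, sigma_row)]
            for mu_row, sigma_row in zip(mu, sigma)]


def predict(features, mu, sigma, n_samples=50, seed=0):
    """Average class probabilities over sampled weights.

    Returns the predicted class index, the averaged probabilities, and the
    entropy of the averaged distribution as a simple uncertainty score.
    """
    rng = random.Random(seed)
    n_classes = len(mu)
    mean_probs = [0.0] * n_classes
    for _ in range(n_samples):
        weights = sample_weights(mu, sigma, rng)
        logits = [sum(w * x for w, x in zip(row, features)) for row in weights]
        probs = softmax(logits)
        mean_probs = [acc + p / n_samples for acc, p in zip(mean_probs, probs)]
    entropy = -sum(p * math.log(p) for p in mean_probs if p > 0)
    label = max(range(n_classes), key=lambda k: mean_probs[k])
    return label, mean_probs, entropy


# Toy feature vectors for one text line of a receipt (made-up values).
features = fuse_features(
    semantic=[0.2, 0.9], visual=[0.1], structural=[0.5, 0.5], rule=[1.0]
)

# Made-up weight means (one row per class) and standard deviations.
mu = [
    [0.5, 0.1, 0.0, 0.2, 0.2, -1.0],
    [0.0, 0.3, 0.4, 0.1, 0.0, -0.5],
    [0.1, 1.2, 0.0, 0.3, 0.3, 2.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
]
sigma = [[0.1] * len(row) for row in mu]

label, probs, entropy = predict(features, mu, sigma)
print(LABELS[label], [round(p, 3) for p in probs], round(entropy, 3))
```

In a trained model the weight means and standard deviations would be learned from labeled receipts rather than set by hand. The entropy of the averaged prediction is one simple way to flag text lines whose class is uncertain before post-processing.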
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China under Grant nos. 61972112 and 61832004, the Guangdong Basic and Applied Basic Research Foundation under Grant no. 2021B1515020088, the Shenzhen Science and Technology Program under Grant no. JCYJ20210324131203009, and the HITSZ-J&A Joint Laboratory of Digital Design and Intelligent Fabrication under Grant no. HITSZ-J&A-2021A01.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Chen, J. et al. (2022). Extracting Key Information from Shopping Receipts by Using Bayesian Deep Learning via Multi-modal Features. In: Zhang, H., et al. Neural Computing for Advanced Applications. NCAA 2022. Communications in Computer and Information Science, vol 1637. Springer, Singapore. https://doi.org/10.1007/978-981-19-6142-7_29
DOI: https://doi.org/10.1007/978-981-19-6142-7_29
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-6141-0
Online ISBN: 978-981-19-6142-7
eBook Packages: Computer Science, Computer Science (R0)