Abstract
Table structure recognition is an important task in document analysis and has attracted the attention of many researchers. However, due to the diversity of table types and the complexity of table structures, the performance of table structure recognition methods is still not good enough in practice. Row and column separators play a significant role in two-stage table structure recognition, and better row and column separator segmentation can improve the final recognition results. Therefore, in this paper, we present a novel deep learning model to detect row and column separators. The model contains a convolutional encoder and two parallel decoders, one for rows and one for columns. The encoder extracts visual features using convolution blocks; each decoder formulates the feature map as a sequence and uses a sequence labeling model, a bidirectional long short-term memory network (BiLSTM), to detect row and column separators. Experiments have been conducted on PubTabNet, and the model is benchmarked on several available datasets, including PubTabNet, UNLV, ICDAR13, and ICDAR19. The results show that our model achieves state-of-the-art performance compared with other strong models. In addition, our model shows better generalization ability. The code is available at www.github.com/L597383845/row-col-table-recognition.
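As a rough illustration of the sequence formulation described in the abstract, the following sketch reads a toy 2D feature map once as a sequence of rows and once as a sequence of columns, labels each element as separator or content, and groups the labels into separator spans. It is not the authors' implementation: the function names, the toy map, and the simple "no activation" rule in `label_sequence` are assumptions made for illustration, with that rule standing in for the learned BiLSTM labeler.

```python
from typing import List, Optional, Tuple

Grid = List[List[float]]


def as_row_sequence(fmap: Grid) -> List[List[float]]:
    """One sequence element per row of the map (input to a row decoder)."""
    return [list(row) for row in fmap]


def as_col_sequence(fmap: Grid) -> List[List[float]]:
    """One sequence element per column of the map (input to a column decoder)."""
    return [list(col) for col in zip(*fmap)]


def label_sequence(seq: List[List[float]], threshold: float = 0.0) -> List[int]:
    """Hypothetical stand-in for a learned labeler (NOT a BiLSTM).

    An element is labeled 1 (separator) when none of its values exceeds
    the threshold, and 0 (content) otherwise.
    """
    return [1 if max(step) <= threshold else 0 for step in seq]


def runs(labels: List[int]) -> List[Tuple[int, int]]:
    """Group consecutive 1-labels into half-open [start, end) spans."""
    spans: List[Tuple[int, int]] = []
    start: Optional[int] = None
    for i, lab in enumerate(labels):
        if lab == 1 and start is None:
            start = i
        elif lab != 1 and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(labels)))
    return spans


# Toy 5x6 "feature map": non-zero values mark cell content (an assumption).
toy: Grid = [
    [1, 1, 0, 0, 1, 1],
    [1, 1, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 1, 1],
    [1, 1, 0, 0, 0, 1],
]

row_labels = label_sequence(as_row_sequence(toy))
col_labels = label_sequence(as_col_sequence(toy))
row_separators = runs(row_labels)
col_separators = runs(col_labels)
```

In this sketch the row and column passes are independent, mirroring the two parallel decoders; a learned sequence model would replace `label_sequence` so that each label can depend on context from both directions along the sequence.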
Acknowledgement
This work is supported by the National Key R&D Program of China (2019YFB1406303) and the National Natural Science Foundation of China (No. 61876003). It is also a research achievement of the Key Laboratory of Science, Technology and Standard in Press Industry (Key Laboratory of Intelligent Press Media Technology).
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Li, Y. et al. (2021). Rethinking Table Structure Recognition Using Sequence Labeling Methods. In: Lladós, J., Lopresti, D., Uchida, S. (eds) Document Analysis and Recognition – ICDAR 2021. ICDAR 2021. Lecture Notes in Computer Science, vol 12822. Springer, Cham. https://doi.org/10.1007/978-3-030-86331-9_35
DOI: https://doi.org/10.1007/978-3-030-86331-9_35
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86330-2
Online ISBN: 978-3-030-86331-9
eBook Packages: Computer Science, Computer Science (R0)