Abstract
Table detection and structure recognition in archival document images remain challenging because of diverse table structures, complex document layouts, degraded image quality, and inconsistent table scales. In this paper, we propose an instance-segmentation-based approach to archival table structure recognition that uses both foreground cell content and background ruling-line information. To reduce the influence of inconsistent table scales, we design an adaptive image scaling method based on the average cell size and the density of ruling lines in each document image. Unlike previous multi-scale training and testing approaches, which usually slow down the whole system, our adaptive scaling resizes each image to a single optimal size, which not only improves overall model performance but also reduces memory and computing overhead on average. Extensive experiments on the cTDaR 2019 Archival dataset show that our method outperforms the baselines and achieves new state-of-the-art performance, demonstrating the effectiveness of the proposed approach.
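The idea of choosing a single per-image scale from cell size and ruling-line density can be illustrated with a minimal sketch. It is not the authors' method: the abstract gives no formula, so the combination rule, the `ScalePolicy` thresholds, and the function name `adaptive_scale` are all hypothetical, and the cell boxes and ruling-line count are assumed to be already estimated by some earlier step.

```python
import math
from dataclasses import dataclass
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


@dataclass(frozen=True)
class ScalePolicy:
    # Every value here is an illustrative assumption, not taken from the paper.
    target_cell_size: float = 32.0  # desired average cell side length after scaling
    max_line_density: float = 0.05  # ruling lines per pixel tolerated after scaling
    min_scale: float = 0.25         # lower clamp on the scale factor
    max_scale: float = 2.0          # upper clamp on the scale factor


def adaptive_scale(
    cell_boxes: Sequence[Box],
    num_ruling_lines: int,
    image_size: Tuple[int, int],
    policy: ScalePolicy = ScalePolicy(),
) -> float:
    """Return one scale factor for the whole image (hypothetical rule)."""
    if not cell_boxes:
        return 1.0  # nothing to measure, so leave the image unchanged

    # Average cell size, measured as the geometric mean of width and height.
    sizes = [math.sqrt((x1 - x0) * (y1 - y0)) for x0, y0, x1, y1 in cell_boxes]
    avg_cell = sum(sizes) / len(sizes)
    scale_from_cells = policy.target_cell_size / avg_cell

    # Ruling-line density in lines per pixel of image extent. Scaling by s
    # divides this density by s, so s must be at least density / max_density.
    width, height = image_size
    density = num_ruling_lines / (width + height)
    scale_from_lines = density / policy.max_line_density

    scale = max(scale_from_cells, scale_from_lines)
    return min(max(scale, policy.min_scale), policy.max_scale)


def scaled_size(image_size: Tuple[int, int], scale: float) -> Tuple[int, int]:
    """Apply the scale factor to a (width, height) pair."""
    width, height = image_size
    return round(width * scale), round(height * scale)
```

Under these assumed thresholds, a page with large cells is downscaled (saving memory and compute) and a page with tiny cells is upscaled up to the clamp, which mirrors the single-size behaviour the abstract contrasts with multi-scale testing.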
Acknowledgments
This work has been supported by the National Key Research and Development Program under Grant 2020AAA0109702 and by the National Natural Science Foundation of China (NSFC) under Grants 61733007 and 61721004.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Li, XH., Yin, F., Zhang, XY., Liu, CL. (2021). Adaptive Scaling for Archival Table Structure Recognition. In: Lladós, J., Lopresti, D., Uchida, S. (eds) Document Analysis and Recognition – ICDAR 2021. ICDAR 2021. Lecture Notes in Computer Science, vol 12821. Springer, Cham. https://doi.org/10.1007/978-3-030-86549-8_6
DOI: https://doi.org/10.1007/978-3-030-86549-8_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86548-1
Online ISBN: 978-3-030-86549-8
eBook Packages: Computer Science (R0)