Adaptive Scaling for Archival Table Structure Recognition

  • Conference paper
Document Analysis and Recognition – ICDAR 2021 (ICDAR 2021)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12821)

Abstract

Table detection and structure recognition in archival document images remain challenging because of diverse table structures, complex document layouts, degraded image quality, and inconsistent table scales. In this paper, we propose an instance-segmentation-based approach to archival table structure recognition that uses both foreground cell content and background ruling-line information. To reduce the influence of inconsistent table scales, we design an adaptive image scaling method based on the average cell size and the density of ruling lines in each document image. Unlike previous multi-scale training and testing approaches, which usually slow down the whole system, our adaptive scaling resizes each image to a single optimal size, which not only improves overall model performance but also reduces memory and computing overhead on average. Extensive experiments on the cTDaR 2019 Archival dataset show that our method outperforms the baselines and achieves new state-of-the-art performance, demonstrating the effectiveness of the proposed method.
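
The abstract does not give the scaling rule itself, so the following is only a minimal sketch of how a single per-image resize factor could be derived from the two quantities it names, average cell size and ruling-line density. The function names, the target cell size, the density cap, and the clamping bounds are all hypothetical choices made for illustration and are not taken from the paper.

```python
def adaptive_scale(avg_cell_size, line_density,
                   target_cell_size=32.0, max_line_density=0.1,
                   min_scale=0.25, max_scale=4.0):
    """Return one resize factor for an image (illustrative rule, not the paper's).

    avg_cell_size: mean cell side length in pixels at the original resolution.
    line_density:  ruling lines per pixel along one axis at the original resolution.
    All default values are assumptions chosen for this sketch.
    """
    if avg_cell_size <= 0:
        raise ValueError("avg_cell_size must be positive")
    # Choose the factor that maps the average cell onto the target size.
    scale = target_cell_size / avg_cell_size
    # Resizing by s divides lines-per-pixel by s, so enlarge further
    # if the ruling lines would otherwise be packed too tightly.
    if line_density > 0:
        scale = max(scale, line_density / max_line_density)
    # Clamp to keep memory use and blur within sane bounds.
    return min(max(scale, min_scale), max_scale)


def scaled_size(width, height, scale):
    """Apply the factor to image dimensions, keeping each side at least 1 px."""
    return max(1, round(width * scale)), max(1, round(height * scale))


if __name__ == "__main__":
    s = adaptive_scale(avg_cell_size=64, line_density=0.01)
    print(s, scaled_size(2000, 3000, s))
```

Under these assumptions, each image is resized once before inference, in contrast to multi-scale testing, which runs the model at several sizes per image.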

Notes

  1. https://pytorch.org/get-started/locally/

Acknowledgments

This work has been supported by the National Key Research and Development Program Grant 2020AAA0109702, the National Natural Science Foundation of China (NSFC) grants 61733007, 61721004.

Corresponding author

Correspondence to Cheng-Lin Liu.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Li, XH., Yin, F., Zhang, XY., Liu, CL. (2021). Adaptive Scaling for Archival Table Structure Recognition. In: Lladós, J., Lopresti, D., Uchida, S. (eds) Document Analysis and Recognition – ICDAR 2021. ICDAR 2021. Lecture Notes in Computer Science, vol 12821. Springer, Cham. https://doi.org/10.1007/978-3-030-86549-8_6

  • DOI: https://doi.org/10.1007/978-3-030-86549-8_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-86548-1

  • Online ISBN: 978-3-030-86549-8

  • eBook Packages: Computer Science (R0)
