Digital Line Segment Detection for Table Reconstruction in Document Images

Part of the Lecture Notes in Computer Science book series (LNCS,volume 13232)

Abstract

Table detection is involved in many document analysis applications, as tables are frequently used to present structured information. In this context, we are interested in extracting table regions from document images. More precisely, we propose a table detection method based on a recent edge line detector, developed in the context of digital geometry, that can handle noisy document images. The extracted lines are then used to reconstruct the tables contained in the image. The method has been evaluated and compared with other state-of-the-art methods and has shown very competitive results.

Keywords

  • Line detector
  • Blurred segment
  • Adaptive directional scan
  • Materialized table extraction
  • Digital geometry

Notes

  1. A similar line model has been used in [14] for line verification and table extraction. Their model separates the 1D intensity profile, read from left to right, into three zones: the intensity first increases, then (potentially) stabilizes, and finally decreases.
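As a rough illustration of the three-zone profile described in this note, the following minimal sketch checks whether a 1D intensity profile rises, optionally plateaus, and then falls. It is not the implementation from [14] or from this paper; the function name `is_rise_plateau_fall` and the tolerance parameter `tol` are hypothetical choices made for the example.

```python
from typing import Sequence


def is_rise_plateau_fall(profile: Sequence[float], tol: float = 0.0) -> bool:
    """Check a 1D profile for the pattern: increase, optional plateau, decrease.

    `tol` (an assumed parameter) is the largest step between neighbouring
    samples that is still treated as "stable".
    """
    n = len(profile)
    if n < 3:
        return False

    i = 0
    # Zone 1: intensity increases.
    while i + 1 < n and profile[i + 1] - profile[i] > tol:
        i += 1
    if i == 0:
        return False  # the profile never started to increase

    # Zone 2: intensity (potentially) stabilizes.
    while i + 1 < n and abs(profile[i + 1] - profile[i]) <= tol:
        i += 1

    # Zone 3: intensity decreases until the end of the profile.
    fall_start = i
    while i + 1 < n and profile[i] - profile[i + 1] > tol:
        i += 1

    return i > fall_start and i == n - 1
```

For instance, a profile such as `[10, 50, 200, 200, 60, 10]` satisfies the pattern, while `[10, 50, 20, 80, 10]` does not because the intensity rises again after it has started to fall.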

References

  1. ICDAR 2013 table competition dataset. https://www.tamirhassan.com/html/competition.html

  2. Marmot dataset. https://www.icst.pku.edu.cn/cpdp/sjzy/index.htm

  3. OpenCV: Open computer vision. https://opencv.org/

  4. UNLV table dataset. https://github.com/tesseract-ocr/

  5. Arif, S., Shafait, F.: Table detection in document images using foreground and background features. In: 2018 Digital Image Computing: Techniques and Applications (DICTA), pp. 1–8 (2018). https://doi.org/10.1109/DICTA.2018.8615795

  6. Cesarini, F., Marinai, S., Sarti, L., Soda, G.: Trainable table location in document images. In: 2002 International Conference on Pattern Recognition, vol. 3, pp. 236–240 (2002). https://doi.org/10.1109/ICPR.2002.1047838

  7. Even, P., Ngo, P., Kerautret, B.: Thick line segment detection with fast directional tracking. In: Ricci, E., Rota Bulò, S., Snoek, C., Lanz, O., Messelodi, S., Sebe, N. (eds.) ICIAP 2019. LNCS, vol. 11752, pp. 159–170. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-30645-8_15

  8. Fang, J., Tao, X., Tang, Z., Qiu, R., Liu, Y.: Dataset, ground-truth and performance metrics for table detection evaluation. In: 2012 10th IAPR International Workshop on Document Analysis Systems, pp. 445–449 (2012). https://doi.org/10.1109/DAS.2012.29

  9. Farrukh, W., et al.: Interpreting data from scanned tables. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 2, pp. 5–6 (2017). https://doi.org/10.1109/ICDAR.2017.250

  10. Gatos, B., Danatsas, D., Pratikakis, I., Perantonis, S.: Automatic table detection in document images. vol. 3686, pp. 609–618 (08 2005). https://doi.org/10.1007/11551188_67

  11. Gilani, A., Qasim, S.R., Malik, I., Shafait, F.: Table detection using deep learning. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 1, pp. 771–776 (2017). https://doi.org/10.1109/ICDAR.2017.131

  12. Green, E., Krishnamoorthy, M.: Model-based analysis of printed tables, pp. 214–217, January 1995. https://doi.org/10.1109/ICDAR.1995.598979

  13. Göbel, M., Hassan, T., Oro, E., Orsi, G.: ICDAR 2013 table competition. In: 2013 12th International Conference on Document Analysis and Recognition, pp. 1449–1453 (2013). https://doi.org/10.1109/ICDAR.2013.292

  14. Alhéritière, H., Amaïeur, W., Cloppet, F., Kurtz, C., Ogier, J.-M., Vincent, N.: Straight line reconstruction for fully materialized table extraction in degraded document images. In: Couprie, M., Cousty, J., Kenmochi, Y., Mustafa, N. (eds.) DGCI 2019. LNCS, vol. 11414, pp. 317–329. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-14085-4_25

  15. Hu, J., Kashi, R., Lopresti, D., Wilfong, G.: Medium-independent table detection, December 1999

  16. Debled-Rennesson, I., Feschet, F., Rouyer-Degli, J.: Optimal blurred segments decomposition of noisy shapes in linear time. Comput. Graph. 30(1), 30–36 (2006)

  17. Kerautret, B., Even, P.: Blurred segments in gray level images for interactive line extraction. In: Wiederhold, P., Barneva, R.P. (eds.) IWCIA 2009. LNCS, vol. 5852, pp. 176–186. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-10210-3_14

  18. Kieninger, T.: Table structure recognition based on robust block segmentation, pp. 22–32 (1998)

  19. Kieninger, T., Dengel, A.: An approach towards benchmarking of table structure recognition results, pp. 1232–1236, August 2005. https://doi.org/10.1109/ICDAR.2005.47

  20. Klette, R., Rosenfeld, A.: Digital Geometry - Geometric Methods for Digital Picture Analysis. Morgan Kaufmann, San Francisco (2004)

  21. Mandal, S., Chowdhury, S., Das, A., Chanda, B.: Simple and effective table detection system from document images. Int. J. Doc. Anal. Recogn. 8, 172–182 (2006). https://doi.org/10.1007/s10032-005-0006-5

  22. Li, M., Cui, L., Huang, S., Wei, F., Zhou, M., Li, Z.: TableBank: a benchmark dataset for table detection and recognition (2019)

  23. Prasad, D., Gadpal, A., Kapadni, K., Visave, M., Sultanpure, K.: Cascade TabNet: an approach for end to end table detection and structure recognition from image-based documents (2020)

  24. Reveillès, J.P.: Géométrie discrète, calcul en nombres entiers et algorithmique. Thèse d’état, Université Strasbourg 1 (1991)

  25. Shafait, F., Smith, R.: Table detection in heterogeneous documents. In: DAS 2010, pp. 65–72. Association for Computing Machinery, New York, NY, USA (2010). https://doi.org/10.1145/1815330.1815339

  26. Shahab, A., Shafait, F., Kieninger, T., Dengel, A.: An open approach towards the benchmarking of table structure recognition systems. In: Proceedings of the 9th IAPR International Workshop on Document Analysis Systems, DAS 2010, pp. 113–120. Association for Computing Machinery, New York, NY, USA (2010). https://doi.org/10.1145/1815330.1815345

  27. Watanabe, T., Naruse, H., Luo, Q., Sugie, N.: Structure analysis of table-form documents on the basis of the recognition of vertical and horizontal line segments. In: Proceedings of International Conference on Document Analysis and Recognition (ICDAR 1991), pp. 638–646 (1991)

  28. Kieninger, T., Dengel, A.: The T-Recs table recognition and analysis system. In: Lee, S.-W., Nakano, Y. (eds.) DAS 1998. LNCS, vol. 1655, pp. 255–270. Springer, Heidelberg (1999). https://doi.org/10.1007/3-540-48172-9_21

Corresponding author

Correspondence to Phuc Ngo.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Ngo, P. (2022). Digital Line Segment Detection for Table Reconstruction in Document Images. In: Sclaroff, S., Distante, C., Leo, M., Farinella, G.M., Tombari, F. (eds) Image Analysis and Processing – ICIAP 2022. ICIAP 2022. Lecture Notes in Computer Science, vol 13232. Springer, Cham. https://doi.org/10.1007/978-3-031-06430-2_18

  • DOI: https://doi.org/10.1007/978-3-031-06430-2_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-06429-6

  • Online ISBN: 978-3-031-06430-2

  • eBook Packages: Computer Science, Computer Science (R0)