Lumped Approach to Recognize Types of Construction Defect from Text with Hand-Drawn Circles

  • Conference paper
Advances in Information Technology in Civil and Building Engineering (ICCCBE 2022)

Part of the book series: Lecture Notes in Civil Engineering ((LNCE,volume 357))

Abstract

This study aims to improve the performance of optical character recognition (OCR), particularly in identifying printed Korean text marked by hand-drawn circles in images of construction defect tags. Despite advances in mobile technologies, marking text on paper remains a prevalent practice. The typical approach to recognition in this context is to first detect the circles in the images and then identify the text entity within each circled region using OCR. Numerous OCR models have been developed to automatically identify various text types, but even a competition-winning multilingual model by Baek et al. does not perform well in recognizing circled Korean text, yielding a weighted F1 score of just 69%. The core idea of the lumped approach proposed in this study is to recognize a circle and the named entity it marks as one instance. To this end, YOLOv5 is fine-tuned to detect 65 types of named entities overlapped by hand-drawn circles, yielding a weighted F1 score of 94%, which is 25 percentage points higher than the typical approach that uses YOLOv5 for circle detection and the model by Baek et al. for subsequent OCR. This work thereby introduces a novel approach for developing advanced text information extraction methods and for processing paper-based marked text in the construction industry.
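The weighted F1 score used above to compare the two approaches can be illustrated with a minimal Python sketch. It is not the authors' evaluation code. It assumes that a lumped detector returns, for each tag image, a list of `(class_name, confidence)` pairs in which each class already stands for a circled named entity, and that the most confident detection is taken as the prediction. The class names (`crack`, `leak`, `stain`), the placeholder label `none`, and the helper names `lumped_prediction` and `weighted_f1` are hypothetical and do not come from the paper.

```python
from collections import Counter


def lumped_prediction(detections, missing="none"):
    """Return the class of the most confident detection.

    `detections` is a list of (class_name, confidence) pairs, an assumed
    output format for a detector whose classes are circled named entities.
    `missing` is a hypothetical placeholder used when nothing is detected.
    """
    if not detections:
        return missing
    return max(detections, key=lambda d: d[1])[0]


def weighted_f1(y_true, y_pred):
    """Average the per-class F1 scores, weighted by class support in y_true."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label in support:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += f1 * support[label] / total
    return score


# Hypothetical ground truth and detector outputs for four tag images.
y_true = ["crack", "crack", "leak", "stain"]
detections_per_image = [
    [("crack", 0.9)],
    [("leak", 0.6), ("crack", 0.4)],
    [("leak", 0.8)],
    [],  # nothing detected in this image
]
y_pred = [lumped_prediction(d) for d in detections_per_image]
print(y_pred, round(weighted_f1(y_true, y_pred), 3))
```

Because each class is weighted by its number of ground-truth samples, frequent defect types influence the score more than rare ones. The reported scores of 94% and 69% differ by 94 - 69 = 25 percentage points.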

References

  1. Singh A, Bacchuwar K, Bhasin A (2012) A survey of OCR applications. IJMLC, 314–318. https://doi.org/10.7763/IJMLC.2012.V2.137

  2. Baek J et al (2019) What is wrong with scene text recognition model comparisons? Dataset and model analysis. In: 2019 IEEE/CVF international conference on computer vision (ICCV). IEEE, Seoul, Korea (South), pp 4714–4722. https://doi.org/10.1109/ICCV.2019.00481

  3. Islam N, Islam Z, Noor N (2016) A survey on optical character recognition system. J Inf Commun Technol 10:4

  4. Bassil Y, Alwani M (2012) OCR post-processing error correction algorithm using Google online spelling suggestion. arXiv:1204.0191

  5. Gossweiler R, Kamvar M, Baluja S (2009) What’s up CAPTCHA?: A CAPTCHA based on image orientation. In: Proceedings of the 18th international conference on World wide web - WWW 2009. ACM Press, Madrid, Spain, p 841. https://doi.org/10.1145/1526709.1526822

  6. Optical character recognition market size report (2030). https://www.grandviewresearch.com/industry-analysis/optical-character-recognition-market. Accessed 14 July 2022

  7. Awel MA, Abidi AI (2019) Review on optical character recognition. Int Res J Eng Technol (IRJET) 06:5

  8. Nadeau D, Sekine S (2007) A survey of named entity recognition and classification. Lingvisticæ Investigationes. 30:3–26. https://doi.org/10.1075/li.30.1.03nad

  9. Packer TL et al (2010) Extracting person names from diverse and noisy OCR text. In: Proceedings of the fourth workshop on Analytics for noisy unstructured text data - AND 2010. ACM Press, Toronto, ON, Canada, p 19. https://doi.org/10.1145/1871840.1871845

  10. Rodriquez KJ, Bryant M, Blanke T, Luszczynska M (2012) Comparison of named entity recognition tools for raw OCR text. In: 2012 conference on natural language processing (KONVENS), p 5

  11. Hamdi A, Jean-Caurant A, Sidère N, Coustaty M, Doucet A (2020) Assessing and minimizing the impact of OCR quality on named entity recognition. In: Hall M, Merčun T, Risse T, Duchateau F (eds) International conference on theory and practice of digital libraries. Springer, Cham, pp 87–101. https://doi.org/10.1007/978-3-030-54956-5_7

  12. Zheng Z, Lu X-Z, Chen K-Y, Zhou Y-C, Lin J-R (2022) Pretrained domain-specific language model for natural language processing tasks in the AEC domain. Comput Ind 142:103733. https://doi.org/10.1016/j.compind.2022.103733

  13. GitHub - ultralytics/yolov5: YOLOv5 in PyTorch > ONNX > CoreML > TFLite. https://github.com/Ultralytics/Yolov5. Accessed 07 July 2022

  14. Pi Y, Duffield N, Behzadan AH, Lomax T (2022) Visual recognition for urban traffic data retrieval and analysis in major events using convolutional neural networks. Comput. Urban Sci. 2:2. https://doi.org/10.1007/s43762-021-00031-w

  15. Xu Y, Zhang J (2022) UAV-based bridge geometric shape measurement using automatic bridge component detection and distributed multi-view reconstruction. Autom Constr 140:104376. https://doi.org/10.1016/j.autcon.2022.104376

  16. Sezen G, Cakir M, Atik ME, Duran Z (2022) Deep learning-based door and window detection from building façade. In: The international archives of the photogrammetry, remote sensing and spatial information sciences. Copernicus GmbH, pp 315–320. https://doi.org/10.5194/isprs-archives-XLIII-B4-2022-315-2022

  17. Redmon J, Divvala S, Girshick R, Farhadi A (2016) You only look once: unified, real-time object detection. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR), pp 779–788. https://doi.org/10.1109/CVPR.2016.91

  18. Chng CK et al (2019) ICDAR2019 robust reading challenge on arbitrary-shaped text - RRC-ArT. In: 2019 International conference on document analysis and recognition (ICDAR), pp 1571–1576. https://doi.org/10.1109/ICDAR.2019.00252

  19. What is wrong with scene text recognition model comparisons? Dataset and model analysis. https://github.com/clovaai/deep-text-recognition-benchmark. Accessed 16 May 2022

  20. Korean font image. https://aihub.or.kr/aidata/133. Accessed 09 May 2022

  21. Rezgui Y, Zarli A (2006) Paving the way to the vision of digital construction: a strategic roadmap. J. Constr. Eng. Manage. 132:767–776. https://doi.org/10.1061/(ASCE)0733-9364(2006)132:7(767)

Acknowledgements

This work was supported by a National Research Foundation of Korea grant funded by the Korean government (No. 2021R1A2C3008209).

Author information

Corresponding author

Correspondence to Ghang Lee.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Suh, S., Lee, G., Gil, D. (2024). Lumped Approach to Recognize Types of Construction Defect from Text with Hand-Drawn Circles. In: Skatulla, S., Beushausen, H. (eds) Advances in Information Technology in Civil and Building Engineering. ICCCBE 2022. Lecture Notes in Civil Engineering, vol 357. Springer, Cham. https://doi.org/10.1007/978-3-031-35399-4_5

  • DOI: https://doi.org/10.1007/978-3-031-35399-4_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-35398-7

  • Online ISBN: 978-3-031-35399-4

  • eBook Packages: Engineering, Engineering (R0)
