Combination of deep neural networks and logical rules for record segmentation in historical handwritten registers using few examples

Abstract

This work addresses the layout analysis of historical handwritten registers in which local religious ceremonies were recorded. The aim is to delimit each record using only a small amount of training data. To this end, two approaches are proposed. First, three state-of-the-art object detection networks are explored and compared; further experiments are then conducted on Mask R-CNN, as it yields the best performance. Second, we introduce and investigate Deep&Syntax, a hybrid system that takes advantage of recurrent patterns to delimit each record by combining U-shaped networks and logical rules. Finally, these two approaches are evaluated on 3708 French records (sixteenth–eighteenth centuries), as well as on the Esposalles public database, containing 253 Spanish records (seventeenth century). While both systems perform well on homogeneous documents, we observe a significant drop in performance with Mask R-CNN on more challenging documents, especially when it is trained on a small, non-representative subset. By contrast, Deep&Syntax relies on steady patterns and is therefore able to process a wider range of documents with less training data. When both systems are trained on 120 documents, Deep&Syntax produces 15% more match configurations and reduces the ZoneMap surface error metric by 30%. It also outperforms Mask R-CNN when trained on a database three times smaller. As Deep&Syntax generalizes better, we believe it can be used for massive parish register processing, where collecting and annotating a sufficiently large and representative set of training data is not always achievable.

Notes

  1. https://github.com/matterport/Mask_RCNN
  2. https://github.com/fizyr/keras-retinanet
  3. https://github.com/qqwweee/keras-yolo3
  4. https://gitlab.inria.fr/starride/structure-esposalles
  5. https://github.com/Transkribus/TranskribusBaseLineEvaluationScheme

Acknowledgements

The BMS database is provided by Les Archives Départementales d’Ille-et-Vilaine, 35, France.

Author information

Corresponding author

Correspondence to Solène Tarride.

About this article

Cite this article

Tarride, S., Lemaitre, A., Coüasnon, B. et al. Combination of deep neural networks and logical rules for record segmentation in historical handwritten registers using few examples. IJDAR 24, 77–96 (2021). https://doi.org/10.1007/s10032-021-00362-8

Keywords

  • Historical handwritten documents
  • Deep neural networks
  • Hybrid systems
  • Layout analysis