
A proposal for touching component segmentation in Arabic manuscripts

  • Theoretical Advances
Pattern Analysis and Applications

Abstract

Text-line segmentation is one of the key factors affecting the performance of handwriting recognition systems. Segmenting touching text-lines is therefore an important step toward making recognition more effective and accurate. One of the problems that makes this task difficult is the presence of touching components (TCs), which are connections between letters of words on consecutive text-lines or between words on the same text-line. The proposed method aims to segment TCs. It consists of two main steps: (1) for a localized TC, finding a similar model, stored in a dictionary together with its correct segmentation, using the shape context descriptor and an interpolation function, the thin plate spline transformation; (2) segmenting the TC based on the central points of the parts of the retrieved model. TCs are assumed to have already been extracted from Arabic manuscript images. Experiments are carried out on a common TC database using two metrics, the Manhattan and Euclidean distances. The results obtained outperform the state of the art, given the different types, variability and complexity of the TC data set, and show the effectiveness of the proposed TC segmentation method.
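
As a rough illustration of step (2) only, the sketch below labels each point of a TC with the index of the model part whose central point is nearest, under either the Manhattan or the Euclidean distance. This is one possible reading of "segmenting based on central points", not the authors' implementation: the function names, the use of the centroid as the "central point", and the assumption that the model parts are already aligned to the TC's coordinates (for example after the thin plate spline step) are all hypothetical.

```python
from math import hypot


def manhattan(p, q):
    """Manhattan (L1) distance between two 2-D points."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def euclidean(p, q):
    """Euclidean (L2) distance between two 2-D points."""
    return hypot(p[0] - q[0], p[1] - q[1])


def centroid(points):
    """Mean position of a non-empty list of (x, y) points.

    Assumption: the centroid stands in for the 'central point' of a part.
    """
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def segment_by_central_points(tc_points, model_parts, metric=euclidean):
    """Label each TC point with the index of the nearest part centre.

    tc_points   -- list of (x, y) points of the touching component
    model_parts -- list of parts; each part is a list of (x, y) points,
                   assumed to be already aligned to the TC
    metric      -- distance function, e.g. manhattan or euclidean
    """
    centers = [centroid(part) for part in model_parts]
    labels = []
    for p in tc_points:
        dists = [metric(p, c) for c in centers]
        labels.append(dists.index(min(dists)))  # ties go to the first part
    return labels


# Toy example: two model parts, one centred on y = 0 and one on y = 10.
model_parts = [[(0, 0), (2, 0)], [(0, 10), (2, 10)]]
tc_points = [(1, 1), (1, 4), (1, 8), (0, 9)]
print(segment_by_central_points(tc_points, model_parts, metric=manhattan))
# prints [0, 0, 1, 1]
```

On this toy input both metrics give the same labels; they can disagree for points that lie diagonally between two centres, which is one reason to evaluate both.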



Author information

Corresponding author

Correspondence to Afef Kacem.

About this article

Cite this article

Aouadi, N., Kacem, A. A proposal for touching component segmentation in Arabic manuscripts. Pattern Anal Applic 20, 1005–1027 (2017). https://doi.org/10.1007/s10044-016-0543-1

  • DOI: https://doi.org/10.1007/s10044-016-0543-1
