\(\textit{TexT}\) - Text Extractor Tool for Handwritten Document Transcription and Annotation

  • Anders Hast
  • Per Cullhed
  • Ekta Vats
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 806)


This paper presents a framework for semi-automatic transcription of large-scale historical handwritten documents and proposes a simple, user-friendly text extractor tool, \(\textit{TexT}\), for transcription. The proposed approach provides quick and easy transcription of text using a computer-assisted interactive technique. The algorithm finds multiple occurrences of the marked text on the fly using a word spotting system. \(\textit{TexT}\) can also annotate handwritten text on the fly, automatically generating ground truth labels and dynamically adjusting and correcting user-generated bounding box annotations so that the word is perfectly encapsulated. The user can view the document and the found words in their original form or with background noise removed, which makes the transcription results easier to inspect. The effectiveness of \(\textit{TexT}\) is demonstrated on an archival manuscript collection from a well-known, publicly available dataset.
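The bounding box adjustment mentioned above can be illustrated with a minimal sketch. It is not the method used in \(\textit{TexT}\), which the abstract does not specify. It assumes a binarized page stored as a list of rows (1 for ink, 0 for background) and only shrinks a loosely drawn box to the ink it contains; growing a box that cuts through a word is not handled. The function name `tighten_box` and the box convention `(top, left, bottom, right)` with exclusive bottom and right edges are hypothetical.

```python
def tighten_box(binary, box):
    """Shrink a user-drawn box to the smallest rectangle holding all ink inside it.

    binary: list of rows, 1 = ink pixel, 0 = background (assumed representation).
    box: (top, left, bottom, right), with bottom and right exclusive.
    Returns the tightened box, or the original box if it contains no ink.
    """
    top, left, bottom, right = box
    # Rows of the box that contain at least one ink pixel.
    rows = [r for r in range(top, bottom) if any(binary[r][left:right])]
    # Columns of the box that contain at least one ink pixel.
    cols = [c for c in range(left, right)
            if any(binary[r][c] for r in range(top, bottom))]
    if not rows:
        return box  # nothing to encapsulate, leave the annotation unchanged
    return (rows[0], cols[0], rows[-1] + 1, cols[-1] + 1)


# A tiny synthetic "page": the ink occupies rows 1-2 and columns 2-3.
page = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
tight = tighten_box(page, (0, 0, 4, 6))
print(tight)
```

In this sketch a box drawn around the whole synthetic page is reduced to the rectangle covering only the ink pixels. A real system would also need binarization of the scanned page and a rule for extending boxes that clip a word.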


Handwritten text recognition, Transcription, Annotation, \(\textit{TexT}\), Word spotting, Historical documents



This work was supported by the Riksbankens Jubileumsfond (Dnr NHS14-2068:1) and the Swedish strategic research programme eSSENCE.



Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  1. Department of Information Technology, Uppsala University, Uppsala, Sweden
  2. University Library, Uppsala University, Uppsala, Sweden
