Subjective Assessments of Legibility in Ancient Manuscript Images - The SALAMI Dataset

  • Conference paper
Pattern Recognition. ICPR International Workshops and Challenges (ICPR 2021)

Abstract

The research field concerned with the digital restoration of degraded written heritage lacks a quantitative metric for evaluating its results, which prevents the comparison of relevant methods on large datasets. Thus, we introduce a novel dataset of Subjective Assessments of Legibility in Ancient Manuscript Images (SALAMI) to serve as a ground truth for the development of quantitative evaluation metrics in the field of digital text restoration. This dataset consists of 250 images of 50 manuscript regions with corresponding spatial maps of mean legibility and uncertainty, which are based on a study conducted with 20 experts of philology and paleography. As this study is the first of its kind, the validity and reliability of its design and the results obtained are motivated statistically: we report a high intra- and inter-rater agreement and show that the bulk of variation in the scores is introduced by the image regions observed and not by controlled or uncontrolled properties of participants and test environments, thus concluding that the legibility scores measured are valid attributes of the underlying images.

Funded by the Austrian Science Fund (FWF): P29892.

Notes

  1. Projects financed by the Austrian Science Fund (FWF) with grant numbers P19608-G12 (2007–2010), P23133 (2011–2014) and P29892 (2017–2019), as well as a project financed by the Austrian Federal Ministry of Science, Research and Economy (2014–2016).

  2. PCA is frequently used as a standard procedure for dimensionality reduction and source separation in multispectral manuscript images [1, 8, 21].

References

  1. Arsene, C.T.C., Church, S., Dickinson, M.: High performance software in multidimensional reduction methods for image processing with application to ancient manuscripts. Manuscr. Cult. 11, 73–96 (2018)

  2. Bates, D., Mächler, M., Bolker, B., Walker, S.: Fitting linear mixed-effects models using lme4. J. Stat. Softw. 67(1), 1–48 (2015)

  3. Brenner, S.: SALAMI 1.0 (2020). https://doi.org/10.5281/zenodo.4270352

  4. De Simone, F., Naccari, M., Tagliasacchi, M., Dufaux, F., Tubaro, S., Ebrahimi, T.: Subjective assessment of H.264/AVC video sequences transmitted over a noisy channel. In: 2009 International Workshop on Quality of Multimedia Experience, QoMEx 2009, pp. 204–209 (2009)

  5. Diem, M., Sablatnig, R.: Registration of ancient manuscript images using local descriptors. In: Digital Heritage, Proceedings of the 14th International Conference on Virtual Systems and Multimedia, pp. 188–192 (2008)

  6. Easton, R.L., Christens-Barry, W.A., Knox, K.T.: Spectral image processing and analysis of the Archimedes Palimpsest. In: European Signal Processing Conference (Eusipco), pp. 1440–1444 (2011)

  7. Ghadiyaram, D., Bovik, A.C.: Massive online crowdsourced study of subjective and objective picture quality. IEEE Trans. Image Process. 25(1), 372–387 (2016)

  8. Giacometti, A., et al.: The value of critical destruction: evaluating multispectral image processing methods for the analysis of primary historical texts. Digit. Scholarsh. Humanit. 32(1), 101–122 (2017)

  9. Glaser, L., Deckers, D.: The basics of fast-scanning XRF element mapping for iron-gall ink palimpsests. Manuscr. Cult. 7, 104–112 (2013)

  10. Hedjam, R., Nafchi, H.Z., Moghaddam, R.F., Kalacska, M., Cheriet, M.: ICDAR 2015 contest on multispectral text extraction (MS-TEx 2015). In: Proceedings of the International Conference on Document Analysis and Recognition, ICDAR 2015, pp. 1181–1185 (November 2015)

  11. Hollaus, F., Diem, M., Fiel, S., Kleber, F., Sablatnig, R.: Investigation of ancient manuscripts based on multispectral imaging. In: DocEng 2015 - Proceedings of the 2015 ACM Symposium on Document Engineering, no. 1, pp. 93–96 (2015)

  12. Hollaus, F., Brenner, S., Sablatnig, R.: CNN based binarization of multispectral document images. In: Proceedings of the International Conference on Document Analysis and Recognition, ICDAR, pp. 533–538 (2019)

  13. Hollaus, F., Diem, M., Sablatnig, R.: Improving OCR accuracy by applying enhancement techniques on multispectral images. In: Proceedings - International Conference on Pattern Recognition, pp. 3080–3085 (2014)

  14. Hollaus, F., Gau, M., Sablatnig, R.: Multispectral image acquisition of ancient manuscripts. In: Ioannides, M., Fritsch, D., Leissner, J., Davies, R., Remondino, F., Caffo, R. (eds.) EuroMed 2012. LNCS, vol. 7616, pp. 30–39. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-34234-9_4

  15. International Telecommunication Union: Subjective video quality assessment methods for multimedia applications P.910. ITU-T (April 2008)

  16. International Telecommunication Union: Methodology for the subjective assessment of the quality of television pictures ITU-R BT.500-13. ITU-R (January 2012)

  17. Koo, T.K., Li, M.Y.: A guideline of selecting and reporting intraclass correlation coefficients for reliability research. J. Chiropr. Med. 15(2), 155–163 (2016)

  18. Likforman-Sulem, L., Darbon, J., Smith, E.H.: Enhancement of historical printed document images by combining total variation regularization and non-local means filtering. Image Vis. Comput. 29(5), 351–363 (2011)

  19. Lin, H., Hosu, V., Saupe, D.: KADID-10k: a large-scale artificially distorted IQA database. In: 2019 Eleventh International Conference on Quality of Multimedia Experience (QoMEX), pp. 1–3 (2019)

  20. Mantiuk, R.K., Tomaszewska, A., Mantiuk, R.: Comparison of four subjective methods for image quality assessment. Comput. Graph. Forum 31(8), 2478–2491 (2012)

  21. Mindermann, S.: Hyperspectral imaging for readability enhancement of historic manuscripts. Master’s thesis, TU München (2018)

  22. Perez-Ortiz, M., Mikhailiuk, A., Zerman, E., Hulusic, V., Valenzise, G., Mantiuk, R.K.: From pairwise comparisons and rating to a unified quality scale. IEEE Trans. Image Process. 29, 1139–1151 (2019)

  23. Ponomarenko, N., et al.: Image database TID2013: peculiarities, results and perspectives. Signal Process.: Image Commun. 30, 57–77 (2015)

  24. Ponomarenko, N., et al.: TID2008 - a database for evaluation of full-reference visual quality assessment metrics. Adv. Mod. Radioelectron. 10(4), 30–45 (2009)

  25. Pouyet, E., et al.: Revealing the biography of a hidden medieval manuscript using synchrotron and conventional imaging techniques. Anal. Chimica Acta 982, 20–30 (2017)

  26. R Development Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria (2008). http://www.R-project.org. ISBN 3-900051-07-0

  27. Ribeiro, F., Florencio, D., Nascimento, V.: Crowdsourcing subjective image quality evaluation. In: Proceedings - International Conference on Image Processing, ICIP, pp. 3097–3100 (2011)

  28. Salerno, E., Tonazzini, A., Bedini, L.: Digital image analysis to enhance underwritten text in the Archimedes palimpsest. Int. J. Doc. Anal. Recognit. 9(2–4), 79–87 (2007)

  29. Shaus, A., Faigenbaum-Golovin, S., Sober, B., Turkel, E.: Potential contrast - a new image quality measure. Electron. Imaging 2017(12), 52–58 (2017)

  30. Sheikh, H.R., Sabir, M.F., Bovik, A.C.: A statistical evaluation of recent full reference image quality assessment algorithms. IEEE Trans. Image Process. 15(11), 3441–3452 (2006)

  31. Shrout, P.E., Fleiss, J.L.: Intraclass correlations: uses in assessing rater reliability. Psychol. Bull. 86(2), 420–428 (1979)

  32. Virtanen, T., Nuutinen, M., Vaahteranoksa, M., Oittinen, P., Häkkinen, J.: CID2013: a database for evaluating no-reference image quality assessment algorithms. IEEE Trans. Image Process. 24(1), 390–402 (2015)

  33. Ye, P., Doermann, D.: Combining preference and absolute judgements in a crowd-sourced setting. In: Proceedings of International Conference on Machine Learning, pp. 1–7 (2013)

Author information

Corresponding author

Correspondence to Simon Brenner.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Brenner, S., Sablatnig, R. (2021). Subjective Assessments of Legibility in Ancient Manuscript Images - The SALAMI Dataset. In: Del Bimbo, A., et al. Pattern Recognition. ICPR International Workshops and Challenges. ICPR 2021. Lecture Notes in Computer Science, vol 12667. Springer, Cham. https://doi.org/10.1007/978-3-030-68787-8_5

  • DOI: https://doi.org/10.1007/978-3-030-68787-8_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-68786-1

  • Online ISBN: 978-3-030-68787-8

  • eBook Packages: Computer Science (R0)
