
Analytical developments for the Homer Multitext: palaeography, orthography, morphology, prosody, semantics

International Journal on Digital Libraries

Abstract

We describe ongoing development for The Homer Multitext, focusing on the interlocking challenges of automated analysis of diplomatic manuscript transcriptions. With the goal of lexical and morphological analysis of prose and poetic texts, and metrical analysis of poetic texts (and quotations thereof), we face the challenge of working generically across languages and across multiple possible orthographies in each language. In the case of Greek, our working dataset includes Greek following the conventions of Attica before 404 BCE, the conventions of “standard” literary polytonic Greek, and the particular conventions found in Byzantine codex manuscripts of Greek epic poetry with accompanying commentary. The latest work involves re-implementing existing CITE Architecture libraries in the Julia language, with documentation in the form of runnable code notebooks using the Pluto.jl framework. The Homer Multitext has been a work in progress for two decades. Because of the project’s emphasis on simple data formats (plain text, very simple XML, tabular lists), our data remain valid even as we gain understanding of the challenges posed by our source material, particularly the 10th- and 11th-century manuscripts of Greek epic poetry with accompanying ancient commentary that, within themselves, represent over a thousand years of linguistic evolution. The work outlined here represents the latest shift in our development tools, a flexibility likewise made possible by the separation of concerns that has been a central value in the project.

Notes

  1. Our reasoning for these decisions is as follows. A scholar specifically interested in the palaeography of ligatures or ending-abbreviations would never rely on our transcription as primary data, but would return to the manuscript images. Our data associate every passage of text with a region of interest on at least one, and sometimes more than one, digital image of the folio. We are therefore providing data helpful for a systematic study of those features, but carrying the features themselves into our transcription would help no one and would make many things more difficult.

  2. The Homer Multitext uses “validation” to mean error-checking that can be fully automated, such as ensuring that only valid Greek characters are present in an edition of a Greek text. The project uses “verification” for error-checking that requires a human editor but that a machine can assist, such as confirming that each reference to “son of Atreus” is identified with the correct character, Agamemnon or Menelaus.

  3. Notwithstanding, the libraries for composing and decomposing Unicode characters built into the Julia language (https://docs.julialang.org/en/v1/stdlib/Unicode/) are particularly helpful even as we try to work with more rigorously specified orthographies.

  4. See Smith, Neel. “Morphological Analysis of Historical Languages.” Bulletin of the Institute of Classical Studies 59, no. 2 (2016): 89–102. https://doi.org/10.1111/j.2041-5370.2016.12040.x. Also, for discussion of our project-specific algorithms for treating orthography, see the Polytonic Greek Code library in Julia.

  5. See Mahoney, A. “The Forms You ‘Really’ Need to Know.” The Classical Outlook 81, no. 3 (2004): 101–105.

  6. https://github.com/homermultitext/hmt-archive.

  7. The current print publication of the “LSJ” lexicon: A Greek-English Lexicon, H.G. Liddell and R. Scott. 9th Edition (1940). Oxford University Press. A user-interface to this lexical data is at: http://folio2.furman.edu/lsj/. Discussion of the data and the application are at http://eumaeus.github.io/2018/10/30/lsj.html, with further comment at http://eumaeus.github.io/2018/11/05/chicago.html.
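
The automated “validation” described in note 2 and the Unicode composition and decomposition mentioned in note 3 can be illustrated with a minimal sketch. This is not the project's code: it uses Python's standard `unicodedata` module rather than the Julia libraries the notes refer to, and the function names and the set of permitted punctuation are hypothetical, chosen only for illustration. A rigorously specified orthography would enumerate its valid characters far more precisely.

```python
import unicodedata
from typing import List, Tuple

# Hypothetical set of permitted non-letter characters (an assumption for
# this sketch, not the project's orthography specification).
ALLOWED_PUNCTUATION = set(" .,;\u00b7'\u2019")


def is_greek_letter_or_mark(ch: str) -> bool:
    """Return True for combining marks and for characters whose Unicode
    name contains 'GREEK'."""
    if unicodedata.category(ch).startswith("M"):
        return True
    return "GREEK" in unicodedata.name(ch, "")


def invalid_characters(text: str) -> List[Tuple[int, str]]:
    """Decompose the text (NFD) and list every character that is neither
    Greek, a combining mark, nor permitted punctuation.

    Indices refer to positions in the decomposed string.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return [
        (i, ch)
        for i, ch in enumerate(decomposed)
        if not (is_greek_letter_or_mark(ch) or ch in ALLOWED_PUNCTUATION)
    ]


def same_text(a: str, b: str) -> bool:
    """Compare two strings after composing both to NFC, so that canonically
    equivalent encodings of the same accented letter compare equal."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


if __name__ == "__main__":
    # A Latin 'a' typed in place of a Greek alpha is flagged.
    print(invalid_characters("μῆνιν aειδε"))
    # Alpha with tonos (U+03AC) and alpha with oxia (U+1F71) are
    # canonically equivalent.
    print(same_text("\u03ac", "\u1f71"))
```

Decomposing before checking means that each accent and breathing mark is examined as its own code point, which keeps the rule for "valid character" small. A check of this kind only covers what can be fully automated; questions such as which “son of Atreus” is meant remain verification tasks for a human editor.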

Author information

Corresponding author

Correspondence to Christopher Blackwell.

About this article

Cite this article

Smith, N., Blackwell, C. Analytical developments for the Homer Multitext: palaeography, orthography, morphology, prosody, semantics. Int J Digit Libr 24, 179–184 (2023). https://doi.org/10.1007/s00799-023-00380-3

  • DOI: https://doi.org/10.1007/s00799-023-00380-3
