Significant Characteristics to Abstract Content: Long Term Preservation of Information

  • Manfred Thaller
  • Volker Heydegger
  • Jan Schnasse
  • Sebastian Beyl
  • Elona Chudobkaite
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5173)

Abstract

The (automatic) extraction of significant characteristics of files is an important feature of all long term preservation activities. We propose, however, that the automatic evaluation of the outcomes of certain preservation actions, notably migration, requires an approach that follows other traditions in the abstraction of format descriptions. To implement a strategy for the automatic evaluation of various actions within a preservation environment, we define two formal, XML-based languages: one defines the content of a specific file; the other describes a file format in such a way that it can be handled by multi-purpose software.

Keywords

File characteristics, format definition languages, data abstraction, long term preservation

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Manfred Thaller (1)
  • Volker Heydegger (1)
  • Jan Schnasse (1)
  • Sebastian Beyl (1)
  • Elona Chudobkaite (1)

  1. Historisch-Kulturwissenschaftliche Informationsverarbeitung, Universität zu Köln, Albertus-Magnus-Platz, Köln, Germany
