Practical considerations in the use of TEI headers in a large corpus

Dunlop, Dominic

doi:10.1007/BF01830318

Part II: Document-Wide Encoding Issues
Published: January 1995

Volume 29, pages 85–98 (1995)
Computers and the Humanities

Abstract

Many aspects of the guidelines of the Text Encoding Initiative (TEI) are applicable to corpora and text collections, and to the texts that these contain. As the first large corpus developed using mark-up conforming to the guidelines, the British National Corpus (BNC) is a test-bed for many TEI-developed mechanisms. This is particularly true in the case of the TEI header, which has three intended applications — to describe a corpus, to describe an individual text, and as a free-standing bibliographic record — all of them used by the BNC. This paper describes the application of the TEI header to the BNC. It is intended that this information should, through a description of experience on a practical project, serve as a guide for those wishing to use TEI headers in the documentation and management of other corpora and collections of texts.

Author information

Authors and Affiliations

Computing Services, Oxford University, 13 Banbury Road, OX2 6NN, Oxford, UK
Dominic Dunlop

Additional information

Dominic Dunlop is project manager for the British National Corpus at Oxford University Computing Services. Prior to assuming this position, he worked in a variety of positions related to development and support of the UNIX operating system, and was active in the POSIX initiative for the standardization of UNIX.

About this article

Cite this article

Dunlop, D. Practical considerations in the use of TEI headers in a large corpus. Comput Hum 29, 85–98 (1995). https://doi.org/10.1007/BF01830318

Issue Date: January 1995
DOI: https://doi.org/10.1007/BF01830318
