Authorship Attribution and Pastiche

Computers and the Humanities

Abstract

This paper considers the question of authorship attribution techniques when faced with a pastiche. We ask whether the techniques can distinguish the real thing from the fake, or can the author fool the computer? If the latter, is this because the pastiche is good, or because the technique is faulty? Using a number of mainly vocabulary-based techniques, Gilbert Adair's pastiche of Lewis Carroll, Alice Through the Needle's Eye, is compared with the original 'Alice' books. Standard measures of lexical richness, Yule's K and Orlov's Z, both distinguish Adair from Carroll, though Z also distinguishes the two originals. A principal component analysis based on word frequencies finds that the main differences are not due to authorship. A discriminant analysis based on word usage and lexical richness successfully distinguishes the pastiche from the originals. Weighted cusum tests were also unable to distinguish the two authors in a majority of cases. As a cross-validation, we made similar comparisons with control texts: another children's story from the same era, and other work by Carroll and Adair. The implications of these findings are discussed.
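
As an illustration of one of the lexical richness measures named above, Yule's K is commonly defined as K = 10^4 (Σ i² V(i, N) − N) / N², where N is the number of word tokens and V(i, N) is the number of word types occurring exactly i times. The sketch below is a minimal version under stated assumptions, not the procedure used in the paper: the function name `yules_k` and the simple lowercase tokeniser are hypothetical choices made for illustration.

```python
import re
from collections import Counter


def yules_k(text: str) -> float:
    """Yule's K = 10^4 * (sum_i i^2 * V(i, N) - N) / N^2."""
    # Assumed tokeniser (not from the paper): lowercase alphabetic runs,
    # allowing internal apostrophes as in "alice's".
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    n = len(tokens)
    if n == 0:
        raise ValueError("text contains no word tokens")
    type_counts = Counter(tokens)              # word type -> frequency
    spectrum = Counter(type_counts.values())   # i -> V(i, N)
    second_moment = sum(i * i * v for i, v in spectrum.items())
    return 1e4 * (second_moment - n) / (n * n)
```

Higher values of K indicate more repetition and hence a less rich vocabulary; a text in which every token is a distinct type gives K = 0.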

References

  • Adair G. (1984) Alice Through the Needle's Eye: A Third Adventure for Lewis Carroll's 'Alice'. Macmillan, London.

  • Adair G. (1986) Myths & Memories. Fontana Paperbacks, London.

  • Baayen H., van Halteren H., Tweedie F. (1996) Outside the Cave of Shadows: Using Syntactic Annotation to Enhance Authorship Attribution. Literary and Linguistic Computing, 11, pp. 121–131.

  • Baum L.F. (1900) The Wonderful Wizard of Oz. G.M. Hill, Chicago.

  • Bee R.E. (1971) Statistical Methods in the Study of the Masoretic Text of the Old Testament. Journal of the Royal Statistical Society A, 134, pp. 611–622.

  • Bee R.E. (1972) A Statistical Study of the Sinai Pericope. Journal of the Royal Statistical Society A, 135, pp. 406–421.

  • Bell A. (1985) Linked by a Single Tail. Times Literary Supplement, 4th January 1985, p. 18.

  • Benson J.D., Brainerd B. (1988) Chesterton's Parodies of Swinburne and Yeats: A Lexical Approach. Literary and Linguistic Computing, 3, pp. 221–231.

  • Binongo J.N.G. (1994) Joaquin's Joaquinesquerie, Joaquinesquerie's Joaquin: A Statistical Expression of a Filipino Writer's Style. Literary and Linguistic Computing, 9, pp. 267–279.

  • Bissell A.F. (1995a) Weighted Cumulative Sums for Text Analysis Using Word Counts. Journal of the Royal Statistical Society A, 158, pp. 525–545.

  • Bissell D. (1995b) Statistical Methods for Text Analysis by Word-Counts. European Business Management School, University of Wales, Swansea.

  • Burrows J.F. (1987) Computation into Criticism: A Study of Jane Austen's Novels and an Experiment in Method. Clarendon Press, Oxford.

  • Burrows J.F. (1989) “An Ocean Where Each Kind ⋯ ”: Statistical Analysis and Some Major Determinants of Literary Style. Computers and the Humanities, 23, pp. 309–321.

  • Burrows J.F. (1992) Computers and the Study of Literature. In Butler C.S. (ed.), Computers and Written Texts. Blackwell, Oxford, pp. 167–204.

  • Carroll L. (1865) Alice's Adventures in Wonderland. Macmillan, London.

  • Carroll L. (1872) Through the Looking Glass. Macmillan, London.

  • Carroll L. (1891) The Nyctograph. The Lady, 29th October 1891; reproduced in Fisher J. (ed.), The Magic of Lewis Carroll, Harmondsworth, Middlesex (1975): Penguin, pp. 214–217.

  • Dodgson C.L. (1889) Curiosa Mathematica Part I: A New Theory of Parallels. Macmillan, London.

  • Farringdon J.M. (1996) Analysing for Authorship: A Guide to the Cusum Technique. University of Wales Press, Cardiff.

  • Flesch R. (1974) The Art of Readable Writing. Harper & Row, New York.

  • Fuller J. (1985) Lewis Carroll is not Dead. The New York Times Book Review, 5th May 1985, p. 42.

  • Hardcastle R.A. (1997) CUSUM: A Credible Method for the Determination of Authorship? Science & Justice, 37, pp. 129–138.

  • Hilton M.L., Holmes D.I. (1993) An Assessment of Cumulative Sum Charts for Authorship Attribution. Literary and Linguistic Computing, 8, pp. 73–80.

  • Holmes D.I. (1994) Authorship Attribution. Computers and the Humanities, 28, pp. 87–106.

  • Holmes D.I. (1998) The Evolution of Stylometry in Humanities Scholarship. Literary and Linguistic Computing, 13, pp. 111–117.

  • Holmes D.I., Forsyth R.S. (1995) The Federalist Revisited: New Directions in Authorship Attribution. Literary and Linguistic Computing, 10, pp. 111–127.

  • Holmes D.I., Singh S. (1996) A Stylometric Analysis of Conversational Speech of Aphasic Patients. Literary and Linguistic Computing, 11, pp. 133–140.

  • Holmes D.I., Tweedie F.J. (1995) Forensic Stylometry: A Review of the Cusum Controversy. Revue Informatique et Statistique dans les Sciences Humaines, 31, pp. 19–47.

  • Irizarry E. (1989) Exploring Conscious Imitation of Style with Ready-made Software. Computers and the Humanities, 23, pp. 227–233.

  • Ledger G., Merriam T. (1994) Shakespeare, Fletcher and the Two Noble Kinsmen. Literary and Linguistic Computing, 9, pp. 235–247.

  • Mealand D.L. (1995) Correspondence Analysis of Luke. Literary and Linguistic Computing, 10, pp. 171–182.

  • Morton A.Q. (1978) Literary Detection: How to Prove Authorship and Fraud in Literature and Documents. Bowker, London.

  • Ogden C.K. (1934) The System of Basic English. Harcourt, Brace, New York.

  • Orlov J.K. (1983) Ein Modell der Häufigkeitsstruktur des Vokabulars. In Guiter H. and Arapov M. (eds.), Studies on Zipf's Law. Brockmeyer, Bochum, pp. 154–233.

  • Potter R.G. (1991) Statistical Analysis of Literature: A Retrospective on Computers and the Humanities, 1966–1990. Computers and the Humanities, 25, pp. 401–429.

  • Sigelman L. (1995) By Their (New) Words Shall Ye Know Them: Edith Wharton, Marion Mainwaring, and The Buccaneers. Computers and the Humanities, 29, pp. 271–283.

  • Sigelman L., Jacoby W. (1996) The Not-so-simple Art of Imitation: Pastiche, Literary Style, and Raymond Chandler. Computers and the Humanities, 30, pp. 11–28.

  • Somers H. (1999) Computational Stylometry and Pastiche: Can a Good Fake Fool the Computer? Unpublished paper presented at ILASH Seminar, University of Sheffield, 8th December 1999. http://www.dcs.shef.ac.uk/research/ilash/Seminars/somers.html

  • Tweedie F.J., Baayen H.R. (1998) How Variable May a Constant Be? Measures of Lexical Richness in Perspective. Computers and the Humanities, 32, pp. 323–352.

  • Tweedie F.J., Holmes D.I., Corns T.N. (1998) The Provenance of De Doctrina Christiana, Attributed to John Milton: A Statistical Investigation. Literary and Linguistic Computing, 13, pp. 77–87.

  • Yule G.U. (1944) The Statistical Study of Literary Vocabulary. Cambridge University Press, Cambridge.

Cite this article

Somers, H., Tweedie, F. Authorship Attribution and Pastiche. Computers and the Humanities 37, 407–429 (2003). https://doi.org/10.1023/A:1025786724466

  • DOI: https://doi.org/10.1023/A:1025786724466
