
On the utility of content analysis in author attribution: The Federalist

Computers and the Humanities

Abstract

In studies of author attribution, measurement of differential use of function words is the most common procedure, though lexical statistics are often used. Content analysis has seldom been employed. We compare the success of lexical statistics, content analysis, and function words in classifying the 12 disputed Federalist papers. Of course, Mosteller and Wallace (1964) have presented overwhelming evidence that all 12 were by James Madison rather than by Alexander Hamilton. Our purpose is not to challenge these attributions but rather to use The Federalist as a test case. We found lexical statistics to be of no use in classifying the disputed papers. Using both classical canonical discriminant analysis and a neural-network approach, content analytic measures (the Harvard III Psychosociological Dictionary and semantic differential indices) were found to be successful at attributing most of the disputed papers to Madison. However, a function-word approach is more successful. We argue that content analysis can be useful in cases where the function-word approach does not yield compelling conclusions and, perhaps, in preliminary screening in cases where there are a large number of possible authors.
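
To make the function-word idea concrete, the following Python sketch computes how often a few marker words occur per 1000 tokens and assigns a text to the author whose rate profile is closest. It is only an illustration under stated assumptions: the word list, the helper names (`function_word_rates`, `nearest_author`), and the toy author profiles are hypothetical and are not taken from the article, and the nearest-profile rule is a deliberately simple stand-in for the canonical discriminant analysis and neural-network classifiers that the abstract mentions.

```python
import re
from collections import Counter

# Illustrative marker words only (an assumption, not the article's word list).
FUNCTION_WORDS = ("upon", "whilst")


def function_word_rates(text, words=FUNCTION_WORDS):
    """Return occurrences per 1000 tokens for each word in `words`."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)
    total = len(tokens)
    if total == 0:
        return {w: 0.0 for w in words}
    return {w: 1000.0 * counts[w] / total for w in words}


def nearest_author(rates, profiles):
    """Pick the author whose rate profile has the smallest squared distance.

    `profiles` maps an author label to a dict of per-1000-token rates.
    This nearest-profile rule is a simplification, not discriminant analysis.
    """
    def squared_distance(profile):
        return sum((rates[w] - profile[w]) ** 2 for w in rates)

    return min(profiles, key=lambda author: squared_distance(profiles[author]))


if __name__ == "__main__":
    # Toy profiles with made-up rates, purely for demonstration.
    toy_profiles = {
        "Author A": {"upon": 3.0, "whilst": 0.0},
        "Author B": {"upon": 0.2, "whilst": 0.5},
    }
    sample = "Upon reflection, the plan depends upon the people."
    rates = function_word_rates(sample)
    print(rates, nearest_author(rates, toy_profiles))
```

In practice the profiles would be estimated from each candidate author's undisputed texts, and a short sample like the one above would be far too small to give stable rates.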


References

  • Anderson, C. W. and G. E. McMaster. “Quantification of the Brothers Grimm: A Comparison of Successive Versions of Three Tales”. Computers and the Humanities, 23 (1989), 341–46.

  • Caudill, M. and C. Butler. Naturally Intelligent Systems. Cambridge, MA: MIT Press, 1990.

  • Damerau, F. J. “The Use of Function Word Frequencies as Indicators of Style”. Computers and the Humanities, 9 (1975), 271–80.

  • Elliott, W. and R. Valenza. “Who Was Shakespeare?” Chance, 4 (1991a), 8–14.

  • Elliott, W. and R. Valenza. “A Touchstone for the Bard”. Computers and the Humanities, 25 (1991b), 199–209.

  • Fix, E. and J. L. Hodges. Discriminatory Analysis, Nonparametric Discrimination: Consistency Properties. Report 4. Randolph Field, TX: USAF School of Aviation Medicine, 1951.

  • Forsyth, R. S. “Neural Learning Algorithms: Some Empirical Trials”. In Proceedings of the Third Conference on Neural Nets and their Applications. Nanterre, France, 1990, pp. 301–17.

  • Frautschi, R. L. “Lexical and Focal Preferences in Rousseau's Profession de Foi du Vicaire Savoyard (Book IV of Emile)”. Computers and the Humanities, 23 (1989), 347–55.

  • Freud, S. “Psychopathology of Everyday Life”. In Basic Writings of Sigmund Freud. Ed. A. A. Brill. New York: Modern Library, 1938. (Original work published 1904).

  • Hand, D. J. Discrimination and Classification. New York: Wiley, 1981.

  • Heise, D. R. “Semantic Differential Profiles for 1000 Most Frequent English Words”. Psychological Monographs, 79 (1965), 1–31.

  • Holmes, D. I. “Authorship Attribution”. Computers and the Humanities, 28 (1994), 87–106.

  • Horton, T. B. The Effectiveness of the Stylometry of Function Words in Discriminating Between Fletcher and Shakespeare. Unpublished Ph.D. dissertation. University of Edinburgh, 1987.

  • Kruskal, J. B. “Multidimensional Scaling by Optimizing Goodness of Fit to a Nonmetric Hypothesis”. Psychometrika, 29 (1964), 1–28, 114–29.

  • Martindale, C. “COUNT: A PL/I Program for Content Analysis of Natural Language”. Behavioral Science, 18 (1973), 1948.

  • Martindale, C. “LEXSTAT: A PL/I Program for Computation of Lexical Statistics”. Behavior Research Methods and Instrumentation, 6 (1974), 571.

  • Martindale, C. The Clockwork Muse: The Predictability of Artistic Change. New York: Basic Books, 1990.

  • Martindale, C. Cognitive Psychology: A Neural-network Approach. Pacific Grove, CA: Brooks/Cole, 1991.

  • Matthews, R. A. J. and T. V. N. Merriam. “Neural Computation in Stylometry I: An Application to the Works of Shakespeare and Fletcher”. Literary and Linguistic Computing, 4 (1993), 203–209.

  • McKenzie, D. P. and R. S. Forsyth. “Classification by Similarity: An Overview of Statistical Methods of Case-Based Reasoning”. Computers in Human Behavior (in press).

  • Mendenhall, T. C. “A Mechanical Solution of a Literary Problem”. Popular Science, 60 (1901), 97–105.

  • Merriam, T. V. N. Modelling a Canon: A Stylometric Examination of Shakespeare's First Folio. Unpublished Ph.D. dissertation. University of London, 1992.

  • Merriam, T. V. N. “Marlowe's Hand in Edward III”. Literary and Linguistic Computing, 8 (1993), 59–72.

  • Mosteller, F. and D. L. Wallace. Inference and Disputed Authorship: The Federalist. Reading, MA: Addison-Wesley, 1964.

  • Mosteller, F. and D. L. Wallace. Applied Bayesian and Classical Inference: The Case of the Federalist Papers. 2nd ed. New York: Springer-Verlag, 1984.

  • Osgood, C. E., G. Suci and P. H. Tannenbaum. The Measurement of Meaning. Urbana, IL: University of Illinois Press, 1957.

  • Rokeach, M., R. Homant and L. Penner. “A Value Analysis of the Disputed Federalist Papers”. Journal of Personality and Social Psychology, 16 (1970), 245–50.

  • SAS Institute, Inc. SAS User's Guide: Statistics. Cary, NC: SAS Institute, 1985.

  • Siegel, S. and N. J. Castellan. Nonparametric Statistics for the Behavioral Sciences. New York: McGraw-Hill, 1988.

  • Sigelman, L. and W. Jacoby. “The Not-So-Simple Art of Imitation: Pastiche, Literary Style, and Raymond Chandler”. Computers and the Humanities (in press).

  • Specht, D. F. “Probabilistic Neural Networks”. Neural Networks, 3 (1990), 109–18.

  • Specht, D. F. and P. D. Shapiro. “Generalized Accuracy of Probabilistic Neural Networks Compared with Back-Propagation Networks”. In Proceedings of the International Joint Conference on Neural Networks. Seattle, WA, 1991, pp. 887–92.

  • Spence, D. P., H. S. Scarborough and E. H. Ginsberg. “Lexical Correlates of Cervical Cancer”. Social Science and Medicine, 12 (1978), 141–44.

  • SPSS, Inc. SPSS Reference Guide. Chicago: SPSS, Inc., 1990.

  • Stone, P. J. et al. The General Inquirer: A Computer Approach to Content Analysis. Cambridge, MA: MIT Press, 1966.

  • Ward Systems Group. NeuroWindows: Neural Network Dynamic Link Library. Frederick, MD: Ward Systems Group, 1992.

  • Williams, C. B. “A Note on the Statistical Analysis of Sentence-Length as a Criterion of Literary Style”. Biometrika, 31 (1939), 356–61.

  • Yule, G. U. “On Sentence-Length as a Statistical Characteristic of Style in Prose: With Application to Two Cases of Disputed Authorship”. Biometrika, 30 (1938), 363–90.

  • Yule, G. U. The Statistical Study of Literary Vocabulary. Cambridge: Cambridge University Press, 1944.

Author information

Colin Martindale is Professor of Psychology at the University of Maine. He is author of a number of articles and books on content analysis, literary history, and other topics. A recent book is The Clockwork Muse: The Predictability of Artistic Change (New York: Basic Books). He is Executive Editor of Empirical Studies of the Arts and serves on the editorial boards of Computers and the Humanities and Poetics.

Dean McKenzie is Professional Officer/Statistician for Psychological Medicine, Monash University, Melbourne, Australia. He is author of several articles concerned with machine learning and artificial intelligence.

Cite this article

Martindale, C., McKenzie, D. On the utility of content analysis in author attribution: The Federalist. Comput Hum 29, 259–270 (1995). https://doi.org/10.1007/BF01830395
