Computers and the Humanities, Volume 38, Issue 1, pp 15–36

Detecting Collaborations in Text: Comparing the Authors' Rhetorical Language Choices in The Federalist Papers

  • Jeff Collins
  • David Kaufer
  • Pantelis Vlachos
  • Brian Butler
  • Suguru Ishizaki


In author attribution studies, function words or lexical measures are often used to differentiate the authors' textual fingerprints. These studies can be thought of as quantifying the texts, representing the text with measured variables that stand for specific textual features. The resulting quantifications, while proven useful for statistically differentiating among the texts, bear no resemblance to the understanding a human reader, even an astute one, would develop while reading the texts. In this paper we present an attribution study that, instead, characterizes the texts according to the representational language choices of the authors, similar to a way we believe close human readers come to know a text and distinguish its rhetorical purpose. From our automated quantification of The Federalist papers, it is clear why human readers find it impossible to distinguish the authorship of the disputed papers. Our findings suggest that changes occur in the processes of rhetorical invention when undertaken in collaborative situations. This points to a need to re-evaluate the premise of autonomous authorship that has informed attribution studies of The Federalist case.

authorship attribution, collaboration, Federalist papers, statistics





Copyright information

© Kluwer Academic Publishers 2004

Authors and Affiliations

  • Jeff Collins (1)
  • David Kaufer (1)
  • Pantelis Vlachos (1)
  • Brian Butler (2)
  • Suguru Ishizaki (3)

  1. Carnegie Mellon University, USA
  2. University of Pittsburgh, USA
  3. Pittsburgh, USA
