Classification analysis of Kouji Uno’s novels using topic model

  • Xueqin Liu (corresponding author)
  • Mingzhe Jin


Kouji Uno is a prominent Japanese littérateur whose creative activity was interrupted twice. Literary critics hold that Uno’s writing style changed when he resumed writing. This paper aims to identify the phases of Uno’s creative career by using statistical methods to investigate the stylistic characteristics of his novels. For this purpose, a topic model was applied to classify Uno’s novels and to compare the characteristics of each group. The results show that Uno’s novels can be classified into three groups, separated approximately by the two non-productive periods, and that the novels in each group display different stylistic characteristics. Moreover, one interesting observation is that his stylistic characteristics had already begun to change before the interruptions to his writing. It is therefore more reasonable to conclude that Uno’s writing style started to change before the interruptions and that the change took fuller shape, to some extent, after he resumed writing.
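The abstract does not state which topic model, which input features, or which software was used, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes latent Dirichlet allocation fitted with a collapsed Gibbs sampler, and it assigns each document to the topic with the largest estimated proportion. The function name `lda_gibbs`, the hyperparameter values, and the toy token lists are all hypothetical and chosen for illustration.

```python
import random


def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampler for LDA; returns per-document topic proportions."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    n_words = len(vocab)
    word_id = {w: i for i, w in enumerate(vocab)}

    # Count tables: document-topic, topic-word, and per-topic totals.
    n_dk = [[0] * n_topics for _ in docs]
    n_kw = [[0] * n_words for _ in range(n_topics)]
    n_k = [0] * n_topics

    # Random initial topic assignment for every token.
    z = []
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            n_dk[d][k] += 1
            n_kw[k][word_id[w]] += 1
            n_k[k] += 1
        z.append(z_d)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = z[d][i]
                # Remove the token's current assignment from the counts.
                n_dk[d][k] -= 1
                n_kw[k][v] -= 1
                n_k[k] -= 1
                # Unnormalised full conditional for each candidate topic.
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][v] + beta) / (n_k[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][v] += 1
                n_k[k] += 1

    # Smoothed topic proportions for each document.
    return [
        [(n_dk[d][t] + alpha) / (len(doc) + n_topics * alpha) for t in range(n_topics)]
        for d, doc in enumerate(docs)
    ]


# Hypothetical toy "novels": each is a list of tokens (placeholders, not real data).
toy_docs = [
    "a b a c b a a b".split(),
    "a a b c a b b a".split(),
    "d e d f e d d e".split(),
    "d d e f d e e d".split(),
    "g h g i h g g h".split(),
    "g g h i g h h g".split(),
]
theta = lda_gibbs(toy_docs, n_topics=3)
# Assign each document to its dominant topic.
groups = [max(range(3), key=row.__getitem__) for row in theta]
print(groups)
```

In a study of this kind, the resulting group labels would then be compared with the dates of composition to see whether the groups line up with the periods before, between, and after the two interruptions.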


Kouji Uno; Writing style; Quantitative analysis; Creative phases; Topic model


Compliance with ethical standards

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.



Copyright information

© The Behaviormetric Society 2019

Authors and Affiliations

  1. Graduate School of Culture and Information Science, Doshisha University, Kyotanabe, Japan
  2. Faculty of Culture and Information Science, Doshisha University, Kyotanabe, Japan
