Knowledge discovery from the texts of Nobel Prize winners in literature: sentiment analysis and Latent Dirichlet Allocation

Scientometrics

Abstract

The Nobel Prize in Literature is one of the most widely recognized and prestigious awards. Examining the texts of its laureates and identifying the factors that play an important role in the award is valuable for authors, readers, and other interested parties. To this end, this study first identified the most popular works of the authors who received the Nobel Prize in Literature between 1980 and 2021 and assembled them into a corpus. Dictionary-based sentiment analysis, a method for classifying sentiments, and Latent Dirichlet Allocation (LDA), a widely used approach to topic modeling, were then applied to this corpus. The findings from the sentiment and LDA analyses were evaluated together, and it was found that the themes with the highest distribution in the popular texts of Nobel Prize winners are also those with a positive emotional polarity and a sentiment dominated by “trust”. The study contributes to the understanding of the structure and emotional character of these works and enables readers and authors to examine large groups of texts quickly and functionally in terms of theme and content.
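
As an illustration only, dictionary-based sentiment analysis of the kind named in the abstract can be sketched in a few lines of Python. This is not the authors' pipeline: the lexicon below is a hypothetical toy stand-in for a published emotion lexicon, and the names `TOY_LEXICON`, `score_text`, and `dominant_label` are assumptions introduced for this example.

```python
import re
from collections import Counter

# Hypothetical toy lexicon: word -> labels (polarity and emotion).
# A real study would load a published lexicon instead of this stand-in.
TOY_LEXICON = {
    "friend": {"positive", "trust"},
    "faith": {"positive", "trust"},
    "promise": {"positive", "trust"},
    "joy": {"positive", "joy"},
    "war": {"negative", "fear"},
    "grief": {"negative", "sadness"},
}


def score_text(text, lexicon=TOY_LEXICON):
    """Count how many tokens in `text` carry each lexicon label."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for token in tokens:
        # Words missing from the lexicon contribute nothing.
        for label in lexicon.get(token, ()):
            counts[label] += 1
    return counts


def dominant_label(counts, labels):
    """Return the most frequent label among `labels`, or None if none occur."""
    present = {label: counts[label] for label in labels if counts[label] > 0}
    if not present:
        return None
    return max(present, key=present.get)


sample = "A friend kept his promise through the war, and faith outlasted grief."
counts = score_text(sample)
polarity = dominant_label(counts, ["positive", "negative"])
emotion = dominant_label(counts, ["trust", "joy", "fear", "sadness"])
print(dict(counts), polarity, emotion)
```

In this sketch each text is reduced to label counts, so the same scoring could in principle be applied to the documents grouped under a topic to compare themes by polarity and dominant emotion.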

Data Availability

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Funding

No funding was received to assist with the preparation of this manuscript.

Author information

Contributions

BBA and BD proposed the idea for the study. BBA designed the general framework of the study and supervised each phase. LK carried out the data collection and analysis. All authors contributed to the final version of the study.

Corresponding author

Correspondence to Bilal Barış Alkan.

Ethics declarations

Competing Interest

The authors declare that they have no competing interests.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Alkan, B.B., Karakuş, L. & Direkci, B. Knowledge discovery from the texts of Nobel Prize winners in literature: sentiment analysis and Latent Dirichlet Allocation. Scientometrics 128, 5311–5334 (2023). https://doi.org/10.1007/s11192-023-04783-6

  • DOI: https://doi.org/10.1007/s11192-023-04783-6
