Abstract
Today, the Nobel Prize in Literature is one of the most recognized and prestigious awards. Examining the texts of the authors who have received it, and revealing the factors that play an important role in its award, matters to authors, readers, and other interested parties. To this end, the study first identified the most popular works of the authors who received the Nobel Prize in Literature between 1980 and 2021 and compiled them into a data set (corpus). Dictionary-based sentiment analysis, a method for classifying sentiments, and Latent Dirichlet Allocation (LDA), a widely used approach to topic modeling, were then applied to this corpus. The findings from the sentiment and LDA analyses were evaluated together, and the themes with the highest distribution in the popular texts of Nobel Prize winners were found to be those with a positive emotional polarity and a sentiment weighted toward “trust”. The study serves as a reference example: it contributes to the understanding of the structure and emotional character of the related works of Nobel Prize-winning authors, and it enables readers and authors to quickly and functionally examine large groups of texts in terms of theme and content.
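As a rough illustration of what dictionary-based sentiment analysis involves, the sketch below counts how many tokens of a text match each label in a word list. It is a minimal sketch under stated assumptions, not the study's implementation: the lexicon `TOY_LEXICON`, its labels, and the helper names `tokenize`, `score_sentiment`, and `polarity` are all hypothetical and chosen only for this example.

```python
import re
from collections import Counter

# Hypothetical toy lexicon (assumption): word -> set of sentiment/emotion labels.
# It is NOT the lexicon used in the study; it only illustrates the lookup idea.
TOY_LEXICON = {
    "friend": {"positive", "trust"},
    "faith": {"positive", "trust"},
    "hope": {"positive", "anticipation"},
    "loss": {"negative", "sadness"},
    "fear": {"negative", "fear"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def score_sentiment(text, lexicon=TOY_LEXICON):
    """Count, for each label, how many tokens in the text carry that label."""
    counts = Counter()
    for token in tokenize(text):
        for label in lexicon.get(token, ()):
            counts[label] += 1
    return counts


def polarity(counts):
    """Classify the overall pole from positive and negative label counts."""
    diff = counts["positive"] - counts["negative"]
    if diff > 0:
        return "positive"
    if diff < 0:
        return "negative"
    return "neutral"


sample = "A friend keeps faith and hope despite fear."
sample_counts = score_sentiment(sample)
print(dict(sample_counts), polarity(sample_counts))
```

In a real analysis the toy word list would be replaced by an established sentiment lexicon, and the per-label counts would typically be normalized by text length before texts of different sizes are compared.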
Data Availability
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
Funding
No funding was received to assist with the preparation of this manuscript.
Author information
Contributions
BBA and BD presented the idea for the corresponding study. BBA designed the general framework of the study and supervised each phase of the study. LK performed the data collection and analysis phases. All authors contributed to the final version of the study.
Ethics declarations
Competing Interest
The authors declare that they have no competing interests.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Alkan, B.B., Karakuş, L. & Direkci, B. Knowledge discovery from the texts of Nobel Prize winners in literature: sentiment analysis and Latent Dirichlet Allocation. Scientometrics 128, 5311–5334 (2023). https://doi.org/10.1007/s11192-023-04783-6
DOI: https://doi.org/10.1007/s11192-023-04783-6