Abstract
Researchers need an automated way to extract current trends and perceptions from the literature in a field of interest, because manually reviewing a large number of papers is time-consuming. Topic modelling is the text mining technique chosen for this task. The chapter gives an overview of the most widely used topic modelling techniques and a few of their applications. It also summarizes a few current research trends and the generic processes of topic modelling. A case study then demonstrates how topic modelling can uncover current perceptions in literature on data analytics in e-commerce. The case study framework has five steps: data collection, data pre-processing, topic tuning, performance evaluation, and interpretation of the topic modelling results. The number of topics was tuned using latent Dirichlet allocation (LDA) as implemented in MALLET, accessed through its Gensim wrapper. The topics were evaluated with the Gensim topic coherence framework in Python, and the perceptions in the reviewed material were interpreted using the inter-topic distance map in pyLDAvis. The modelling revealed distinct perceptions, or directions of interest, in e-commerce and data analytics research. Researchers can use topic modelling to see which areas are receiving attention and which are not.
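To make the performance evaluation step more concrete, the following is a minimal sketch of a UMass-style topic coherence score in plain Python. It is an illustration under stated assumptions, not the chapter's implementation: the case study relies on the Gensim topic coherence framework, while the function name umass_coherence, the toy documents, and the two example word lists below are hypothetical.

```python
import math

def umass_coherence(top_words, documents):
    """Average UMass-style coherence of one topic's top words.

    top_words: list of words ordered from most to least probable in the topic.
    documents: list of tokenized documents (each a list of words).
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain all of the given words.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    total, pairs = 0.0, 0
    for i in range(1, len(top_words)):
        for j in range(i):
            d_j = doc_freq(top_words[j])
            if d_j == 0:
                continue  # skip words that never occur in the corpus
            co_occurrence = doc_freq(top_words[i], top_words[j])
            # Add-one smoothing keeps the logarithm defined when words never co-occur.
            total += math.log((co_occurrence + 1) / d_j)
            pairs += 1
    return total / pairs if pairs else 0.0

# Hypothetical toy corpus: two documents on reviews, two on sales forecasting.
documents = [
    ["customer", "review", "sentiment", "product"],
    ["customer", "review", "rating", "product"],
    ["sales", "forecast", "demand", "inventory"],
    ["sales", "forecast", "price", "demand"],
]

# Words that co-occur in the same documents versus words drawn from both themes.
coherent_score = umass_coherence(["customer", "review", "product"], documents)
mixed_score = umass_coherence(["customer", "forecast", "inventory"], documents)

print(f"coherent topic: {coherent_score:.3f}")
print(f"mixed topic:    {mixed_score:.3f}")
```

In a tuning loop, a score of this kind would be computed for models trained with different numbers of topics, and the topic count with the best average coherence would be a candidate for interpretation.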
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Afolabi, I.T., Uzor, C.N. (2022). Topic Modelling for Research Perception: Techniques, Processes and a Case Study. In: Al-Emran, M., Shaalan, K. (eds) Recent Innovations in Artificial Intelligence and Smart Applications. Studies in Computational Intelligence, vol 1061. Springer, Cham. https://doi.org/10.1007/978-3-031-14748-7_13
DOI: https://doi.org/10.1007/978-3-031-14748-7_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-14747-0
Online ISBN: 978-3-031-14748-7
eBook Packages: Intelligent Technologies and Robotics (R0)