An Online Service for Topics and Trends Analysis in Medical Literature

  • Spyridon Kavvadias
  • George Drosatos
  • Eleni Kaldoudi
Conference paper
Part of the IFMBE Proceedings book series (IFMBE, volume 68/1)

Abstract

Topic modeling refers to a suite of probabilistic algorithms that extract word patterns from a collection of documents, with the aim of clustering data and detecting research trends. We developed an online service that implements several variations of the Latent Dirichlet Allocation (LDA) algorithm. Scientific literature originating from targeted search queries in PubMed serves as input, and output files are available for every step of the process. Researchers can compare the results of different corpora, text preprocessing options, and topic modeling parameters in a quick and organized way. Information about the topics helps users assign labels and group them into categories. Data visualization is a contribution of our service: graphs generated on the fly provide information about the corpora, the topics, groups of topics, and categories. We rely on modern technologies and follow the principles of agile software development to achieve scalability and a discreet design.
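
To make the pipeline described above more concrete, the following is a minimal sketch of text preprocessing followed by LDA fitted with collapsed Gibbs sampling. It is not the service's implementation: the stopword list, the function names (preprocess, fit_lda, top_words), the hyperparameter values, and the toy abstracts are all assumptions chosen for illustration, and the abstracts are assumed to have already been retrieved from PubMed.

```python
import random
import re

# Illustrative stopword list (an assumption; a real service would use a fuller one).
STOPWORDS = {"the", "of", "in", "and", "a", "for", "with", "to", "is", "on"}


def preprocess(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling on tokenized documents.

    Returns (doc_topic, topic_word, vocab), where doc_topic[d][k] and
    topic_word[k][w] are smoothed probability estimates.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic_counts = [[0] * num_topics for _ in docs]
    topic_word_counts = [[0] * vocab_size for _ in range(num_topics)]
    topic_totals = [0] * num_topics

    # Random initial topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        doc_assignments = []
        for w in doc:
            k = rng.randrange(num_topics)
            doc_assignments.append(k)
            doc_topic_counts[d][k] += 1
            topic_word_counts[k][word_id[w]] += 1
            topic_totals[k] += 1
        assignments.append(doc_assignments)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic_counts[d][k] -= 1
                topic_word_counts[k][wid] -= 1
                topic_totals[k] -= 1
                # Resample the topic from its conditional distribution.
                weights = [
                    (doc_topic_counts[d][t] + alpha)
                    * (topic_word_counts[t][wid] + beta)
                    / (topic_totals[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic_counts[d][k] += 1
                topic_word_counts[k][wid] += 1
                topic_totals[k] += 1

    doc_topic = [
        [(doc_topic_counts[d][t] + alpha) / (len(doc) + num_topics * alpha)
         for t in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    topic_word = [
        [(topic_word_counts[t][w] + beta) / (topic_totals[t] + vocab_size * beta)
         for w in range(vocab_size)]
        for t in range(num_topics)
    ]
    return doc_topic, topic_word, vocab


def top_words(topic_word, vocab, n=5):
    """Return the n most probable words of each topic."""
    return [
        [vocab[i] for i in sorted(range(len(vocab)), key=lambda i: -row[i])[:n]]
        for row in topic_word
    ]


# Toy stand-ins for abstracts returned by a PubMed query (invented for illustration).
abstracts = [
    "Hypertension and diabetes increase the risk of chronic kidney disease.",
    "Blood pressure control in patients with diabetes and kidney disease.",
    "Topic models cluster documents and reveal trends in literature.",
    "Latent topics in literature support clustering and trend detection.",
]
docs = [preprocess(a) for a in abstracts]
doc_topic, topic_word, vocab = fit_lda(docs, num_topics=2)
labels_hint = top_words(topic_word, vocab, n=3)
```

Here labels_hint holds the top words of each topic, the kind of information a user might consult when assigning labels, and doc_topic gives each document's topic proportions, which could be averaged per publication year to look for trends. On a corpus this small the topics themselves are not meaningful.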

Keywords

Topic modeling, Content analysis, Trend analysis, Visualization

Notes

Acknowledgements

This work was supported by the FP7-ICT project CARRE (Grant No. 611140), funded in part by the European Commission and Greek National Matching funds (DUTH KE81442).

Conflict of Interest

The authors declare that they have no conflict of interest.

Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  1. School of Medicine, Democritus University of Thrace, Alexandroupoli, Greece
