Volume 119, Issue 2, pp 749–770

Generating a representative keyword subset pertaining to an academic conference series

  • Agniv Adhikari
  • Paramita Das
  • Abhik Mukherjee


The breadth and velocity of innovation have led to an explosion in the number of research documents. Academic conferences are held worldwide, most of them at regular intervals, and together they generate a huge volume of research documents. Extracting undiscovered knowledge from conference papers, and thereby finding the inter-relationships among conference research topics, is a challenging task. This paper attempts knowledge discovery for a conference using the keywords listed in the papers presented there. The proposed scheme tries to cover the entire set of conference research papers using a small subset of all available keywords. The correctness and complexity of the scheme are analyzed. Proof of concept is established using some flagship conferences held annually around the globe. The performance compares favourably with available text mining methods, as far as such a comparison is practicable. Results indicate that the scheme could be useful in characterizing the topical themes of academic conferences, which may benefit both participants and organizers.


Text mining · Knowledge discovery · Greedy algorithms · Computational complexity · Heuristic algorithms · Knowledge representation
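
The abstract does not spell out the scheme itself. One way to picture the goal of covering every paper with a small keyword subset is the classic greedy set-cover heuristic, sketched below. This is an illustration under that assumption, not the authors' algorithm; the function name `greedy_keyword_cover`, the alphabetical tie-breaking rule, and the sample data are all hypothetical.

```python
from typing import Dict, List, Set


def greedy_keyword_cover(paper_keywords: Dict[str, Set[str]]) -> List[str]:
    """Greedily pick keywords until every paper that has keywords is covered.

    Assumption: a paper counts as "covered" once at least one of its
    keywords has been selected (plain set cover, not the paper's scheme).
    """
    # Invert the mapping: keyword -> set of papers that list it.
    papers_by_keyword: Dict[str, Set[str]] = {}
    for paper, keywords in paper_keywords.items():
        for kw in keywords:
            papers_by_keyword.setdefault(kw, set()).add(paper)

    # Papers without any keyword can never be covered, so they are skipped.
    uncovered = {p for p, kws in paper_keywords.items() if kws}
    chosen: List[str] = []
    while uncovered:
        # Take the keyword that covers the most still-uncovered papers;
        # ties are broken alphabetically to keep the result deterministic.
        best = min(
            papers_by_keyword,
            key=lambda kw: (-len(papers_by_keyword[kw] & uncovered), kw),
        )
        chosen.append(best)
        uncovered -= papers_by_keyword[best]
    return chosen


# Hypothetical toy corpus: four papers and their author-supplied keywords.
papers = {
    "P1": {"text mining", "greedy algorithms"},
    "P2": {"text mining", "knowledge discovery"},
    "P3": {"knowledge representation", "knowledge discovery"},
    "P4": {"greedy algorithms"},
}
subset = greedy_keyword_cover(papers)
print(subset)
```

On this toy corpus two of the four distinct keywords suffice to touch every paper. Greedy set cover is a heuristic, so the subset it returns is not guaranteed to be the smallest possible one.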



Copyright information

© Akadémiai Kiadó, Budapest, Hungary 2019

Authors and Affiliations

  1. Computer Division, CSIR-Central Glass & Ceramic Research Institute, Jadavpur, Kolkata, India
  2. Department of Computer Science and Technology, Indian Institute of Engineering Science and Technology, Shibpur, India
