Domain Knowledge Driven Key Term Extraction for IT Services

  • Prateeti Mohapatra (email author)
  • Yu Deng
  • Abhirut Gupta
  • Gargi Dasgupta
  • Amit Paradkar
  • Ruchi Mahindru
  • Daniela Rosu
  • Shu Tao
  • Pooja Aggarwal
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11236)


IT service support agents are trained on knowledge sources with large volumes of domain-specific documents, including product manuals and troubleshooting content. Self-assist applications, such as search and support chat-bots, must integrate such knowledge in order to conduct effective user interactions. In particular, the very large volume of domain-specific terms referenced in training documents must be accurately identified and qualified for relevance to the specific context of support actions. We propose a weakly-supervised approach for extracting key terms from IT support documents. The approach integrates domain knowledge to refine the extraction results. It obviates the need for the extensive expert work of manual annotation and dictionary collection typically required by traditional supervised solutions, and it avoids the limited accuracy of unsupervised methods. Results show that domain knowledge based refinement helps improve the overall accuracy of mined key terms by 25–30%.


Key term extraction, Domain knowledge, IT support



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Prateeti Mohapatra (1), email author
  • Yu Deng (2)
  • Abhirut Gupta (1)
  • Gargi Dasgupta (1)
  • Amit Paradkar (2)
  • Ruchi Mahindru (2)
  • Daniela Rosu (2)
  • Shu Tao (2)
  • Pooja Aggarwal (1)

  1. IBM Research, Bangalore, India
  2. IBM Research, Yorktown Heights, USA
