On Deriving Data Summarization through Ontologies to Meet User Preferences

  • Troels Andreasen
  • Henrik Bulskov
Part of the Studies in Computational Intelligence book series (SCI, volume 223)


A summary is a comprehensive description that grasps the essence of a subject. A text, a collection of text documents, or a query answer can be summarized by simple means, such as an automatically generated list of the most frequent words, or by "advanced" means, such as a meaningful textual description of the subject. Between these two extremes are summaries built from selected key concepts, chosen by exploiting background knowledge. We address in this paper an approach where conceptual summaries are provided through a conceptualization as given by an ontology. The idea is to restrict a background ontology to the set of concepts that appear in the text to be summarized and thereby provide a structure, a so-called instantiated ontology, that is specific to the domain of the text and can be used to condense the text into a summary that covers its subject not only quantitatively but also conceptually. In this chapter we introduce different approaches to summarization. We consider a strictly ontology-based approach where summaries are derived solely from the instantiated ontology, a conceptual clustering over the instantiated concepts based on a semantic similarity measure, and an approach based on probabilities.
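The restriction step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy ontology, the function names (`instantiate`, `coverage`, `summarize`) and the `min_cover` threshold are all assumptions made for illustration. The sketch restricts a small is-a hierarchy to the concepts found in a text together with their ancestors, then picks the most specific concepts that each subsume at least `min_cover` of the text's concepts.

```python
from collections import Counter

# Hypothetical toy background ontology: concept -> set of direct parents (is-a).
BACKGROUND = {
    "poodle": {"dog"},
    "dog": {"animal"},
    "cat": {"animal"},
    "animal": {"entity"},
    "oak": {"tree"},
    "tree": {"plant"},
    "plant": {"entity"},
    "entity": set(),
}


def ancestors(ontology, concept):
    """Return the concept itself plus everything reachable upward via is-a."""
    seen, stack = set(), [concept]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(ontology[c])
    return seen


def instantiate(ontology, concepts):
    """Restrict the ontology to the text's concepts and their ancestors."""
    keep = set()
    for c in concepts:
        if c in ontology:  # ignore words the ontology does not know
            keep |= ancestors(ontology, c)
    return {c: set(ontology[c]) for c in keep}


def coverage(inst, concepts):
    """Count, for each instantiated concept, how many text concepts it subsumes."""
    counts = Counter()
    for c in set(concepts):
        if c in inst:
            counts.update(ancestors(inst, c))
    return counts


def summarize(inst, concepts, min_cover):
    """Most specific concepts that each subsume at least min_cover text concepts."""
    counts = coverage(inst, concepts)
    candidates = {c for c, n in counts.items() if n >= min_cover}
    # Discard candidates that are proper ancestors of another candidate.
    too_general = set()
    for c in candidates:
        too_general |= ancestors(inst, c) - {c}
    return candidates - too_general


text_concepts = ["poodle", "cat", "dog"]
inst = instantiate(BACKGROUND, text_concepts)
summary = summarize(inst, text_concepts, min_cover=3)
```

In this toy case the plant branch of the background ontology is dropped entirely, and `animal` is the most specific concept that subsumes all three concepts of the text. The clustering and probability-based approaches mentioned in the abstract are not covered by this sketch.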


Keywords: Short Path Length; Query Answer; Semantic Similarity Measure; Connectivity Cluster; Conceptual Query
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Troels Andreasen
    • 1
  • Henrik Bulskov
    • 1
  1. Roskilde University, Roskilde, Denmark
