Combining Data-Driven and Semantic Approaches for Text Mining

  • Chapter
Foundations for the Web of Information and Services

Abstract

While the amount of structured data published on the Web keeps growing (fostered in particular by the Linked Open Data initiative), the Web still consists mainly of unstructured, in particular textual, content and is therefore a Web for human consumption. An important question is thus which techniques, semantic or not, are most suitable for enabling people to access effectively the large body of unstructured information available on the Web. While the hope is that semantic technologies can be combined with standard Information Retrieval approaches to enable more accurate retrieval, some researchers have argued against this view. They claim that only data-driven or inductive approaches are applicable to tasks requiring the organization of unstructured (mainly textual) data for retrieval purposes. We argue that the dichotomy between data-driven/inductive and semantic approaches is a false one. We further argue that bottom-up or inductive approaches can be successfully combined with top-down or semantic approaches, and we illustrate this for a number of tasks such as Ontology Learning, Information Retrieval, Information Extraction and Text Mining.

Notes

  1. While it is true that fuzzy and non-monotonic extensions to description logics and OWL have been proposed, we take a purist view here and treat OWL as a non-fuzzy, monotonic logic.

  2. http://icame.uib.no/brown/bcm.html.

  3. http://www.cis.upenn.edu/~treebank/.

  4. http://km.aifb.kit.edu/projects/swsc.

  5. TFIDF is a widely used statistical weight that indicates how characteristic a term is for a document within a given corpus. For a specific term and document, the TFIDF value is the product of two factors: the term frequency (TF), which is the number of occurrences of the term in the given document, and the inverse document frequency (IDF), which is the inverse of the number of documents in the corpus that contain the term.

  6. Further extensions such as those by Bloehdorn and Moschitti [5] combine this idea with more complex so-called tree kernel functions for text structure.

  7. http://www.opencalais.com.
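The TFIDF definition in Note 5 can be illustrated with a minimal sketch. It takes the note literally, so IDF is simply 1 divided by the number of documents containing the term; many practical systems use a logarithmic variant such as log(N / df) instead. The function name `tf_idf` and the toy corpus are hypothetical and serve only as an illustration.

```python
from collections import Counter


def tf_idf(term, doc_tokens, corpus):
    """TF * IDF, with IDF taken literally as 1 / document frequency.

    doc_tokens: list of tokens of one document.
    corpus: list of token lists, one per document (assumed toy data).
    """
    # Term frequency: raw count of the term in the given document.
    tf = Counter(doc_tokens)[term]
    # Document frequency: number of documents that contain the term.
    df = sum(1 for tokens in corpus if term in tokens)
    if df == 0:
        # The term occurs nowhere in the corpus, so it carries no weight.
        return 0.0
    return tf * (1.0 / df)


# Hypothetical three-document corpus, already tokenized.
corpus = [
    ["semantic", "web", "ontology", "web"],
    ["text", "mining", "web"],
    ["ontology", "learning", "text"],
]
doc = corpus[0]
scores = {term: tf_idf(term, doc, corpus) for term in set(doc)}
```

In this toy corpus, "web" occurs twice in the first document but also appears in a second document, so it ends up with the same weight as "semantic", which occurs once but only in this document.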

References

  1. Agrawal, R., Imielinski, T., Swami, A.N.: Mining association rules between sets of items in large databases. In: Proceedings of SIGMOD Conference, pp. 207–216 (1993)

  2. Basili, R., Moschitti, A., Pazienza, M.T., Zanzotto, F.M.: A contrastive approach to term extraction. In: Proceedings of the 4th Terminology and Artificial Intelligence Conference (TIA), May, pp. 119–128 (2001)

  3. Berners-Lee, T., Hendler, J., Lassila, O.: The semantic web. Scientific American (May Issue) (2001)

  4. Bloehdorn, S., Hotho, A.: Text classification by boosting weak learners based on terms and concepts. In: Proceedings of the 4th IEEE International Conference on Data Mining (ICDM)

  5. Bloehdorn, S., Moschitti, A.: Combined syntactic and semantic kernels for text classification. In: Amati, G., Carpineto, C., Romano, G. (eds.) Proceedings of the 29th European Conference on Information Retrieval (ECIR), Rome, Italy, pp. 307–318. Springer, Berlin (2007)

  6. Bloehdorn, S., Basili, R., Cammisa, M., Moschitti, A.: Semantic kernels for text classification based on topological measures of feature similarity. In: Proceedings of the 6th IEEE International Conference on Data Mining (ICDM), Hong Kong, China. IEEE Comput. Soc., Los Alamitos (2006)

  7. Bloehdorn, S., Cimiano, P., Hotho, A.: Learning ontologies to improve text clustering and classification. In: Spiliopoulou, M., Kruse, R., Nürnberger, A., Borgelt, C., Gaul, W. (eds.) Proceedings of the 29th Annual Conference of the German Classification Society (GfKl), Magdeburg, Germany, 2005, pp. 334–341. Springer, Berlin (2006)

  8. Bloehdorn, S., Cimiano, P., Duke, A., Haase, P., Heizmann, J., Thurlow, I., Völker, J.: Ontology-based question answering for digital libraries. In: Proceedings of the 11th European Conference on Research and Advanced Technologies for Digital Libraries (ECDL), September 2007. Lecture Notes in Computer Science, vol. 4675. Springer, Berlin (2007). ISBN 978-3-540-74850-2

  9. Blohm, S., Cimiano, P.: Using the web to reduce data sparseness in pattern-based information extraction. In: Proceedings of the 11th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD), Warsaw, Poland, pp. 18–29. Springer, Berlin (2007)

  10. Blohm, S., Cimiano, P., Stemle, E.: Harvesting relations from the web—quantifying the impact of filtering functions. In: Proceedings of the 22nd Conference on Artificial Intelligence (AAAI), pp. 1316–1323. AAAI Press, Menlo Park (2007)

  11. Blohm, S., Buza, K., Cimiano, P., Schmidt-Thieme, L.: Relation extraction for the semantic web with taxonomic sequential patterns. In: Sugumaran, V., Gulla, J.A. (eds.) Applied Semantic Web Technologies. Taylor & Francis, London (2011, to appear)

  12. Bonino, D., Corno, F.: Self-similarity metric for index pruning in conceptual vector space models. In: DEXA Workshops, pp. 225–229. IEEE Comput. Soc., Los Alamitos (2008)

  13. Brants, T., Popat, A., Xu, P., Och, F.J., Dean, J.: Large language models in machine translation. In: Proceedings of the 2007 Conference on Empirical Methods in Natural Language Processing (EMNLP) (2007)

  14. Brewster, C., Ciravegna, F., Wilks, Y.: Background and foreground knowledge in dynamic ontology construction. In: Proceedings of the SIGIR Semantic Web Workshop (2003)

  15. Brin, S.: Extracting patterns and relations from the world wide web. In: Selected Papers from the International Workshop on the World Wide Web and Databases (WebDB), London, UK, pp. 172–183. Springer, Berlin (1999). ISBN 3-540-65890-4

  16. Brunzel, M.: The XTREEM methods for ontology learning from web documents. In: Buitelaar, P., Cimiano, P. (eds.) Ontology Learning and Population: Bridging the Gap Between Text and Knowledge, January. Frontiers in Artificial Intelligence and Applications, vol. 167, pp. 3–26. IOS Press, Amsterdam (2008)

  17. Buitelaar, P., Cimiano, P., Magnini, B.: Ontology Learning from Text: Methods, Evaluation and Applications, July. Frontiers in Artificial Intelligence, vol. 123. IOS Press, Amsterdam (2005)

  18. Chodorow, M., Byrd, R.J., Heidorn, G.E.: Extracting semantic hierarchies from a large on-line dictionary. In: Proceedings of the 23rd Annual Meeting on Association for Computational Linguistics (ACL), pp. 299–304. Association for Computational Linguistics, Stroudsburg (1985)

  19. Cimiano, P.: Ontology Learning and Population from Text: Algorithms, Evaluation and Applications. Springer, Berlin (2006). ISBN 978-0-387-30632-2

  20. Cimiano, P.: Ontology learning and population from text. PhD thesis, Universität Karlsruhe (TH), Germany (2006)

  21. Cimiano, P., Völker, J.: Text2Onto—a framework for ontology learning and data-driven change discovery. In: Montoyo, A., Munoz, R., Metais, E. (eds.) Proceedings of the 10th International Conference on Applications of Natural Language to Information Systems (NLDB), Alicante, Spain, June. Lecture Notes in Computer Science, vol. 3513, pp. 227–238. Springer, Berlin (2005)

  22. Cimiano, P., Wenderoth, J.: Automatic acquisition of ranked qualia structures from the web. In: Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), June, pp. 888–895 (2007)

  23. Cimiano, P., Handschuh, S., Staab, S.: Towards the self-annotating web. In: Proceedings of the 13th International World Wide Web Conference (WWW), May, pp. 462–471. ACM, New York (2004). ISBN 1-58113-844-X

  24. Cimiano, P., Hotho, A., Staab, S.: Comparing conceptual, divisive and agglomerative clustering for learning taxonomies from text. In: de Mántaras, R.L., Saitta, L. (eds.) Proceedings of the 16th European Conference on Artificial Intelligence (ECAI), Valencia, Spain, pp. 435–439. IOS Press, Amsterdam (2004). ISBN 1-58603-452-9

  25. Cimiano, P., Hotho, A., Staab, S.: Learning concept hierarchies from text corpora using formal concept analysis. Journal of Artificial Intelligence Research 24, 305–339 (2005)

  26. Cimiano, P., Ladwig, G., Staab, S.: Gimme the context: context-driven automatic semantic annotation with C-PANKOW. In: Ellis, A., Hagino, T. (eds.) Proceedings of the 14th International World Wide Web Conference (WWW), Chiba, Japan, May, pp. 332–341. ACM, New York (2005)

  27. Cimiano, P., Pivk, A., Schmidt-Thieme, L., Staab, S.: Learning taxonomic relations from heterogeneous sources of evidence. In: Buitelaar, P., Cimiano, P., Magnini, B. (eds.) Ontology Learning from Text: Methods, Evaluation and Applications, July. Frontiers in Artificial Intelligence, vol. 123, pp. 59–73. IOS Press, Amsterdam (2005)

  28. Cimiano, P., Schultz, A., Sizov, S., Sorg, P., Staab, S.: Explicit versus latent concept models for cross-language information retrieval. In: Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), pp. 1513–1518 (2009)

  29. Deerwester, S.C., Dumais, S.T., Landauer, T.K., Furnas, G.W., Harshman, R.A.: Indexing by latent semantic analysis. Journal of the American Society of Information Science 41(6), 391–407 (1990)

  30. Drouin, P.: Detection of domain specific terminology using corpora comparison. In: Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC), pp. 79–82. European Language Resources Association, Paris (2004)

  31. Drumm, C., Schmitt, M., Do, H.H., Rahm, E.: Quickmig: automatic schema matching for data migration projects. In: CIKM, pp. 107–116 (2007)

  32. Dumais, S., Letsche, T., Littman, M., Landauer, T.: Automatic cross-language retrieval using latent semantic indexing. In: Proceedings of the AAAI Symposium on Cross-Language Text and Speech Retrieval (1997)

  33. Ehrig, M.: Ontology Alignment: Bridging the Semantic Gap. Semantic Web and Beyond: Computing for Human Experience, vol. 4. Springer, Berlin (2007). ISBN 978-0-387-36501-5

  34. Evans, R.: A framework for named entity recognition in the open domain. In: Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP), pp. 137–144 (2003)

  35. Feldman, R., Dagan, I.: Knowledge discovery in texts (KDT). In: Fayyad, U.M., Uthurusamy, R. (eds.) Proceedings of the First International Conference on Knowledge Discovery (KDD 1995), Montreal, Quebec, Canada, August 20–21, pp. 112–117. AAAI Press, Menlo Park (1995)

  36. Fellbaum, C.: WordNet. An Electronic Lexical Database. MIT Press, Cambridge (1998)

  37. Firth, J.R.: A Synopsis of Linguistic Theory, 1930–1955. Studies in Linguistic Analysis, pp. 1–32 (1957)

  38. Gabrilovich, E., Markovitch, S.: Computing semantic relatedness using Wikipedia-based explicit semantic analysis. In: Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), pp. 1606–1611 (2007)

  39. Gärdenfors, P.: Conceptual Spaces: The Geometry of Thought. MIT Press, London (2000)

  40. Giesbrecht, E.: In search of semantic compositionality in vector spaces. In: ICCS, pp. 173–184 (2009)

  41. Gonzalo, J., Verdejo, F., Chugur, I., Cigarran, J.: Indexing with WordNet synsets can improve text retrieval. In: Proceedings of the COLING/ACL ’98 Workshop on Usage of WordNet for NLP, Montreal, Canada, pp. 38–44 (1998)

  42. Guthrie, L., Slator, B.M., Wilks, Y., Bruce, R.: Is there content in empty heads? In: Proceedings of the 13th Conference on Computational Linguistics (COLING), Morristown, NJ, USA, pp. 138–143. Association for Computational Linguistics, Stroudsburg (1990). ISBN 952-90-2028-7

  43. Haase, P., Völker, J.: Ontology learning and reasoning—dealing with uncertainty and inconsistency. In: da Costa, P.C.G., d’Amato, C., Fanizzi, N., Laskey, K.B., Laskey, K.J., Lukasiewicz, T., Nickles, M., Pool, M. (eds.) Uncertainty Reasoning for the Semantic Web I. Lecture Notes in Artificial Intelligence, vol. 5327. Springer, Berlin (2008). ISBN 978-3-540-89764-4. ISWC International Workshop, URSW 2005–2007. Revised Selected and Invited Papers

  44. Haase, P., Schnizler, B., Broekstra, J., Ehrig, M., Harmelen, F., Mika, M., Plechawski, M., Pyszlak, P., Siebes, R., Staab, S., Tempich, C.: Bibster—a semantics-based bibliographic peer-to-peer system. Journal of Web Semantics 2(1), 99–103 (2005)

  45. Haase, P., Stojanovic, N., Sure, Y., Völker, J.: Personalized information retrieval in bibster, a semantics-based bibliographic peer-to-peer system. In: Tochtermann, K., Maurer, H. (eds.) Proceedings of the 5th International Conference on Knowledge Management (I-KNOW), July, pp. 104–111 (2005). JUCS, July

  46. Halevy, A.Y., Norvig, P., Pereira, F.: The unreasonable effectiveness of data. IEEE Intelligent Systems 24(2), 8–12 (2009)

  47. Hall, J., Nilsson, J., Nivre, J., Megyesi, B., Nilsson, M., Saers, M.: Single malt or blended? A study in multilingual parser optimization. In: Proc. of the Joint Conf. on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL) (2007)

  48. Harris, Z.: Linguistic transformations for information retrieval. In: Proceedings of the International Conference on Scientific Information, vol. 2, Washington, DC (1959)

  49. Hearst, M.A.: Automatic acquisition of hyponyms from large text corpora. In: Proceedings of the 14th Conference on Computational Linguistics, vol. 2. Association for Computational Linguistics, Stroudsburg (1992)

  50. Hotho, A.: Clustern Mit Hintergrundwissen. Dissertationen zur Künstlichen Intelligenz, vol. 286. Akademische Verlagsgesellschaft, Berlin (2004). In German. Originally published as PhD thesis, Universität Karlsruhe (TH), Karlsruhe, Germany (2004)

  51. Hotho, A., Staab, S., Stumme, G.: Explaining text clustering results using semantic structures. In: Principles of Data Mining and Knowledge Discovery, 7th European Conference, PKDD 2003, Dubrovnik, Croatia, September 22–26, 2003. Lecture Notes in Computer Science, pp. 217–228. Springer, Berlin (2003)

  52. Hotho, A., Staab, S., Stumme, G.: Ontologies improve text document clustering. In: Proc. of the ICDM 03, The 2003 IEEE International Conference on Data Mining, pp. 541–544 (2003)

  53. Hotho, A., Nürnberger, A., Paaß, G.: A brief survey of text mining. LDV Forum—GLDV Journal for Computational Linguistics and Language Technology 20(1), 19–62 (2005). ISSN 0175-1336

  54. Jaimes, A., Smith, J.R.: Semi-automatic, data-driven construction of multimedia ontologies. In: Proceedings of the International Conference on Multimedia and Expo (ICME), Washington, DC, USA, pp. 781–784. IEEE Comput. Soc., Los Alamitos (2003). ISBN 0-7803-7965-9

  55. Jain, A.K., Dubes, R.C.: Algorithms for Clustering Data. Prentice-Hall, Upper Saddle River (1988)

  56. Jain, A.K., Murty, M.N., Flynn, P.J.: Data clustering: a review. ACM Computing Surveys 31(3), 264–323 (1999)

  57. Jäschke, R., Hotho, A., Schmitz, C., Ganter, B., Stumme, G.: Discovering shared conceptualizations in folksonomies. Journal of Web Semantics 6(1), 38–53 (2008). ISSN 1570-8268

  58. Kashyap, V., Ramakrishnan, C., Thomas, C., Sheth, A.: TaxaMiner: an experimentation framework for automated taxonomy bootstrapping. International Journal of Web and Grid Services 1(2), 240–266 (2005). ISSN 1741-1106

  59. Katz, S.M., Gauvain, J.L., Lamel, L.F., Adda, G., Mariani, J.: Estimation of probabilities from sparse data for the language model component of a speech recognizer. International Journal of Pattern Recognition and Artificial Intelligence 8 (1987)

  60. Kavalec, M., Svátek, V.: A study on automated relation labelling in ontology learning. In: Buitelaar, P., Cimiano, P., Magnini, B. (eds.) Ontology Learning from Text: Methods, Evaluation and Applications. Frontiers in Artificial Intelligence and Applications, vol. 123, pp. 44–58. IOS Press, Amsterdam (2005)

  61. Landauer, T.K., Dumais, S.T.: A solution to Plato’s problem: the latent semantic analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review 104, 211–240 (1997)

  62. Li, M., Du, X.-y., Wang, S.: Learning ontology from relational database. In: Proceedings of the 4th International Conference on Machine Learning and Cybernetics, pp. 3410–3415 (2005)

  63. Lund, K., Burgess, C.: Producing high-dimensional semantic spaces from lexical co-occurrence. Behavior Research Methods, Instrumentation, and Computers, 203–220 (1996)

  64. Mädche, A.: Ontology learning for the semantic web. PhD thesis, Universität Karlsruhe (TH), Germany (2001)

  65. Mädche, A., Staab, S.: Discovering conceptual relations from text. In: Horn, W. (ed.) Proceedings of the 14th European Conference on Artificial Intelligence (ECAI), August, pp. 321–325. IOS Press, Amsterdam (2000)

  66. Mädche, A., Volz, R.: The text-to-onto ontology extraction and maintenance system. In: Workshop on Integrating Data Mining and Knowledge Management at the 1st International Conference on Data Mining (ICDM) (2001)

  67. Meilicke, C., Völker, J., Stuckenschmidt, H.: Debugging mappings between lightweight ontologies. In: Proceedings of the 16th International Conference on Knowledge Engineering and Knowledge Management (EKAW), September. Lecture Notes in Artificial Intelligence, pp. 93–108. Springer, Berlin (2008). Best Paper Award!

  68. Miller, G.A.: WordNet: a lexical database for English. Communications of the ACM 38(11), 39–41 (1995)

  69. Moench, E., Ullrich, M., Schnurr, H.-P., Angele, J.: Semanticminer—ontology-based knowledge retrieval. Journal of Universal Computer Science 9(7), 682–696 (2003)

  70. Müller, C., Gurevych, I.: Using Wikipedia and Wiktionary in domain-specific information retrieval. In: Working Notes of the Annual CLEF Meeting (2008)

  71. Newbold, N., Vrusias, B., Gillam, L.: Lexical ontology extraction using terminology analysis: automating video annotation. In: Calzolari, N., Choukri, K., Maegaard, B., Mariani, J., Odjik, J., Piperidis, S., Tapias, D. (eds.) Proceedings of the 6th International Language Resources and Evaluation (LREC), Marrakech, Morocco, May. ELRA, Paris (2008)

  72. Ogata, N., Collier, N.: Ontology express: statistical and non-monotonic learning of domain ontologies from text. In: Proceedings of the Workshop on Ontology Learning and Population (OLP) at the 16th European Conference on Artificial Intelligence (ECAI), August (2004)

  73. Papka, R., Allan, J.: On-line new event detection using single pass clustering. Technical report, University of Massachusetts, Amherst, MA, USA (1998)

  74. Riloff, E., Jones, R.: Learning dictionaries for information extraction by multi-level bootstrapping. In: AAAI ’99/IAAI ’99: Proceedings of the Sixteenth National Conference on Artificial Intelligence and the Eleventh Innovative Applications of Artificial Intelligence Conference Innovative Applications of Artificial Intelligence, pp. 474–479. American Association for Artificial Intelligence, Menlo Park (1999). ISBN 0-262-51106-1

  75. Sabou, M.: Building web service ontologies. PhD thesis, Vrije Universiteit Amsterdam, The Netherlands (2006)

  76. Sahlgren, M.: An introduction to random indexing. In: Methods and Applications of Semantic Indexing Workshop at the 7th International Conference on Terminology and Knowledge Engineering (TKE) (2005)

  77. Salton, G., Buckley, C.: Term-weighting approaches in automatic text retrieval. Information Processing and Management 24(5), 513–523 (1988)

  78. Salton, G., McGill, M.J.: Introduction to Modern Information Retrieval. McGraw-Hill, New York (1983)

  79. Sanchez, D.: Domain ontology learning from the web. PhD thesis, Universitat Politècnica de Catalunya, Spain (2007)

  80. Schmitz, C., Hotho, A., Jäschke, R., Stumme, G.: Mining association rules in folksonomies. In: Batagelj, V., Bock, H.-H., Ferligoj, A., Ziberna, A. (eds.) Data Science and Classification (Proc. IFCS 2006 Conference), Ljubljana, July. Studies in Classification, Data Analysis, and Knowledge Organization, pp. 261–270. Springer, Berlin (2006). ISBN 978-3-540-34415-5. doi:10.1007/3-540-34416-0_28

  81. Schütze, H.: Word space. In: Hanson, S., Cowan, J., Giles, C. (eds.) Advances in Neural Information Processing Systems 5. Morgan Kaufmann, San Mateo (1993)

  82. Sebastiani, F.: Machine learning in automated text categorization. ACM Computing Surveys 34(1), 1–47 (2002)

  83. Simperl, E., Tempich, C., Vrandečić, D.: A methodology for ontology learning. In: Buitelaar, P., Cimiano, P. (eds.) Ontology Learning and Population: Bridging the Gap Between Text and Knowledge, January. Frontiers in Artificial Intelligence and Applications, vol. 167, pp. 225–249. IOS Press, Amsterdam (2008)

  84. Sorg, P., Cimiano, P.: Cross-lingual information retrieval with explicit semantic analysis. In: Working Notes of the Annual CLEF Meeting (2008)

  85. Sorg, P., Cimiano, P.: An experimental comparison of explicit semantic analysis implementations for cross-language retrieval. In: Proceedings of 14th International Conference on Applications of Natural Language to Information Systems (NLDB), Saarbrücken (2009)

  86. Stojanovic, N.: On the role of the librarian agent in ontology-based knowledge management systems. Journal of Universal Computer Science 9(7), 697–718 (2003)

  87. Sure, Y., Hitzler, P., Eberhart, A., Studer, R.: The semantic web in one day. IEEE Intelligent Systems 20(3), 85–87 (2005). ISSN 1541-1672. doi:10.1109/MIS.2005.54

  88. Völker, J.: Learning expressive ontologies. PhD thesis, Universität Karlsruhe (TH), Germany (2008)

  89. Völker, J., Rudolph, S.: Lexico-logical acquisition of OWL DL axioms—an integrated approach to ontology refinement. In: Medina, R., Obiedkov, S. (eds.) Proceedings of the 6th International Conference on Formal Concept Analysis (ICFCA), February. Lecture Notes in Artificial Intelligence, vol. 4933, pp. 62–77. Springer, Berlin (2008)

  90. Völker, J., Rudolph, S.: Fostering web intelligence by semi-automatic OWL ontology refinement. In: Proceedings of the 7th International Conference on Web Intelligence (WI), December. IEEE Press, New York (2008). Regular paper

  91. Völker, J., Vrandečić, D., Sure, Y.: Automatic evaluation of ontologies (AEON). In: Gil, Y., Motta, E., Benjamins, V.R., Musen, M.A. (eds.) Proceedings of the 4th International Semantic Web Conference (ISWC), November. Lecture Notes in Computer Science, vol. 3729, pp. 716–731. Springer, Berlin (2005)

  92. Völker, J., Hitzler, P., Cimiano, P.: Acquisition of OWL DL axioms from lexical resources. In: Franconi, E., Kifer, M., May, W. (eds.) Proceedings of the 4th European Semantic Web Conference (ESWC), June. Lecture Notes in Computer Science, vol. 4519, pp. 670–685. Springer, Berlin (2007)

  93. Völker, J., Vrandečić, D., Sure, Y., Hotho, A.: Learning disjointness. In: Franconi, E., Kifer, M., May, W. (eds.) Proceedings of the 4th European Semantic Web Conference (ESWC), June. Lecture Notes in Computer Science, vol. 4519, pp. 175–189. Springer, Berlin (2007)

  94. Völker, J., Vrandečić, D., Sure, Y., Hotho, A.: AEON—an approach to the automatic evaluation of ontologies. Journal of Applied Ontology 3(1–2), 41–62 (2008). Special Issue on Ontological Foundations of Conceptual Modeling

  95. Widdows, D.: Semantic vector products: some initial investigations. In: Proceedings of the Second AAAI Symposium on Quantum Interaction (QI) (2008)

Corresponding author

Correspondence to Stephan Bloehdorn.

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Bloehdorn, S. et al. (2011). Combining Data-Driven and Semantic Approaches for Text Mining. In: Fensel, D. (ed.) Foundations for the Web of Information and Services. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19797-0_7

  • DOI: https://doi.org/10.1007/978-3-642-19797-0_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-19796-3

  • Online ISBN: 978-3-642-19797-0

  • eBook Packages: Computer Science, Computer Science (R0)
