Using Generalization of Syntactic Parse Trees for Taxonomy Capture on the Web

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6828)


We implement a scalable mechanism for building a taxonomy of entities that improves the relevance of a search engine in a vertical domain. Taxonomy construction starts from seed entities and mines the web for new entities associated with them. To form these new entities, machine learning of syntactic parse trees (syntactic generalization) is applied to find commonalities between various search results for existing entities on the web. The taxonomy and syntactic generalization are applied to relevance improvement in search and to text similarity assessment in a commercial setting; evaluation results show a substantial contribution from both sources.
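The taxonomy-extension step described above can be illustrated with a toy sketch. It is not the authors' implementation: real syntactic generalization operates on parse trees, whereas this example substitutes plain word overlap between search-result snippets. The function names (`generalize`, `expand_taxonomy`), the stopword list, and the sample snippets are all hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical stopword list, kept tiny for the example.
STOPWORDS = {"a", "an", "the", "of", "for", "to", "in", "on", "and", "is"}

def generalize(snippet_a, snippet_b):
    """Toy stand-in for syntactic generalization: the words shared by
    two snippets, in the order they appear in snippet_a."""
    words_b = set(snippet_b.lower().split())
    return [w for w in snippet_a.lower().split() if w in words_b]

def expand_taxonomy(seed, snippets):
    """Propose child entities for `seed` from pairwise commonalities
    between search-result snippets retrieved for that seed."""
    seed_words = set(seed.lower().split())
    candidates = []
    for a, b in combinations(snippets, 2):
        for word in generalize(a, b):
            if (word not in seed_words
                    and word not in STOPWORDS
                    and word not in candidates):
                candidates.append(word)
    return {seed: candidates}

# Made-up snippets standing in for web search results for the seed entity.
snippets = [
    "how to file a tax deduction for business expenses",
    "tax deduction rules for medical expenses",
    "claim a tax deduction on business travel",
]
taxonomy = expand_taxonomy("tax deduction", snippets)
print(taxonomy)
```

In this simplification, a word becomes a candidate entity only if at least two independent snippets share it, which mirrors the idea of keeping what search results have in common. Each candidate could in turn serve as a new seed for the next mining iteration.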


Keywords: learning taxonomy, learning syntactic parse tree, syntactic generalization, search relevance





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  1. University of Girona, Girona, Spain
  2. Higher School of Economics, Moscow, Russia
