Parse Thicket Representation for Multi-sentence Search

  • Boris A. Galitsky
  • Sergei O. Kuznetsov
  • Daniel Usikov
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7735)


We develop a graph representation and learning technique for parse structures of sentences and paragraphs of text. The technique is used to improve relevance when answering complex questions whose answers are spread over multiple sentences. We introduce the Parse Thicket as a sum of syntactic parse trees augmented by a number of arcs for inter-sentence word-word relations such as coreference and taxonomic relations. These arcs are also derived from other sources, including Rhetorical Structure Theory, and we introduce corresponding indexing rules that identify inter-sentence relations and join phrases connected by these relations in the search index. Generalization of syntactic parse trees (as a similarity measure between sentences) is defined as a set of maximum common sub-trees for two parse trees. Generalization of a pair of parse thickets, which measures the relevance of a question and an answer distributed over multiple sentences, is defined as a set of maximal common sub-parse thickets. The proposed approach is evaluated in the product search domain, where user queries include product names, features and expressions for user needs, and the query keywords occur in different sentences of the text. We demonstrate that search relevance is improved by single sentence-level generalization, and further increased by parse thicket generalization.
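To make the two central notions concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a toy tree encoding (`Node`), a hypothetical `ParseThicket` container that stores parse trees together with labelled inter-sentence arcs, and a simplified `generalize` function that returns only the common top-down subtree of two nodes with equal labels. The full set of maximum common sub-trees and the generalization of whole thickets are not covered here; all names are illustrative.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """Parse-tree node: a constituent label (e.g. 'NP') or a POS tag with a word."""
    label: str
    word: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


@dataclass
class ParseThicket:
    """Parse trees of several sentences plus inter-sentence arcs (hypothetical layout)."""
    trees: List[Node]
    # Each arc: (relation, (sentence index, word), (sentence index, word))
    arcs: List[Tuple[str, Tuple[int, str], Tuple[int, str]]] = field(default_factory=list)


def size(n: Node) -> int:
    """Number of nodes in the subtree rooted at n."""
    return 1 + sum(size(c) for c in n.children)


def align(xs: List[Node], ys: List[Node]) -> List[Node]:
    """Order-preserving alignment of two child lists that maximizes shared size."""
    # best[i][j] = (shared size, generalized children) for xs[i:] and ys[j:]
    best = [[(0, [])] * (len(ys) + 1) for _ in range(len(xs) + 1)]
    for i in range(len(xs) - 1, -1, -1):
        for j in range(len(ys) - 1, -1, -1):
            cand = max(best[i + 1][j], best[i][j + 1], key=lambda t: t[0])
            g = generalize(xs[i], ys[j])
            if g is not None:
                s, rest = best[i + 1][j + 1]
                if s + size(g) > cand[0]:
                    cand = (s + size(g), [g] + rest)
            best[i][j] = cand
    return best[0][0][1]


def generalize(a: Node, b: Node) -> Optional[Node]:
    """Common top-down subtree of a and b, or None if their labels differ.

    A word is kept only when both nodes carry the same word; otherwise the
    node is kept with word=None, which acts as a wildcard.
    """
    if a.label != b.label:
        return None
    word = a.word if a.word == b.word else None
    return Node(a.label, word, align(a.children, b.children))


def show(n: Node) -> str:
    """Bracketed rendering, e.g. '(NP NN:camera (PP IN:with (NP NN)))'."""
    head = n.label + (":" + n.word if n.word else "")
    if not n.children:
        return head
    return "(" + head + " " + " ".join(show(c) for c in n.children) + ")"


def phrase(noun: str) -> Node:
    """Toy tree for 'camera with <noun>'."""
    return Node("NP", children=[
        Node("NN", "camera"),
        Node("PP", children=[Node("IN", "with"),
                             Node("NP", children=[Node("NN", noun)])]),
    ])


# A two-sentence thicket; the coreference arc is supplied by hand for illustration.
thicket = ParseThicket(
    trees=[phrase("zoom"), phrase("flash")],
    arcs=[("coreference", (0, "camera"), (1, "camera"))],
)
print(show(generalize(thicket.trees[0], thicket.trees[1])))
```

In this sketch the two phrases share their structure and the words "camera" and "with", while the differing nouns "zoom" and "flash" generalize to a bare `NN` node. The arcs are stored but not used by `generalize`; extending the generalization to follow them across sentences is what distinguishes the thicket-level operation from the tree-level one.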


learning taxonomy, learning syntactic parse tree, syntactic generalization, search relevance





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Boris A. Galitsky (1)
  • Sergei O. Kuznetsov (2)
  • Daniel Usikov (3)

  1. eBay Inc, San Jose, USA
  2. Higher School of Economics, Moscow, Russia
  3. Dept. of Physics, University of Maryland, USA