Parse Thicket Representation for Multi-sentence Search
We develop a graph representation and learning technique for the parse structures of sentences and paragraphs of text. The technique is used to improve search relevance when answering complex questions whose answers are spread over multiple sentences. We introduce the parse thicket as a combination of syntactic parse trees augmented by arcs for inter-sentence word-word relations such as coreference and taxonomic relations. Such arcs are also derived from other sources, including Rhetorical Structure Theory, and we introduce the respective indexing rules, which identify inter-sentence relations and join the phrases connected by these relations in the search index. Generalization of syntactic parse trees, used as a similarity measure between sentences, is defined as the set of maximal common sub-trees of two parse trees. Generalization of a pair of parse thickets, used to measure the relevance of a question to an answer distributed over multiple sentences, is defined as the set of maximal common sub-parse thickets. The proposed approach is evaluated in the product search domain of eBay.com, where a user query includes product names, features, and expressions of user needs, and the query keywords occur in different sentences of the text. We demonstrate that search relevance is improved by single sentence-level generalization and further increased by parse thicket generalization.
Keywords: learning taxonomy, learning syntactic parse tree, syntactic generalization, search relevance
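The notion of generalization described in the abstract can be illustrated with a minimal sketch. It is a deliberate simplification and not the authors' implementation: instead of full parse trees it works on flat phrases of (part-of-speech, lemma) pairs, and it returns a single longest order-preserving common generalization rather than the full set of maximal common sub-trees. The names `generalize_words` and `generalize_phrases`, the `"*"` wildcard, and the example phrases are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical node type: (part-of-speech tag, lemma); "*" stands for "any lemma".
Node = Tuple[str, str]


def generalize_words(a: Node, b: Node) -> Optional[Node]:
    """Most specific node covering both words, or None if their POS tags differ."""
    if a[0] != b[0]:
        return None
    return (a[0], a[1] if a[1] == b[1] else "*")


def generalize_phrases(p: List[Node], q: List[Node]) -> List[Node]:
    """Longest order-preserving common generalization of two flat phrases.

    Simplification: a longest-common-subsequence alignment over POS tags,
    keeping the lemma only where both phrases agree on it.
    """
    n, m = len(p), len(q)
    # best[i][j] = length of the longest POS alignment of p[i:] and q[j:]
    best = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if p[i][0] == q[j][0]:
                best[i][j] = 1 + best[i + 1][j + 1]
            else:
                best[i][j] = max(best[i + 1][j], best[i][j + 1])

    # Walk the table from the start to recover one optimal alignment.
    result: List[Node] = []
    i = j = 0
    while i < n and j < m:
        if p[i][0] == q[j][0]:
            result.append(generalize_words(p[i], q[j]))
            i += 1
            j += 1
        elif best[i + 1][j] >= best[i][j + 1]:
            i += 1
        else:
            j += 1
    return result


# Illustrative phrases (not taken from the paper's data).
question = [("JJ", "digital"), ("NN", "camera"), ("IN", "with"), ("NN", "zoom")]
answer = [("NN", "camera"), ("IN", "for"), ("NN", "travel")]
common = generalize_phrases(question, answer)
```

In this sketch the size of `common`, or the number of nodes that keep a concrete lemma, could serve as a rough similarity score between the two phrases. A parse thicket version would additionally follow inter-sentence arcs such as coreference, which this flat-phrase sketch does not model.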