Abstract
Topic models such as Latent Dirichlet Allocation (LDA) have been widely used for their robustness in estimating text models through mixtures of latent topics. Although LDA has mostly been used as a strictly lexicalized approach, it can be effectively applied to a much richer set of linguistic structures. We present a novel application of LDA that acquires suitable grammatical generalizations for semantic tasks that depend tightly on natural language syntax. We show how the resulting topics represent suitable generalizations over both syntactic structures and lexical information. An evaluation on two classification tasks, predicate recognition and question classification, shows that state-of-the-art results are obtained.
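As a rough illustration of the idea summarized above, the following minimal sketch fits LDA to sentences represented as bags of mixed lexical and syntactic tokens. It is not the method of the paper: the feature scheme in `surface_syntactic_tokens` (lowercased words plus part-of-speech tag bigrams), the toy sentences, and the hyperparameter values are all assumptions made for this example, and a plain collapsed Gibbs sampler stands in for whatever inference procedure the authors used.

```python
import random


def surface_syntactic_tokens(tagged_sentence):
    """Turn a POS-tagged sentence into a bag of lexical and syntactic tokens.

    Hypothetical feature scheme (an assumption, not taken from the paper):
    every word contributes its lowercased form, and every pair of adjacent
    POS tags contributes one tag-bigram token.
    """
    words = [word.lower() for word, _ in tagged_sentence]
    tags = [tag for _, tag in tagged_sentence]
    tag_bigrams = [f"{a}_{b}" for a, b in zip(tags, tags[1:])]
    return words + tag_bigrams


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA; returns per-document topic mixtures."""
    rng = random.Random(seed)
    vocab = sorted({tok for doc in docs for tok in doc})
    word_id = {tok: i for i, tok in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for tok in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[tok]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, tok in enumerate(doc):
                w = word_id[tok]
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic from the collapsed conditional.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    # Smoothed estimate of each document's topic mixture.
    theta = []
    for d, doc in enumerate(docs):
        denom = len(doc) + num_topics * alpha
        theta.append([(doc_topic[d][k] + alpha) / denom for k in range(num_topics)])
    return theta


# Toy, hand-tagged questions (illustrative data only).
tagged = [
    [("What", "WP"), ("is", "VBZ"), ("LDA", "NNP")],
    [("Who", "WP"), ("wrote", "VBD"), ("it", "PRP")],
]
docs = [surface_syntactic_tokens(sentence) for sentence in tagged]
theta = lda_gibbs(docs, num_topics=2, iterations=50)
```

Each row of `theta` is a topic mixture for one sentence. In a setup like the one the abstract describes, such mixtures could serve as features for a downstream classifier, for instance for question classification.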
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Basili, R., Giannone, C., Croce, D., Domeniconi, C. (2011). Latent Topic Models of Surface Syntactic Information. In: Pirrone, R., Sorbello, F. (eds) AI*IA 2011: Artificial Intelligence Around Man and Beyond. AI*IA 2011. Lecture Notes in Computer Science, vol 6934. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23954-0_22
DOI: https://doi.org/10.1007/978-3-642-23954-0_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23953-3
Online ISBN: 978-3-642-23954-0
eBook Packages: Computer Science (R0)