When Rules Meet Bigrams

Wehrli, Eric; Nerima, Luka

doi:10.1007/978-3-642-54906-9_21

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8403)

Included in the following conference series:

International Conference on Intelligent Text Processing and Computational Linguistics

Abstract

This paper discusses an ongoing project that aims to improve the quality and efficiency of a rule-based parser by adding a statistical component. The proposed technique relies on bigrams of pairs (word+category), selected from the homographs contained in our lexical database and computed over a large, previously tagged section of the Hansard corpus. The parser uses the bigram table to rank and prune the set of alternatives. To evaluate the gains obtained by the hybrid system, we conducted two manual evaluations: one over a small subset of the Hansard corpus, the other over a corpus of about 50 articles taken from the magazine The Economist. In both cases, we compare the analyses produced by the parser with and without the statistical component, focusing on a single important source of mistakes: the confusion between nominal and verbal readings of ambiguous words such as announce, sets, costs, labour, etc.
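
As a rough illustration of the idea described in the abstract, the following minimal sketch builds a table of (word+category) bigrams from a tagged corpus and uses it to rank and prune the readings of an ambiguous word. It is not the authors' implementation: the toy corpus, the homograph list, the category labels, the function names `build_bigram_table` and `rank_and_prune`, and the `keep_ratio` pruning rule are all assumptions made for the example, since the abstract does not specify how scores are computed or how pruning is decided.

```python
from collections import Counter

# Toy tagged corpus: each sentence is a list of (word, category) pairs.
# Hypothetical data; the paper uses a tagged section of the Hansard corpus.
TAGGED = [
    [("the", "DET"), ("costs", "NOUN"), ("rise", "VERB")],
    [("the", "DET"), ("costs", "NOUN"), ("fell", "VERB")],
    [("it", "PRON"), ("costs", "VERB"), ("money", "NOUN")],
]

# Hypothetical homograph list: words with both nominal and verbal readings.
HOMOGRAPHS = {"costs", "sets", "labour"}


def build_bigram_table(sentences, homographs):
    """Count adjacent (word, category) pairs in which either word is a homograph."""
    table = Counter()
    for sent in sentences:
        for left, right in zip(sent, sent[1:]):
            if left[0] in homographs or right[0] in homographs:
                table[(left, right)] += 1
    return table


def rank_and_prune(table, left, word, categories, keep_ratio=0.5):
    """Rank the candidate categories of `word` given its left neighbour.

    A reading is kept if its bigram count is at least keep_ratio times the
    best count (an assumed pruning rule). If no bigram was observed at all,
    every reading is kept in its original order.
    """
    scored = sorted(
        ((table[(left, (word, cat))], cat) for cat in categories),
        reverse=True,
    )
    best = scored[0][0]
    if best == 0:
        return list(categories)
    return [cat for count, cat in scored if count >= keep_ratio * best]


table = build_bigram_table(TAGGED, HOMOGRAPHS)
after_det = rank_and_prune(table, ("the", "DET"), "costs", ["NOUN", "VERB"])
after_pron = rank_and_prune(table, ("it", "PRON"), "costs", ["NOUN", "VERB"])
```

With this toy data, the nominal reading of "costs" survives after a determiner and the verbal reading survives after a pronoun. A real system would presumably also consult the right-hand neighbour and smooth the counts; those choices are left open here.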

Thanks to Meghdad Farahmand and Yves Scherrer for useful comments and contributions to this paper.

Author information

Authors and Affiliations

Laboratoire d’Analyse et de Technologie du Langage - CUI, University of Geneva, Battelle - 7 route de Drize, CH 1227, Carouge, Switzerland
Eric Wehrli & Luka Nerima

Editor information

Editors and Affiliations

Center for Computing Research, National Polytechnic Institute, Av. Juan Dios Bátiz, Col. Nueva Industrial Vallejo, 07738, Mexico D.F., Mexico
Alexander Gelbukh

About this paper

Cite this paper

Wehrli, E., Nerima, L. (2014). When Rules Meet Bigrams. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2014. Lecture Notes in Computer Science, vol 8403. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-54906-9_21

DOI: https://doi.org/10.1007/978-3-642-54906-9_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-54905-2
Online ISBN: 978-3-642-54906-9
eBook Packages: Computer Science, Computer Science (R0)
