Evaluating Italian Parsing Across Syntactic Formalisms and Annotation Schemes

  • Anita Alicante
  • Cristina Bosco
  • Anna Corazza
  • Alberto Lavelli
Chapter
Part of the Studies in Computational Intelligence book series (SCI, volume 589)

Abstract

This paper reports results on how syntactic representations and parsing methodologies affect the performance of systems for parsing Italian. Italian has a rich morphology, especially in its verbal suffixes, which can provide a parser with useful information for making the correct choices. With respect to syntactic representation, the experiments are based on a treebank for Italian that has been released in both a dependency and a constituency formalism, each annotated at different degrees of specificity. The two paradigms are compared, and the different degrees of specificity with which they mark some syntactic phenomena are pointed out. Statistical parsers have been evaluated on this treebank. The results show that both the representation format and the parsing approach strongly affect performance, which in some cases is very close to, and in others drastically different from, the state of the art for English.

Keywords

Parsing, Word order, Morphologically rich languages

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Anita Alicante (1)
  • Cristina Bosco (2)
  • Anna Corazza (1)
  • Alberto Lavelli (3)

  1. Dipartimento di Ingegneria Elettrica e Tecnologie dell’Informazione, Università di Napoli Federico II, Naples, Italy
  2. Dipartimento di Informatica, Università di Torino, Turin, Italy
  3. HLT Research Unit, Fondazione Bruno Kessler, Povo, Italy