Language Resources and Evaluation, Volume 49, Issue 1, pp 107–145

Parsing Hebrew CHILDES transcripts

  • Shai Gretz
  • Alon Itai
  • Brian MacWhinney
  • Bracha Nir
  • Shuly Wintner
Original Paper

DOI: 10.1007/s10579-013-9256-x

Cite this article as:
Gretz, S., Itai, A., MacWhinney, B. et al. Lang Resources & Evaluation (2015) 49: 107. doi:10.1007/s10579-013-9256-x

Abstract

We present a syntactic parser of (transcripts of) spoken Hebrew: a dependency parser of the Hebrew CHILDES database. CHILDES is a corpus of child–adult linguistic interactions. Its Hebrew section has recently been morphologically analyzed and disambiguated, paving the way for syntactic annotation. This paper describes a novel annotation scheme for dependency relations reflecting constructions of child and child-directed Hebrew utterances. A subset of the corpus was annotated with dependency relations according to this scheme, and was used to train two parsers (MaltParser and MEGRASP) with which the rest of the data were parsed. The adequacy of the annotation scheme for the CHILDES data is established through numerous evaluation scenarios. The paper also discusses different annotation approaches to several linguistic phenomena, as well as the contribution of morphological features to the accuracy of parsing.

Keywords

  • Parsing
  • Dependency grammar
  • Child language
  • Syntactic annotation

Copyright information

© Springer Science+Business Media Dordrecht 2013

Authors and Affiliations

  • Shai Gretz (1)
  • Alon Itai (1)
  • Brian MacWhinney (2)
  • Bracha Nir (3)
  • Shuly Wintner (4)

  1. Department of Computer Science, Technion, Haifa, Israel
  2. Department of Psychology, Carnegie Mellon University, Pittsburgh, USA
  3. Department of Communication Disorders, University of Haifa, Haifa, Israel
  4. Department of Computer Science, University of Haifa, Haifa, Israel
