On the Use of N-Gram Transducers for Dialogue Annotation
The implementation of dialogue systems is one of the most interesting applications of language technologies. Statistical models can be used in this implementation, allowing for a more flexible approach than rules defined by a human expert. However, statistical models require large amounts of dialogues annotated with dialogue-function labels (usually Dialogue Acts), and the annotation process is hard and time-consuming. Consequently, using other statistical models to obtain annotations faster is of great interest for the development of dialogue systems. In this work we compare two statistical models for dialogue annotation: a more classical Hidden Markov Model (HMM) based model and the new N-gram Transducers (NGT) model. The comparison is performed on two corpora of different nature, the well-known SwitchBoard corpus and the DIHANA corpus. The results show that the NGT model produces a much more accurate annotation than the HMM-based model (as much as 11% less error on the SwitchBoard corpus).
Keywords: Statistical models, Dialogue annotation