Analysis, preparation, and optimization of statistical sign language machine translation
- Cite this article as:
- Stein, D., Schmidt, C. & Ney, H. Machine Translation (2012) 26: 325. doi:10.1007/s10590-012-9125-1
Sign languages represent an interesting niche for statistical machine translation, one that is typically hampered by the scarcity of suitable data; most papers in this area apply only a few well-known techniques that are not adapted to small corpora. In this article, we analyze existing data collections with an emphasis on their quality and usability for statistical machine translation. We also report findings on the proper preprocessing of a sign language corpus: introducing sentence end markers, splitting compound words, and handling parallel communication channels. We then focus on optimization procedures tailored to scarce resources, such as scaling factor optimization, alignment optimization, and system combination. All methods are evaluated on two of the largest sign language corpora available.
Keywords: Sign languages, Scarce resources, Syntactic methods, Parallel input