Toward Developing a Very Big Sign Language Parallel Corpus

  • Achraf Othman
  • Zouhour Tmar
  • Mohamed Jemni
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7383)


Researchers in the field of sign language face a serious problem: the absence of a large parallel corpus for sign languages. The ASLG-PC12 project, conducted in our laboratory, proposes a rule-based approach for building a large parallel corpus between written English texts and American Sign Language (ASL) gloss. In this paper, we present a new algorithm that transforms a part-of-speech-tagged English sentence into ASL gloss. The project started at the beginning of 2011 and today offers a corpus containing more than one hundred million sentence pairs between English and ASL gloss. It is freely available online to support the development and design of new algorithms and theories for sign language processing, for instance statistical machine translation and related fields. We present, in particular, the tasks for generating ASL sentences from the Gutenberg Project corpus, which contains only written English texts.
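
As a rough illustration of what a rule-based transformation from a part-of-speech-tagged sentence to a gloss-like string might look like, consider the minimal sketch below. The rules shown (dropping English articles and punctuation, upper-casing the remaining words), the Penn Treebank style tags, and the function names `to_gloss` and `build_pair` are all assumptions made for this example; they are not the rules or interfaces of the ASLG-PC12 project.

```python
# Hypothetical rule set for illustration only; NOT the ASLG-PC12 rules.
ARTICLES = {"a", "an", "the"}  # assumption: articles are omitted in the gloss


def to_gloss(tagged_sentence):
    """Map a list of (word, POS tag) pairs to a list of gloss-like tokens."""
    gloss = []
    for word, tag in tagged_sentence:
        # Assumed rule 1: drop determiners that are English articles.
        if tag == "DT" and word.lower() in ARTICLES:
            continue
        # Assumed rule 2: drop tokens made only of punctuation.
        if not any(ch.isalnum() for ch in word):
            continue
        # Assumed rule 3: write every remaining word in upper case.
        gloss.append(word.upper())
    return gloss


def build_pair(tagged_sentence):
    """Return one (English sentence, gloss string) entry of a parallel corpus."""
    english = " ".join(word for word, _ in tagged_sentence)
    return english, " ".join(to_gloss(tagged_sentence))


if __name__ == "__main__":
    tagged = [("The", "DT"), ("boy", "NN"), ("reads", "VBZ"),
              ("a", "DT"), ("book", "NN"), (".", ".")]
    print(build_pair(tagged))
```

Applying such a function to every tagged sentence of a text collection would yield sentence pairs of the kind the corpus contains; a realistic rule set would be considerably richer than these three assumed rules.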


American Sign Language, Parallel Corpora, Sign Language





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Achraf Othman (1)
  • Zouhour Tmar (1)
  • Mohamed Jemni (1)
  1. Research Laboratory LaTICE, University of Tunis, Tunis, Tunisia
