Machine Translation, Volume 22, Issue 3, pp 153–173

Toward communicating simple sentences using pictorial representations

  • Rada Mihalcea
  • Chee Wee Leong


This paper addresses and evaluates the hypothesis that pictorial representations can be used to effectively convey simple sentences across language barriers. The paper makes two main contributions. First, it proposes an approach to augmenting dictionaries with illustrative images using volunteer contributions over the Web. The paper describes the PicNet illustrated dictionary, and evaluates the quality and quantity of the contributions collected through several online activities. Second, starting with this illustrated dictionary, the paper describes a system for the automatic construction of pictorial representations for simple sentences. Comparative evaluations show that a considerable amount of understanding can be achieved using visual descriptions of information, with evaluation figures within a comparable range of those obtained with linguistic representations produced by an automatic machine translation system.


Keywords: Text-to-picture synthesis; Illustrated dictionaries; Augmentative and alternative communication




Copyright information

© Springer Science+Business Media B.V. 2009

Authors and Affiliations

  1. Computer Science Department, University of North Texas, Denton, USA