Multimedia Systems, Volume 22, Issue 1, pp 5–16

Chat with illustration

  • Yu Jiang
  • Jing Liu (corresponding author)
  • Hanqing Lu
Special Issue Paper


Instant messaging is an important part of social media and has sprung up in recent decades. Traditional instant messaging services convey information mainly through text, and visual messages are largely ignored. Such services therefore fall short of well-rounded communication. In this paper, we propose a novel visually assisted instant messaging scheme named Chat with illustration (CWI), which automatically presents users with visual messages associated with their textual messages. When users start to chat, the system first identifies meaningful keywords in the dialogue content and analyzes their grammatical and logical relations. CWI then performs keyword-based image search on a hierarchically clustered image database that is built offline. Finally, according to the grammatical and logical relations, CWI assembles these images and presents an optimal visual message. By combining textual and visual messages, users can enjoy a more interesting and vivid communication experience. In particular, CWI can help speakers of different native languages cross the language barrier to some degree. In addition, a visual dialogue summarization is proposed, which helps users recall past dialogue. In-depth user studies demonstrate the effectiveness of our visually assisted instant messaging scheme.
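The three stages named in the abstract (keyword identification, keyword-based image search, and image assembly) can be illustrated with a toy sketch. This is not the authors' implementation: the stop-word list, the `IMAGE_DB` contents, the file names, and the functions `extract_keywords`, `search_image`, and `illustrate` are all hypothetical, and simple sentence order stands in for the grammatical and logical relations that CWI analyzes.

```python
import re

# Hypothetical stop-word list; the abstract does not say how keywords are chosen.
STOP_WORDS = {"a", "an", "the", "i", "we", "you", "to", "at", "in", "on",
              "is", "are", "was", "and", "my", "of"}

# Toy stand-in for the offline, hierarchically clustered image database:
# keyword -> cluster name -> image file names (all made up for illustration).
IMAGE_DB = {
    "dog": {"puppy": ["dog_01.jpg", "dog_02.jpg"], "adult": ["dog_03.jpg"]},
    "park": {"summer": ["park_01.jpg"]},
}


def extract_keywords(message):
    """Return non-stop-word tokens in order of first appearance."""
    keywords = []
    for token in re.findall(r"[a-z]+", message.lower()):
        if token not in STOP_WORDS and token not in keywords:
            keywords.append(token)
    return keywords


def search_image(keyword):
    """Pick the first image of the largest cluster, or None if no match."""
    clusters = IMAGE_DB.get(keyword)
    if not clusters:
        return None
    largest_cluster = max(clusters.values(), key=len)
    return largest_cluster[0]


def illustrate(message):
    """Return (keyword, image) pairs in sentence order for matched keywords."""
    pairs = []
    for keyword in extract_keywords(message):
        image = search_image(keyword)
        if image is not None:
            pairs.append((keyword, image))
    return pairs


if __name__ == "__main__":
    print(illustrate("I saw a dog at the park"))
```

Choosing the largest cluster as the representative is only one plausible heuristic for a clustered image index; the abstract does not state how CWI selects images within a cluster.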


Keywords: Instant messaging service; Text-to-picture; Layout



This work was supported by the 973 Program (2010CB327905) and the National Natural Science Foundation of China (61272329 and 61273034).



Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  1. National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China
