LivingKnowledge: A Platform and Testbed for Fact and Opinion Extraction from Multimodal Data

  • David Dupplaw
  • Michael Matthews
  • Richard Johansson
  • Paul Lewis
Part of the Communications in Computer and Information Science book series (CCIS, volume 255)


In this paper, we describe the work we are undertaking to produce a truly multimedia platform for the analysis of facts and opinions on the web. The system integrates the analysis of multimodal data (images, text and page layout) into a distributable platform that can be built upon for various applications. We give an overview of the natural language processing tools that have been developed for extracting facts and opinions from the textual content of articles, the image analysis techniques used to extract facts and to help support the opinions found in the contextually related written information, and other multimodal tools developed for the analysis of online articles. We describe two applications that have been developed as part of the ongoing work of the LivingKnowledge project: the News Media Analysis application, for the semi-automation of the work of a media analysis company, and the Future Predictor application, which allows exploration of claims that are made through time.


Keywords: Natural Language Processing; Sentiment Analysis; Opinion Extraction; Opinion Analysis; Multimodal Data





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • David Dupplaw (1)
  • Michael Matthews (2)
  • Richard Johansson (3)
  • Paul Lewis (1)

  1. University of Southampton, UK
  2. Barcelona Media, Spain
  3. University of Trento, Italy
