Policy-Aware Content Reuse on the Web

  • Oshani Seneviratne
  • Lalana Kagal
  • Tim Berners-Lee
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5823)


The Web allows users to share their work very effectively, leading to the rapid reuse and remixing of content, including text, images, and videos. Scientific research data, social networks, blogs, photo-sharing sites, and other such applications, known collectively as the Social Web, hold large amounts of increasingly complex information. Information from several Web pages can easily be aggregated, mashed up, and presented in other Web pages. Content generation of this nature inevitably leads to many copyright and license violations, motivating research into effective methods to detect and prevent such violations.

This motivation is supported by an experiment on Creative Commons (CC) attribution license violations, run on samples of Web pages that had at least one embedded Flickr image; it found that the attribution license violation rate of Flickr images on the Web is around 70-90%. Our primary objective is to enable users to do the right thing and comply with the CC licenses associated with Web media, rather than to prevent them from doing the wrong thing or to detect violations of these licenses. To that end, we have implemented two applications: (1) the Attribution License Violations Validator, which users can run to validate their derived work against the attribution licenses of reused media, and (2) the Semantic Clipboard, which provides license awareness of Web media and enables users to copy media along with the appropriate license metadata.
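As a rough illustration of the kind of check an attribution validator might perform, the following Python sketch flags embedded images whose expected attribution link is missing from a page. It is not the authors' implementation. The function name `find_unattributed_images`, the `required_attribution` mapping (image URL to the URL that must be linked as attribution), and the example.org URLs are all hypothetical, and the sketch assumes that attribution counts as satisfied when a hyperlink to the expected URL appears anywhere in the page.

```python
from html.parser import HTMLParser


class _ImageAndLinkCollector(HTMLParser):
    """Collects <img src> and <a href> values from an HTML document."""

    def __init__(self):
        super().__init__()
        self.image_sources = []    # image URLs in document order
        self.link_targets = set()  # every hyperlink target found in the page

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "img" and attributes.get("src"):
            self.image_sources.append(attributes["src"])
        elif tag == "a" and attributes.get("href"):
            self.link_targets.add(attributes["href"])


def find_unattributed_images(html, required_attribution):
    """Return embedded images whose required attribution link is absent.

    required_attribution is a hypothetical mapping from an image URL to the
    URL that must be linked somewhere in the page to count as attribution.
    Images that are not in the mapping are ignored.
    """
    collector = _ImageAndLinkCollector()
    collector.feed(html)
    return [
        src
        for src in collector.image_sources
        if src in required_attribution
        and required_attribution[src] not in collector.link_targets
    ]
```

In a real validator, the mapping would have to be derived from license metadata published with the media rather than supplied by hand; the sketch leaves that step out.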



Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Oshani Seneviratne (1)
  • Lalana Kagal (1)
  • Tim Berners-Lee (1)

  1. MIT CSAIL, Cambridge, USA
