World Wide Web, Volume 13, Issue 3, pp 215–249

A General Framework for Web Content Filtering

Article

Abstract

Web content filtering is a means of making end-users aware of the ‘quality’ of Web resources by evaluating their contents and/or characteristics against users’ preferences. Although they can serve a variety of purposes, Web content filtering tools are mainly deployed as a service for parental control and for regulating access to Web content by users connected to the networks of enterprises, libraries, schools, etc. Current Web filtering tools are based on well-established techniques, such as data mining and firewall blocking, and they typically cater to the filtering requirements of very specific end-user categories. What is lacking, therefore, is a unified filtering framework that supports all possible application domains and enables interoperability among the different filtering approaches and the systems based on them. This paper describes a multi-strategy approach that integrates the available techniques and focuses on the use of metadata for rating and filtering Web information. The approach consists of a filtering meta-model, referred to as MFM (Multi-strategy Filtering Model), which provides a general representation of the Web content filtering domain independently of its possible applications, and of two prototype implementations, partially carried out within the EU projects EUFORBIA and QUATRO and designed for different application domains: user protection and Web quality assurance, respectively.
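To make the idea of combining several filtering techniques concrete, the following is a minimal illustrative sketch, not the MFM model or either prototype. It assumes three hypothetical strategies (a URL block list standing in for firewall-style blocking, a check of metadata ratings against user preferences, and a crude keyword match standing in for content analysis) that are consulted in order until one gives a verdict. All class, function, and field names are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Resource:
    """A Web resource with optional metadata ratings (hypothetical shape)."""
    url: str
    labels: dict = field(default_factory=dict)  # e.g. {"violence": 2}
    text: str = ""


@dataclass
class Preferences:
    """User preferences the resource is evaluated against (hypothetical shape)."""
    max_rating: dict    # highest acceptable rating per category
    banned_terms: set   # lowercase terms that trigger a denial
    blocked_urls: set   # exact URLs that are always denied


# A strategy returns True (allow), False (deny), or None (no opinion).
Strategy = Callable[[Resource, Preferences], Optional[bool]]


def by_url_list(res: Resource, prefs: Preferences) -> Optional[bool]:
    # Deny listed URLs; otherwise defer to the next strategy.
    return False if res.url in prefs.blocked_urls else None


def by_metadata(res: Resource, prefs: Preferences) -> Optional[bool]:
    # Without labels this strategy has no opinion.
    if not res.labels:
        return None
    # Allow only if every constrained category is within its limit;
    # categories missing from the labels are treated as rating 0.
    return all(res.labels.get(cat, 0) <= limit
               for cat, limit in prefs.max_rating.items())


def by_keywords(res: Resource, prefs: Preferences) -> Optional[bool]:
    # Deny if any banned term appears as a whitespace-separated word.
    words = set(res.text.lower().split())
    return False if words & prefs.banned_terms else None


def filter_resource(res: Resource, prefs: Preferences,
                    strategies: list, default: bool = True) -> bool:
    # The first strategy that returns a verdict decides the outcome.
    for strategy in strategies:
        verdict = strategy(res, prefs)
        if verdict is not None:
            return verdict
    return default


prefs = Preferences(max_rating={"violence": 1},
                    banned_terms={"gore"},
                    blocked_urls={"http://blocked.example/"})
strategies = [by_url_list, by_metadata, by_keywords]
labelled = Resource("http://a.example/", labels={"violence": 2})
print(filter_resource(labelled, prefs, strategies))
```

In this sketch the metadata strategy takes precedence over keyword matching whenever labels are present, which mirrors the abstract's emphasis on metadata; a real system could equally combine verdicts by voting or weighting.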

Keywords

metadata-based Web content filtering; Web quality assurance; content-based access control

Copyright information

© Springer Science+Business Media, LLC 2009

Authors and Affiliations

  1. CS Department, CERIAS, Purdue University, West Lafayette, USA
  2. Dipartimento di Informatica e Comunicazione, Università degli Studi dell’Insubria, Varese, Italy
