Rule-based verification of Web sites

  • M. Alpuente
  • D. Ballis
  • M. Falaschi
Special Section on Leveraging Applications of Formal Methods


In this paper, we develop a framework for the automated verification of Web sites, which can be used to specify integrity conditions for a given Web site and then automatically check whether these conditions are fulfilled. First, we provide a rewriting-based, formal specification language which allows us to define syntactic as well as semantic properties of the Web site. Then, we formalize a verification technique which detects both incorrect or forbidden patterns and missing information, that is, incomplete or missing Web pages inside the Web site. The verification process also gathers information that can be used to repair the Web site. Our methodology is based on a novel rewriting-based technique, called partial rewriting, in which the traditional pattern matching mechanism is replaced by tree simulation, a suitable technique for recognizing patterns inside semistructured documents. The framework has been implemented in the prototype GVerdi, which is publicly available.
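
As a rough illustration of how tree simulation can stand in for exact pattern matching, the following Python sketch checks whether a small pattern tree is simulated by some subtree of a document tree. It is a simplified reading under stated assumptions, not the definition used in the paper or the GVerdi implementation: labels are compared for plain equality, child order is ignored, and two pattern children may map to the same document child. The names `Node`, `simulates`, and `occurs_in` are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A labelled, unordered tree node (hypothetical model of a Web page element)."""
    label: str
    children: list = field(default_factory=list)


def simulates(pattern: Node, doc: Node) -> bool:
    """True if `doc` simulates `pattern` at the root.

    Assumption: labels must be equal, and every child of the pattern must be
    simulated by at least one child of the document node. Extra document
    children are allowed, which is what makes the match "partial".
    """
    if pattern.label != doc.label:
        return False
    return all(
        any(simulates(p_child, d_child) for d_child in doc.children)
        for p_child in pattern.children
    )


def occurs_in(pattern: Node, doc: Node) -> bool:
    """True if the pattern is simulated by `doc` or by any of its subtrees."""
    return simulates(pattern, doc) or any(
        occurs_in(pattern, child) for child in doc.children
    )


# A tiny, made-up page: a member entry with three fields.
page = Node("member", [
    Node("name", [Node("mario")]),
    Node("surname", [Node("rossi")]),
    Node("status", [Node("professor")]),
])
site = Node("members", [page])

# The pattern mentions only the field it cares about.
pattern = Node("member", [Node("status", [Node("professor")])])

found = occurs_in(pattern, site)
missing = occurs_in(Node("member", [Node("email")]), site)
```

Here `found` is true because the pattern is recognized inside the page even though the page carries extra fields, while `missing` is false because no member entry has an `email` child. A completeness condition could use a failed check of this kind as evidence of missing information.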


Regular Expression, Correctness Rule, Semistructured Data, Irreducible Form, Completeness Rule
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




Copyright information

© Springer-Verlag 2006

Authors and Affiliations

  1. DSIC, Universidad Politécnica de Valencia, Valencia, Spain
  2. Dip. Matematica e Informatica, Udine, Italy
  3. Dip. di Scienze Matematiche e Informatiche, Siena, Italy
