Analysis and Correction of Web Documents’ Non-Compliance with Web Standards

Journal of the Korean Physical Society

Abstract

Motivated by the principle of equal accessibility of the World Wide Web (Web for short), we analyzed the non-compliance of collected web documents with web standards using a statistical physics approach. The web documents were examined with a validator that classified the non-compliance into errors and warnings of different types. We found that the frequency distributions of errors and warnings in a web document each followed a power law and that the numbers of errors and warnings were strongly correlated. In addition, some types of errors and warnings occurred much more frequently than others, a pattern that could be modeled by a geometric distribution. Using these properties, we proposed a correction scheme that focuses on the most frequently occurring errors and warnings. We tested the proposed method empirically on the collected web documents and showed that it corrected about 47% of errors and 85% of warnings. We also used network theory to analyze correlations within and between different types of errors and warnings in the correction results and found that some types of errors and/or warnings affected each other during correction. Finally, the correction results of the proposed method are compared with those of Tidy, and the differing characteristics of the two correction methods are discussed.
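
The idea of concentrating on the most frequent error and warning types can be illustrated with a minimal Python sketch. It is not the authors' correction scheme: the function names, the example type labels, and the counts are all hypothetical, and the geometric form p(1-p)^(r-1) for the share of the r-th most frequent type is an assumed parameterization of the geometric distribution mentioned in the abstract.

```python
from collections import Counter


def top_types_covering(type_counts, target_share):
    """Return the most frequent types whose combined count first reaches
    target_share of all occurrences (hypothetical helper)."""
    total = sum(type_counts.values())
    if total == 0:
        return []
    chosen, covered = [], 0
    # most_common() yields (type, count) pairs sorted by descending count.
    for name, count in Counter(type_counts).most_common():
        if covered / total >= target_share:
            break
        chosen.append(name)
        covered += count
    return chosen


def geometric_top_k_share(p, k):
    """Share of occurrences held by the k most frequent types, assuming the
    type of rank r has probability p * (1 - p) ** (r - 1)."""
    return 1.0 - (1.0 - p) ** k


# Hypothetical counts of error types across a set of documents.
counts = {"missing-alt": 50, "unclosed-tag": 25, "bad-attribute": 15, "other": 10}
print(top_types_covering(counts, 0.7))
print(geometric_top_k_share(0.5, 3))
```

Under this assumed form, the share covered by the top k types grows as 1 - (1 - p)^k, so a handful of types can account for most occurrences when p is not small. That is the kind of concentration a frequency-focused correction scheme relies on.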

Acknowledgments

This work was supported by the Korea Research Foundation Grant funded by the Korean Government (MOEHRD) (NRF-2018R1D1A3B07042338).

Author information

Corresponding author

Correspondence to Chang-Yong Lee.

Cite this article

Chae, SY., Lee, CY. Analysis and Correction of Web Documents’ Non-Compliance with Web Standards. J. Korean Phys. Soc. 74, 731–743 (2019). https://doi.org/10.3938/jkps.74.731

  • DOI: https://doi.org/10.3938/jkps.74.731