Abstract
Motivated by the case for equal accessibility of the World Wide Web (Web for short), we analyzed the non-compliance of collected web documents with web standards through a statistical physics approach. The web documents were examined with a validator that classified instances of non-compliance into errors and warnings of different types. We found that the frequency distributions of errors and warnings in a web document followed a power-law distribution and that the numbers of errors and warnings were strongly correlated. In addition, some types of errors and warnings occurred much more frequently than others, a pattern that could be modeled by a geometric distribution. Using these properties, we proposed a correction scheme that focuses on the most frequently occurring errors and warnings. We tested the proposed method empirically on the collected web documents and showed that it corrected about 47% of errors and about 85% of warnings. We also used network theory to analyze correlations within and between different errors and warnings in the correction results and found that some types of errors and/or warnings affected each other during correction. In this paper, the correction results of the proposed method are compared with those of Tidy, and the differences between the two correction methods are discussed.
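The idea of concentrating on the most frequent error and warning types can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `top_types_for_coverage` and `geometric_coverage`, the message labels, and the parameter values are all hypothetical. The sketch assumes the validator output has already been reduced to a flat list of message-type labels, and it uses the fact that if the rank of a message type follows a geometric distribution with parameter p, the k top-ranked types account for a share 1 - (1 - p)^k of all occurrences.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def top_types_for_coverage(messages: Iterable[str], target: float) -> Tuple[List[str], float]:
    """Pick the most frequent message types until `target` coverage is reached.

    `messages` is a hypothetical flat list of error/warning type labels,
    one entry per occurrence reported by a validator.
    """
    counts = Counter(messages)
    total = sum(counts.values())
    if total == 0:
        return [], 0.0

    selected: List[str] = []
    covered = 0
    # most_common() yields types in order of decreasing frequency.
    for msg_type, n in counts.most_common():
        if covered / total >= target:
            break
        selected.append(msg_type)
        covered += n
    return selected, covered / total


def geometric_coverage(p: float, k: int) -> float:
    """Share of occurrences in the k top-ranked types if rank ~ Geometric(p)."""
    return 1.0 - (1.0 - p) ** k


if __name__ == "__main__":
    # Illustrative labels only; they are not taken from the paper's data.
    sample = ["missing-alt"] * 6 + ["unclosed-tag"] * 3 + ["bad-attribute"] * 1
    types, share = top_types_for_coverage(sample, target=0.8)
    print(types, share)
    print(geometric_coverage(p=0.5, k=3))
```

Under a geometric model the covered share approaches 1 quickly as k grows, which is why a correction scheme that targets only a few frequent types can still address a large fraction of all occurrences.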
Acknowledgments
This work was supported by the Korea Research Foundation Grant funded by the Korean Government (MOEHRD) (NRF-2018R1D1A3B07042338).
Cite this article
Chae, SY., Lee, CY. Analysis and Correction of Web Documents’ Non-Compliance with Web Standards. J. Korean Phys. Soc. 74, 731–743 (2019). https://doi.org/10.3938/jkps.74.731
DOI: https://doi.org/10.3938/jkps.74.731