An approach and an Eclipse-based environment for enhancing the navigation structure of Web sites

  • Giuseppe Scanniello
  • Damiano Distante
  • Michele Risi
Special Section on Web Systems Evolution


This paper presents an approach, based on information retrieval and clustering techniques, for automatically enhancing the navigation structure of a Web site in order to improve its navigability. The approach augments the navigation links provided in each page of the site with a semantic navigation map, i.e., a set of links that lead from a given page to other pages of the site showing similar or related content. It uses Latent Semantic Indexing to compute a dissimilarity measure between the pages of the site, and a graph-theoretic clustering algorithm to group pages showing similar or related content according to that measure. AJAX code is then used to extend each Web page with its associated semantic navigation map. The paper also presents a prototype tool developed to support the approach, and the results of a case study conducted to assess the validity and feasibility of the proposal.
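
The clustering and map-building steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each page has already been projected to a low-dimensional vector (standing in for the output of Latent Semantic Indexing), that dissimilarity is one minus the cosine of the angle between two vectors, and that the graph-theoretic step is the simplest possible one: pages whose dissimilarity is at or below a threshold are linked, and each connected component of the resulting graph is a cluster. The page names, vectors, threshold value, and function names are all hypothetical, and the algorithm actually used in the paper may differ.

```python
from itertools import combinations
from math import sqrt


def cosine_dissimilarity(u, v):
    """Return 1 - cos(u, v); 0 means same direction, larger means less similar."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 1.0


def cluster_pages(vectors, threshold):
    """Group pages by connected components of the 'dissimilarity <= threshold' graph."""
    # Build an undirected graph: one node per page, one edge per similar pair.
    neighbours = {page: set() for page in vectors}
    for p, q in combinations(vectors, 2):
        if cosine_dissimilarity(vectors[p], vectors[q]) <= threshold:
            neighbours[p].add(q)
            neighbours[q].add(p)

    # Each connected component becomes one cluster (depth-first traversal).
    clusters, seen = [], set()
    for start in vectors:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            page = stack.pop()
            if page not in component:
                component.add(page)
                stack.extend(neighbours[page] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters


def semantic_navigation_maps(clusters):
    """Map each page to the other pages in its cluster (its candidate extra links)."""
    return {page: [other for other in cluster if other != page]
            for cluster in clusters for page in cluster}


# Hypothetical 2-dimensional page vectors, standing in for LSI output.
PAGE_VECTORS = {
    "courses.html": (0.9, 0.1),
    "exams.html": (0.8, 0.2),
    "staff.html": (0.1, 0.9),
    "contacts.html": (0.2, 0.8),
    "news.html": (-0.7, 0.7),
}

CLUSTERS = cluster_pages(PAGE_VECTORS, threshold=0.1)  # threshold is an assumed value
NAV_MAPS = semantic_navigation_maps(CLUSTERS)
```

In this toy input, `NAV_MAPS["courses.html"]` lists `exams.html`, while `news.html` ends up alone in its cluster and so receives an empty map. In the approach described above, each such list would then be attached to its page by client-side AJAX code, a step this sketch leaves out.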


Keywords

Web site evolution; Navigation structure; Navigation evolution; Reverse engineering; Clone detection; Clustering; Information retrieval; Latent semantic indexing; Semantic clustering; Semantic navigation map



Copyright information

© Springer-Verlag 2009

Authors and Affiliations

  • Giuseppe Scanniello (1)
  • Damiano Distante (2)
  • Michele Risi (3)
  1. Department of Mathematics and Computer Science, University of Basilicata, Potenza, Italy
  2. Faculty of Economics, Tel.M.A. University, Rome, Italy
  3. Department of Mathematics and Computer Science, University of Salerno, Salerno, Italy
