Abstract
Understanding the fabric of relationships taking shape on the World Wide Web requires tools that can extract the underlying knowledge. Some of the most interesting relationships are those revealed by co-link analysis, the Web analogue of co-citation analysis. We propose such an analysis based on the co-links generated within a closed Web environment, using multivariate statistics (Principal Component Analysis and Multidimensional Scaling) and a connectionist technique (Kohonen's Self-Organizing Maps). The analysis was applied to a generic thematic environment, and the interpretation of the results brought out its underlying relationships and structures.
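As a rough illustration of what a co-link count is, the following minimal sketch tallies, for each pair of sites in a small closed environment, how many other sites link to both. The site names, the `outlinks` mapping, and the `colink_counts` helper are hypothetical and chosen only for illustration; this is not the authors' implementation, and it assumes that self-links are ignored.

```python
from collections import Counter
from itertools import combinations


def colink_counts(outlinks):
    """Count, for each pair of sites, how many source sites link to both."""
    counts = Counter()
    for source, targets in outlinks.items():
        # Assumption: a site's link to itself does not count toward co-links.
        distinct = sorted(set(targets) - {source})
        # Every pair of targets shared by this source is co-linked once.
        for a, b in combinations(distinct, 2):
            counts[(a, b)] += 1
    return counts


# Hypothetical closed environment of five sites (illustrative data only).
outlinks = {
    "site_a": ["site_c", "site_d"],
    "site_b": ["site_c", "site_d", "site_e"],
    "site_c": ["site_d"],
    "site_d": [],
    "site_e": ["site_c", "site_d"],
}

colinks = colink_counts(outlinks)
for pair, n in sorted(colinks.items()):
    print(pair, n)
```

Arranged as a symmetric site-by-site matrix, counts of this kind are the sort of input that Principal Component Analysis, Multidimensional Scaling, or a Self-Organizing Map could then operate on.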
Cite this article
Faba-Pérez, C., Guerrero-Bote, V.P. & De Moya-Anegón, F. Data mining in a closed Web environment. Scientometrics 58, 623–640 (2003). https://doi.org/10.1023/B:SCIE.0000006884.08036.73
DOI: https://doi.org/10.1023/B:SCIE.0000006884.08036.73