Abstract
Identifying reliable and interesting items on the Internet is becoming increasingly difficult and time-consuming. This position paper describes our intended work on multimedia information retrieval through browsing techniques in web navigation. The approach relies on usage-based indexing of resources: we ignore the nature, content, and structure of the resources. We describe a new approach that takes advantage of the similarity between statistical language modeling and document retrieval systems. A syntax of usage is computed, which defines a Statistical Grammar of Usage (SGU). An SGU enables resource classification in support of a personalized navigation assistant tool. It relies both on collaborative filtering, to compute virtual communities of users, and on classical statistical language models. The resulting SGU is community-dependent.
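To make the analogy with statistical language modeling concrete, here is a minimal sketch that treats each navigation session as a "sentence" whose "words" are resource identifiers, and estimates an unsmoothed bigram model from transition counts. The abstract does not specify the model order, the estimator, or any data format, so the function names, the session layout, and the choice of a maximum-likelihood bigram are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter, defaultdict


def train_bigram_usage_model(sessions):
    """Count resource-to-resource transitions over navigation sessions.

    `sessions` is a hypothetical input format: a list of lists of
    resource identifiers, in the order a user visited them.
    """
    transitions = defaultdict(Counter)
    for session in sessions:
        # Pair each visited resource with the one visited right after it.
        for current, following in zip(session, session[1:]):
            transitions[current][following] += 1
    return transitions


def next_resource_probabilities(transitions, current):
    """Maximum-likelihood estimate of P(next resource | current resource)."""
    counts = transitions.get(current)
    if not counts:
        return {}  # resource never seen as a predecessor
    total = sum(counts.values())
    return {resource: n / total for resource, n in counts.items()}


# Toy sessions, assumed here to come from a single community of users.
sessions = [
    ["r1", "r2", "r3"],
    ["r1", "r2", "r4"],
    ["r2", "r3"],
]
model = train_bigram_usage_model(sessions)
probs = next_resource_probabilities(model, "r2")
```

Under these assumptions, a community-dependent variant would train one such model per community produced by collaborative filtering, and a practical system would add smoothing for transitions never observed.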
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Boyer, A., Brun, A. (2007). Natural Language Processing for Usage Based Indexing of Web Resources. In: Amati, G., Carpineto, C., Romano, G. (eds) Advances in Information Retrieval. ECIR 2007. Lecture Notes in Computer Science, vol 4425. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71496-5_46
DOI: https://doi.org/10.1007/978-3-540-71496-5_46
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71494-1
Online ISBN: 978-3-540-71496-5
eBook Packages: Computer Science, Computer Science (R0)