Abstract
Identifying reliable and interesting items on the Internet is becoming increasingly difficult and time-consuming. This position paper describes our intended work on multimedia information retrieval by browsing techniques within web navigation. The approach relies on usage-based indexing of resources: we ignore the nature, content, and structure of the resources. We describe a new approach that takes advantage of the similarity between statistical language modeling and document retrieval systems. A syntax of usage is computed that defines a Statistical Grammar of Usage (SGU). An SGU enables resource classification in support of a personalized navigation assistant tool. It relies both on collaborative filtering, to compute virtual communities of users, and on classical statistical language models. The resulting SGU is community-dependent.
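The analogy between language modeling and usage can be illustrated with a minimal sketch. It is not the SGU itself, whose construction the abstract does not detail. It assumes that each navigation session plays the role of a sentence and each resource identifier the role of a word, and it estimates a classical bigram model with add-one smoothing over those sessions. The function names, the sample sessions, and the choice of smoothing are all hypothetical and serve illustration only.

```python
from collections import Counter, defaultdict


def train_bigram_usage_model(sessions):
    """Count resource-to-resource transitions over navigation sessions.

    Each session is a list of resource identifiers in visiting order
    (assumption: sessions are treated like sentences of a language).
    """
    transitions = defaultdict(Counter)
    vocabulary = set()
    for session in sessions:
        vocabulary.update(session)
        for prev, nxt in zip(session, session[1:]):
            transitions[prev][nxt] += 1
    return transitions, vocabulary


def next_resource_probability(transitions, vocabulary, prev, nxt):
    """Estimate P(nxt | prev) with add-one (Laplace) smoothing."""
    counts = transitions.get(prev, Counter())
    return (counts[nxt] + 1) / (sum(counts.values()) + len(vocabulary))


def suggest_next(transitions, vocabulary, prev, k=3):
    """Return the k resources most likely to follow prev."""
    candidates = sorted(vocabulary - {prev})  # alphabetical order breaks ties
    candidates.sort(
        key=lambda r: next_resource_probability(transitions, vocabulary, prev, r),
        reverse=True,
    )
    return candidates[:k]


# Hypothetical sessions, e.g. those of one virtual community of users.
sessions = [
    ["home", "news", "sports"],
    ["home", "news", "weather"],
    ["home", "news", "sports"],
    ["home", "videos"],
]
transitions, vocabulary = train_bigram_usage_model(sessions)
suggestions = suggest_next(transitions, vocabulary, "news", k=2)
```

Under the same assumptions, a community-dependent variant would train one such model per virtual community, using only the sessions of that community's users.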
Keywords
- Mutual Information
- Natural Language Processing
- Personalized Indexing
- Multimedia Information Retrieval
- Interesting Item
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Boyer, A., Brun, A. (2007). Natural Language Processing for Usage Based Indexing of Web Resources. In: Amati, G., Carpineto, C., Romano, G. (eds) Advances in Information Retrieval. ECIR 2007. Lecture Notes in Computer Science, vol 4425. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71496-5_46
DOI: https://doi.org/10.1007/978-3-540-71496-5_46
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71494-1
Online ISBN: 978-3-540-71496-5
eBook Packages: Computer Science, Computer Science (R0)
