Natural Language Processing for Usage Based Indexing of Web Resources

  • Conference paper
Advances in Information Retrieval (ECIR 2007)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4425)

Abstract

Identifying reliable and interesting items on the Internet is becoming increasingly difficult and time-consuming. This position paper describes our intended work on multimedia information retrieval through browsing techniques within web navigation. The approach relies on usage-based indexing of resources: we ignore the nature, content, and structure of the resources themselves. We describe a new approach that takes advantage of the similarity between statistical language modeling and document retrieval systems. A syntax of usage is computed, which defines a Statistical Grammar of Usage (SGU). An SGU enables the classification of resources, on which a personalized navigation assistant can be built. It relies both on collaborative filtering, to compute virtual communities of users, and on classical statistical language models. The resulting SGU is community-dependent.
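
To make the analogy between language modeling and usage concrete, the sketch below treats each navigation session as a "sentence" whose "words" are resource identifiers, and estimates a bigram model over them. It is only an illustration under stated assumptions: the abstract does not specify the model order, the smoothing method, or any data format, so the bigram choice, the add-one smoothing, the `<s>` start marker, and all function names here are hypothetical and are not the authors' SGU.

```python
from collections import Counter, defaultdict

START = "<s>"  # assumed session-start marker, not taken from the paper


def train_bigram_usage_model(sessions):
    """Count resource-to-resource transitions over navigation sessions."""
    context_counts = Counter()          # how often each resource is a context
    transitions = defaultdict(Counter)  # transitions[prev][nxt] = count
    vocab = set()                       # all resources seen in any session
    for session in sessions:
        path = [START] + list(session)
        vocab.update(session)
        for prev, nxt in zip(path, path[1:]):
            context_counts[prev] += 1
            transitions[prev][nxt] += 1
    return context_counts, transitions, vocab


def next_probability(model, prev, nxt):
    """Add-one smoothed estimate of P(nxt | prev); smoothing is an assumption."""
    context_counts, transitions, vocab = model
    return (transitions[prev][nxt] + 1) / (context_counts[prev] + len(vocab))


def recommend(model, prev, k=3):
    """Return the k resources most likely to follow prev (ties broken by name)."""
    _, _, vocab = model
    ranked = sorted(vocab, key=lambda r: (-next_probability(model, prev, r), r))
    return ranked[:k]


# Toy sessions for one hypothetical community of users.
sessions = [["a", "b", "c"], ["a", "b", "d"], ["a", "c"]]
model = train_bigram_usage_model(sessions)
print(recommend(model, "a", k=2))
```

Training one such model per community of users, with the communities obtained separately (for example by collaborative filtering, which is not shown here), would give a community-dependent model in the spirit of the abstract. Note that nothing in the sketch inspects the resources themselves; only the order in which they were visited is used.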


Editor information

Giambattista Amati, Claudio Carpineto, Giovanni Romano

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Boyer, A., Brun, A. (2007). Natural Language Processing for Usage Based Indexing of Web Resources. In: Amati, G., Carpineto, C., Romano, G. (eds) Advances in Information Retrieval. ECIR 2007. Lecture Notes in Computer Science, vol 4425. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71496-5_46

  • DOI: https://doi.org/10.1007/978-3-540-71496-5_46

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-71494-1

  • Online ISBN: 978-3-540-71496-5

  • eBook Packages: Computer Science, Computer Science (R0)
