Adaptive Term Weighting through Stochastic Optimization

  • Michael Granitzer
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6008)


Term weighting strongly influences the performance of text mining and information retrieval approaches. Term weights are usually determined through statistical estimates based on static weighting schemes. Such static approaches lack the capability to generalize across domains and data sets. In this paper, we introduce an on-line learning method for adapting term weights in a supervised manner. Via stochastic optimization, we determine a linear transformation of the term space that approximates expected similarity values among documents. We evaluate our approach on 18 standard text data sets and show that using adaptive term weighting as a preprocessing step improves the performance of a k-NN classifier by between 1% and 12%. Further, we provide empirical evidence that our approach is efficient enough to cope with larger problems.
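The idea of adapting term weights by stochastic optimization can be illustrated with a small sketch. The abstract does not state the loss function, the form of the linear transformation, or the update rule, so everything below is an assumption made for illustration: the transformation is taken to be diagonal (one weight per term), the expected similarity is 1 for documents of the same class and 0 otherwise, the loss is the squared error, and weights are clipped at zero. The names `weighted_similarity` and `sgd_term_weights` are hypothetical and do not come from the paper.

```python
import random


def weighted_similarity(weights, doc_a, doc_b):
    """Dot product of two term vectors with one scaling factor per term."""
    return sum(w * a * b for w, a, b in zip(weights, doc_a, doc_b))


def sgd_term_weights(docs, labels, epochs=200, learning_rate=0.1, seed=0):
    """Fit per-term weights so pairwise similarity matches a 0/1 target.

    Assumption (not from the paper): target is 1.0 for same-class pairs,
    0.0 otherwise, and the loss is 0.5 * (similarity - target) ** 2.
    """
    rng = random.Random(seed)
    n_terms = len(docs[0])
    weights = [1.0] * n_terms  # start from uniform term weights
    pairs = [(i, j) for i in range(len(docs)) for j in range(i + 1, len(docs))]
    for _ in range(epochs):
        rng.shuffle(pairs)  # visit document pairs in random order
        for i, j in pairs:
            target = 1.0 if labels[i] == labels[j] else 0.0
            error = weighted_similarity(weights, docs[i], docs[j]) - target
            for t in range(n_terms):
                # Gradient of the squared error with respect to weight t.
                grad = error * docs[i][t] * docs[j][t]
                # Gradient step, clipped so weights stay non-negative.
                weights[t] = max(0.0, weights[t] - learning_rate * grad)
    return weights


# Toy corpus: term 0 occurs in every document, term 1 only in class "a",
# term 2 only in class "b".
docs = [[1, 1, 0], [1, 1, 0], [1, 0, 1], [1, 0, 1]]
labels = ["a", "a", "b", "b"]
weights = sgd_term_weights(docs, labels)
print([round(w, 3) for w in weights])
```

On this toy corpus the weight of the term shared by all documents shrinks towards zero, while the two class-specific terms keep a weight close to one. The reweighted vectors could then be passed to a k-NN classifier, in the spirit of the preprocessing step described above.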


Information Retrieval, Linear Transformation, Gradient Descent, Weighting Scheme, Text Mining
These keywords were added by machine and not by the authors.





Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Michael Granitzer (1, 2)
  1. Graz University of Technology, Graz, Austria
  2. Know-Center Graz
