Revisiting Fisher Kernels for Document Similarities

  • Martin Nyffenegger
  • Jean-Cédric Chappelier
  • Éric Gaussier
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4212)


This paper presents a new metric for computing similarities between textual documents, based on the Fisher information kernel as proposed by T. Hofmann. By taking a new point of view on the embedding vector space and proposing a more appropriate way of handling the Fisher information matrix, we derive a new form of the kernel that yields significant improvements on an information retrieval task. We apply our approach to two different models: Naive Bayes and PLSI.
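For readers unfamiliar with Fisher kernels, the following is a minimal sketch of the general idea only, not the kernel derived in this paper. It assumes a plain unigram (multinomial) model of the collection, a square-root parameterisation of the word probabilities, and a Fisher information matrix approximated by the identity; all three are simplifying assumptions made for illustration. Under them, the similarity of two documents reduces to the sum over shared words of P(w|d) P(w|d') / P(w). The function names are hypothetical.

```python
from collections import Counter


def unigram_model(docs):
    """Maximum-likelihood word probabilities P(w) over the whole collection."""
    counts = Counter(w for doc in docs for w in doc)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def fisher_score(doc, theta):
    """Length-normalised gradient of the log-likelihood.

    Assumption: square-root parameterisation, so each component is the
    relative frequency of w in the document divided by sqrt(P(w)).
    Words unknown to the model are skipped.
    """
    tf = Counter(doc)
    n = len(doc)
    return {w: (c / n) / theta[w] ** 0.5 for w, c in tf.items() if w in theta}


def fisher_kernel(doc_a, doc_b, theta):
    """Inner product of the two scores.

    Assumption: the Fisher information matrix is replaced by the identity,
    which is a simplification and not the treatment proposed in the paper.
    """
    ua = fisher_score(doc_a, theta)
    ub = fisher_score(doc_b, theta)
    return sum(v * ub[w] for w, v in ua.items() if w in ub)


if __name__ == "__main__":
    # Toy collection of tokenised documents (illustrative data only).
    docs = [["fisher", "kernel"], ["fisher", "matrix"]]
    theta = unigram_model(docs)
    print(fisher_kernel(docs[0], docs[1], theta))
```

Because each shared word is weighted by 1 / P(w), rare words contribute more to the similarity than frequent ones. Documents with no word in common get a similarity of zero under this simplified model.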


Keywords: Fisher Information Matrix; Latent Class Model; Document Similarity; Document Length; Fisher Kernel


References

  1. Hofmann, T.: Learning the similarity of documents: An information-geometric approach to document retrieval and categorization. In: Advances in Neural Information Processing Systems (NIPS), vol. 12, pp. 914–920 (2000)
  2. Jaakkola, T., Haussler, D.: Exploiting generative models in discriminative classifiers. In: Advances in Neural Information Processing Systems (NIPS), vol. 11, pp. 487–493 (1999)
  3. Lewis, D.: Naive (Bayes) at forty: The independence assumption in information retrieval. In: Nédellec, C., Rouveirol, C. (eds.) ECML 1998. LNCS, vol. 1398, pp. 4–15. Springer, Heidelberg (1998)
  4. Nigam, K., McCallum, A.K., Thrun, S., Mitchell, T.: Text classification from labeled and unlabeled documents using EM. Machine Learning 39, 103–134 (2000)
  5. Hofmann, T.: Probabilistic latent semantic indexing. In: Proc. of the 22nd International Conference on Research and Development in Information Retrieval, pp. 50–57 (1999)
  6. Nyffenegger, M.: Similarités textuelles à base de noyaux de Fisher. Master’s thesis, Ecole Polytechnique Fédérale de Lausanne, Switzerland (2005)
  7. Jin, X., Zhou, Y., Mobasher, B.: Web usage mining based on probabilistic latent semantic analysis. In: Proc. of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2004), pp. 197–205 (2004)
  8. Ahrendt, P., Goutte, C., Larsen, J.: Co-occurrence models in music genre classification. In: IEEE Int. Workshop on Machine Learning for Signal Processing (2005)
  9. Vinokourov, A., Girolami, M.: A probabilistic framework for the hierarchic organisation and classification of document collections. Journal of Intelligent Information Systems 18, 153–172 (2002)
  10. Salton, G., McGill, M.: Introduction to Modern Information Retrieval. McGraw-Hill, New York (1983)
  11. McLachlan, G., Peel, D.: Finite Mixture Models. Wiley, Chichester (2000)
  12. Baeza-Yates, R.A., Ribeiro-Neto, B.A.: Modern Information Retrieval. Addison-Wesley, Reading (1999)

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Martin Nyffenegger (1)
  • Jean-Cédric Chappelier (1)
  • Éric Gaussier (2)

  1. Ecole Polytechnique Fédérale de Lausanne, Switzerland
  2. Xerox Research Center Europe, Meylan, France
