Neural Processing Letters, Volume 22, Issue 2, pp 149–161

Merging Strategy for Cross-Lingual Information Retrieval Systems based on Learning Vector Quantization

  • M. T. Martín-Valdivia (Email author)
  • F. Martínez-Santiago
  • L. A. Ureña-López


We present a new neural-network-based approach to the merging problem in Cross-Lingual Information Retrieval (CLIR). Beyond the language barrier itself, a CLIR system must also merge ranked lists of documents retrieved in different languages from several text collections. We propose a merging strategy based on competitive learning that combines the individual lists returned for each collection into a single ranking of documents. The main contribution of the paper is to show the effectiveness of the Learning Vector Quantization (LVQ) algorithm in solving the merging problem. To investigate the effect of the number of codebook vectors, we carried out several experiments with different values for this parameter. The results demonstrate that the LVQ algorithm is a good alternative merging strategy.
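As an illustration only, the following is a minimal sketch of the generic LVQ1 update rule with a toy merging step built on top of it. It is not the authors' implementation: the abstract does not specify the input features or how classifier output becomes a ranking. The two features (a normalised score and a rank fraction), the labels "rel" and "non", and the function names `train_lvq1`, `classify` and `merge` are all assumptions made for this example.

```python
import random

def nearest(codebook, x):
    """Index of the codebook vector closest to x (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i][0], x)),
    )

def train_lvq1(samples, labels, codebook, alpha=0.1, epochs=20, seed=0):
    """Generic LVQ1: pull the winning vector toward a sample of the same
    class, push it away from a sample of a different class."""
    rng = random.Random(seed)
    order = list(range(len(samples)))
    for epoch in range(epochs):
        rate = alpha * (1 - epoch / epochs)  # linearly decaying learning rate
        rng.shuffle(order)
        for i in order:
            x, y = samples[i], labels[i]
            k = nearest(codebook, x)
            vec, cls = codebook[k]
            sign = 1.0 if cls == y else -1.0
            codebook[k] = ([c + sign * rate * (v - c) for c, v in zip(vec, x)], cls)
    return codebook

def classify(codebook, x):
    """Label of the nearest codebook vector."""
    return codebook[nearest(codebook, x)][1]

def merge(lists, codebook):
    """Hypothetical merging step (an assumption, not taken from the paper):
    pool all lists, place documents classified "rel" first, and break ties
    by descending score. Each document is (doc_id, score, rank_fraction)."""
    pooled = [doc for lst in lists for doc in lst]
    return sorted(pooled, key=lambda d: (classify(codebook, d[1:]) != "rel", -d[1]))

# Toy training data: (normalised score, rank fraction) with relevance labels.
samples = [(0.9, 0.1), (0.8, 0.2), (0.85, 0.15), (0.2, 0.8), (0.1, 0.9), (0.3, 0.7)]
labels = ["rel", "rel", "rel", "non", "non", "non"]

# Two codebook vectors, one per class; varying this number is the
# parameter the abstract says the experiments explore.
codebook = train_lvq1(samples, labels, [([0.6, 0.4], "rel"), ([0.4, 0.6], "non")])

english = [("en1", 0.9, 0.0), ("en2", 0.2, 1.0)]
spanish = [("es1", 0.8, 0.0), ("es2", 0.3, 1.0)]
print([doc_id for doc_id, _, _ in merge([english, spanish], codebook)])
```

Using more codebook vectors per class lets the classifier approximate a more irregular decision boundary, at the cost of needing more training data per vector.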


cross-lingual information retrieval, Kohonen neural network, learning vector quantization, merging strategy, retrieve status value





Copyright information

© Springer 2005

Authors and Affiliations

  • M. T. Martín-Valdivia (1), Email author
  • F. Martínez-Santiago (1)
  • L. A. Ureña-López (1)
  1. Departamento de Informática, University of Jaén, Jaén, Spain
