Interactive Cross-Language Document Selection

Oard, Douglas W.; Gonzalo, Julio; Sanderson, Mark; López-Ostenero, Fernando; Wang, Jianqiang

doi:10.1023/B:INRT.0000009446.22036.e3

Published: January 2004

Information Retrieval, Volume 7, pages 205–228 (2004)

Abstract

The problem of finding documents written in a language that the searcher cannot read is perhaps the most challenging application of cross-language information retrieval technology. In interactive applications, that task involves at least two steps: (1) the machine locates promising documents in a collection that is larger than the searcher could scan, and (2) the searcher recognizes documents relevant to their intended use from among those nominated by the machine. This article presents the results of experiments designed to explore three techniques for supporting interactive relevance assessment: (1) full machine translation, (2) rapid term-by-term translation, and (3) focused phrase translation. Machine translation was found to better support this task than term-by-term translation, and focused phrase translation further improved recall without an adverse effect on precision. The article concludes with an assessment of the strengths and weaknesses of the evaluation framework used in this study and some remarks on implications of these results for future evaluation campaigns.

Author information

Authors and Affiliations

Human-Computer Interaction Laboratory, College of Information Studies and Institute for Advanced Computer Studies, University of Maryland, College Park, MD, 20742, USA
Douglas W. Oard
Departamento de Lenguajes y Sistemas Informáticos, Universidad Nacional de Educación a Distancia, E.T.S.I Industriales, Ciudad Universitaria s/n, 28040, Madrid, Spain
Julio Gonzalo
Department of Information Studies, University of Sheffield, Western Bank, Sheffield, S10 2TN, UK
Mark Sanderson
Departamento de Lenguajes y Sistemas Informáticos, Universidad Nacional de Educación a Distancia, E.T.S.I Industriales, Ciudad Universitaria s/n, 28040, Madrid, Spain
Fernando López-Ostenero
College of Information Studies, University of Maryland, College Park, MD, 20742, USA
Jianqiang Wang

About this article

Cite this article

Oard, D.W., Gonzalo, J., Sanderson, M. et al. Interactive Cross-Language Document Selection. Information Retrieval 7, 205–228 (2004). https://doi.org/10.1023/B:INRT.0000009446.22036.e3

Issue Date: January 2004
DOI: https://doi.org/10.1023/B:INRT.0000009446.22036.e3
