IR and AI: Traditions of Representation and Anti-representation in Information Processing

Wilks, Yorick

doi:10.1007/978-3-540-24752-4_2

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2997)

Included in the following conference series:

European Conference on Information Retrieval

Abstract

The paper discusses the traditional, and ongoing, question as to whether natural language processing (NLP) techniques, or indeed any representational techniques at all, aid in the retrieval of information, as that task is traditionally understood. The discussion is partly a response to Karen Sparck Jones’ (1999) claim that artificial intelligence, and by implication NLP, should learn from the methodology of Information Retrieval (IR), rather than vice versa, as the first sentence above implies. The issue has been made more interesting and complicated by the shift of interest from classic IR experiments with very long queries to Internet search queries, which typically consist of two highly ambiguous terms. This simple fact has changed the assumptions of the debate. Moreover, the return to statistical and empirical methods within NLP has made it less clear what an NLP technique, or even a “representational” method, is. The paper also notes the growth of “language models” within IR and the use of the term “translation” in recent years to describe a range of activities, including IR, which constitutes rather the opposite of what Sparck Jones was calling for.

References

Agichtein, E., Grishman, R., Borthwick, A., Sterling, J.: Description of the named entity system as used in MUC-7. In: Proceedings of the MUC-7 Conference, NYU (1998)
Azzam, S., Humphreys, K., Gaizauskas, R.: Using coreference chains for text summarization. In: Proc. ACL Workshop on Coreference and its Applications, Maryland (1999)
Bely, N., Borillo, A., Virbel, J., Siot-Decauville, N.: Procedures d’analyse semantique appliquees a la documentation scientifique. Gauthier-Villars, Paris (1970)
Berger, A., Lafferty, J.: Information retrieval as statistical translation. In: SIGIR 1999 (1999)
Berger, A., et al.: Bridging the lexical chasm: statistical approaches to question answering. In: SIGIR 2000 (2000)
Bikel, D., Miller, S., Schwartz, R., Weischedel, R.: Nymble: a High-Performance Learning Name-finder. In: Proceedings of the Fifth conference on Applied Natural Language Processing (1997)
Brill, E.: Some Advances in Transformation-Based Part of Speech Tagging. In: Proceedings of the Twelfth National Conference on AI (AAAI 1994), Seattle, Washington (1994)
Brown, P.F., Cocke, J.: A Statistical Approach to Machine Translation, IBM Research Division, T.J. Watson Research Center, RC 14773 (1989)
Carberry, S., Samuel, K., Vijay-Shanker, K.: Dialogue act tagging with transformation-based learning. In: Proceedings of the COLING-ACL 1998 Conference, Montreal, Canada, vol. 2, pp. 1150–1156 (1998)
Cardie, C.: Empirical methods in information extraction. AI Magazine 18(4) (1997); Special Issue on Empirical Natural Language Processing
Chiaramella, Y., Nie, J.: A retrieval model based on an extended modal logic and its application to the RIME experimental approach. In: Proceedings of the 13th ACM International Conference on Research and Development in Information Retrieval (SIGIR), pp. 25–43 (1990)
Collier, R.: Automatic Template Creation for Information Extraction. PhD thesis, University of Sheffield Computer Science Dept., UK (1998)
Cowie, J., Guthrie, L., Jin, W., Ogden, W., Pustejovsky, J., Wang, R., Wakao, T., Waterman, S., Wilks, Y.: CRL/Brandeis: The Diderot System. In: Proceedings of Tipster Text Program (Phase I), Morgan Kaufmann, San Francisco (1993)
Cunningham, H.: JAPE – a Java Annotation Patterns Engine. Technical Report, Department of Computer Science, University of Sheffield (1999)
Daelemans, W., Zavrel, J., van der Sloot, K., van den Bosch, A.: TiMBL: Tilburg memory based learner version 1.0. Technical report, ILK Technical Report 98-03 (1998)
Gaizauskas, R., Wilks, Y.: Information Extraction: beyond document retrieval. Journal of Documentation (1997)
Gardin, J.: Syntol. Rutgers Graduate School of Library Science, New Brunswick (1965)
Gollins, T., Sanderson, M.: Improving Cross Language Information Retrieval with triangulated translation. In: SIGIR 2001 (2001)
Granger, R.: FOULUP: a program that figures out meanings of words from context. In: Proc. Fifth Joint Internat. Conf. on AI (1977)
Green, B., Wolf, A., Chomsky, C., Laughery, K.: BASEBALL, an automatic question answerer. In: Proc. Western Joint Computer Conference 19, pp. 219–224 (1961)
Grefenstette, G., Hearst, M.A.: Method for Refining Automatically-Discovered Lexical Relations: Combining Weak Techniques for Stronger Results. In: Weir (ed.) Statistically-based natural language programming techniques, Proc. AAAI Workshop, AAAI Press, Menlo Park (1992)
Grishman, R.: Information extraction: Techniques and challenges. In: Pazienza, M.T. (ed.) SCIE 1997. LNCS (LNAI), vol. 1299, Springer, Heidelberg (1997)
Grishman, R., Sterling, J.: Generalizing automatically generated patterns. In: Proceedings of COLING 1992 (1992)
Gross, M.: On the equivalence of models of language used in the fields of mechanical translation and information retrieval. Information Storage and Retrieval 2(1) (1964)
Hobbs, J.R.: The generic information extraction system. In: Proceedings of the Fifth Message Understanding Conference (MUC-5), pp. 87–91. Morgan Kaufmann, San Francisco (1993)
Hutchins, W.J.: Linguistic processes in the indexing and retrieval of documents. Linguistics 61 (1970)
Jeffrey, K.: What’s next in databases? ERCIM News 39 (1999), www.ercim.org
Krovetz, R.: More than one sense per discourse. NEC Princeton NJ Labs., Research Memorandum (1998)
Lehnert, W.: A Conceptual Theory of Question Answering. In: Proc. Fifth IJCAI, pp. 158–164. Kaufmann, Cambridge (1977)
Lehnert, W., Cardie, C., Fisher, D., McCarthy, J., Riloff, E.: University of Massachusetts: Description of the CIRCUS system as used for MUC-4. In: Proceedings of the Fourth Message Understanding Conference MUC-4, pp. 282–288. Morgan Kaufmann, San Francisco (1992)
Lenat, D., Prakash, M., Shepherd, M.: CYC: Using common sense knowledge to overcome brittleness and knowledge acquisition bottlenecks. The AI Magazine 6(4) (1986)
Luhn, H.P.: A statistical approach to mechanized encoding and searching of literary information. IBM Journal of Research and Development 1, 309–317 (1957)
Mauldin, M.: Retrieval performance in FERRET: a conceptual information retrieval system. In: SIGIR 1991 (1991)
Miller, G.A. (ed.): WordNet: An on-line Lexical Database. International Journal of Lexicography 3(4) (1990)
Morgan, R., Garigliano, R., Callaghan, P., Poria, S., Smith, M., Urbanowicz, A., Collingham, R., Costantino, M., Cooper, C.: Description of the LOLITA System as used for MUC-6. In: Proceedings of the Sixth Message Understanding Conference (MUC-6), pp. 71–86. Morgan Kaufmann, San Francisco (1995)
Muggleton, S.: Recent advances in inductive logic programming. In: Proc. 7th Ann. ACM Workshop on Comput. Learning Theory, pp. 3–11. ACM Press, New York (1994)
Muggleton, S., Cussens, J., Page, D., Srinivasan, A.: Using inductive logic programming for natural language processing. In: van Someren, M., Widmer, G. (eds.) ECML 1997. LNCS, vol. 1224, pp. 25–34. Springer, Heidelberg (1997); Workshop Notes on Empirical Learning of Natural Language Tasks
Pearl, J.: Bayesian Networks: A Model of Self-Activated Memory for Evidential Reasoning. In: Proceedings of the Cognitive Science Society (CSS-7) (1985)
Pietrosanti, E., Graziadio, B.: Extracting Information for Business Needs. Unicom Seminar on Information Extraction, London (March 1997)
Riloff, E., Lehnert, W.: Automated dictionary construction for information extraction from text. In: Proceedings of Ninth IEEE Conference on Artificial Intelligence for Applications, pp. 93–99 (1993)
Riloff, E., Shoen, J.: Automatically acquiring conceptual patterns without an annotated corpus. In: Proceedings of the Third Workshop on Very Large Corpora (1995)
Roche, E., Schabes, Y.: Deterministic Part-of-Speech Tagging with Finite-State Transducers. Computational Linguistics 21(2), 227–254 (1995)
Salton, G.: A new comparison between conventional indexing (MEDLARS) and automatic text processing (SMART). Journal of the American Society of Information Science 23(2) (1972)
Schvaneveldt, R. (ed.): Pathfinder Networks: Theory and Applications. Ablex, Norwood (1990)
Smeaton, A., van Rijsbergen, C.: Experiments in incorporating syntactic processing of user queries into a document retrieval strategy. In: Proc. 11th. ACM SIGIR (1988)
Sparck Jones, K.: Synonymy and Semantic Classification. Edinburgh University Press, Edinburgh (1966/1986)
Sparck Jones, K.: What is the role of NLP in text retrieval. In: Strzalkowski (ed.) Natural language Information Retrieval, Kluwer, New York (1999a)
Sparck Jones, K.: Information Retrieval and Artificial Intelligence. Artificial Intelligence Journal 114 (1999b)
Stevenson, M., Wilks, Y.: Combining Weak Knowledge Sources for Sense Disambiguation. In: Proceedings of the International Joint Conference for Artificial Intelligence (IJCAI 1999) (1999)
Strzalkowski, T., Vauthey, B.: Natural Language Processing in Automated Information Retrieval, PROTEUS Project Memorandum. Department of Computer Science, New York University (1991)
Vilain, M.: Validation of terminological inference in an information extraction task. In: Proceedings of the 1993 ARPA Human Language Workshop (1993)
Wilks, Y.: Text Searching with Templates. Cambridge Language Research Unit Memo, ML.156 (1964)
Wilks, Y.: The application of CLRU’s method of semantic analysis to information retrieval. Cambridge Language Research Unit Memo, ML.173 (1965)
Wilks, Y.: Frames, semantics and novelty. In: Metzing (ed.) Frame Conceptions and Text Understanding, de Gruyter, Berlin (1979)
Wilks, Y.: Senses and Texts. In: Ide, N. (ed.) Special issue of Computers and the Humanities (1998)
Wilks, Y., Catizone, R.: Making information extraction more adaptive. In: Pazienza, M.-T. (ed.) Proc. Information Extraction Workshop, Frascati (1999)
Winograd, T.: Understanding Natural language (1971)
Winograd, T., Flores, A.: Understanding Computers and Cognition: A New Foundation for Design. Ablex, Norwood (1986)
Yarowsky, D.: Word-sense disambiguation using statistical models of Roget’s categories trained on large corpora. In: Proc. COLING 1992, Nantes, France (1992)
Yarowsky, D.: Unsupervised word-sense disambiguation rivalling supervised methods. In: Proc. ACL 1995 (1995)

Author information

Authors and Affiliations

Department of Computer Science, University of Sheffield, Regent Court, 211 Portobello Street, Sheffield, S1 4DP
Yorick Wilks

Editor information

Editors and Affiliations

School of Computing and Technology, David Goldman Informatics Centre, University of Sunderland, St. Peter’s Campus, SR6 0DD, Sunderland, UK
Sharon McDonald
School of Computing and Technology, University of Sunderland, St. Peter’s Campus, St. Peter’s Way, SR6 0DD, Sunderland, United Kingdom
John Tait

About this paper

Cite this paper

Wilks, Y. (2004). IR and AI: Traditions of Representation and Anti-representation in Information Processing. In: McDonald, S., Tait, J. (eds) Advances in Information Retrieval. ECIR 2004. Lecture Notes in Computer Science, vol 2997. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24752-4_2

DOI: https://doi.org/10.1007/978-3-540-24752-4_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-21382-6
Online ISBN: 978-3-540-24752-4
eBook Packages: Springer Book Archive
