Abstract
Data retrieval (DR) and information retrieval (IR) have traditionally occupied two distinct niches in the world of information systems. DR systems store and query structured data effectively, but lack the flexibility of IR, i.e., the ability to retrieve results that only partially match a given query. IR, on the other hand, is well suited to retrieving partial matches, but lacks the complete query specification over semantically unambiguous data that DR systems offer. To address these drawbacks, we propose an approach that combines the two systems, using pre-defined word similarities to determine the correlation between a keyword query (commonly used in IR) and data records stored in the inner framework of a standard RDBMS. Our integrated approach is flexible, context-free, and can be used on a wide variety of RDBs. Experimental results show that RDBMSs using our word-similarity matching approach achieve high mean average precision in retrieving relevant answers, in addition to exact matches, to a keyword query, which is a significant enhancement of query processing in RDBMSs.
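To make the general idea concrete, the following is a minimal sketch of ranking relational records against a keyword query with a pre-defined word-similarity table. It is not the authors' algorithm: the `ads` table, the `WORD_SIM` values, and the scoring rule (sum of each keyword's best similarity to any word in the record) are all assumptions chosen purely for illustration.

```python
import sqlite3

# Hypothetical pre-defined word similarities in [0, 1]; the values are invented.
WORD_SIM = {
    ("car", "automobile"): 0.9,
    ("car", "truck"): 0.6,
    ("cheap", "inexpensive"): 0.9,
}


def word_sim(a, b):
    """Symmetric lookup: identical words score 1.0, unknown pairs 0.0."""
    if a == b:
        return 1.0
    return WORD_SIM.get((a, b), WORD_SIM.get((b, a), 0.0))


def record_score(keywords, record_words):
    """Assumed scoring rule: sum each keyword's best similarity to the record."""
    return sum(
        max((word_sim(k, w) for w in record_words), default=0.0)
        for k in keywords
    )


def ranked_answers(conn, query):
    """Return (id, score) pairs for records with a non-zero score, best first."""
    keywords = query.lower().split()
    scored = []
    for row_id, description in conn.execute("SELECT id, description FROM ads"):
        score = record_score(keywords, description.lower().split())
        if score > 0:
            scored.append((score, row_id))
    # Highest score first; ties broken by ascending id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(row_id, round(score, 2)) for score, row_id in scored]


# Illustrative in-memory table standing in for a real relational database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ads (id INTEGER PRIMARY KEY, description TEXT)")
conn.executemany(
    "INSERT INTO ads VALUES (?, ?)",
    [
        (1, "cheap car"),
        (2, "inexpensive automobile"),
        (3, "red truck"),
        (4, "wooden table"),
    ],
)

results = ranked_answers(conn, "cheap car")
print(results)
```

In this toy setting the exact match (record 1) ranks first, while records 2 and 3 are still returned as partial matches because their words are similar to the query keywords. Record 4 shares no similar word and is omitted.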
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gustafson, N., Ng, YK. (2008). Augmenting Data Retrieval with Information Retrieval Techniques by Using Word Similarity. In: Kapetanios, E., Sugumaran, V., Spiliopoulou, M. (eds) Natural Language and Information Systems. NLDB 2008. Lecture Notes in Computer Science, vol 5039. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69858-6_17
DOI: https://doi.org/10.1007/978-3-540-69858-6_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-69857-9
Online ISBN: 978-3-540-69858-6
eBook Packages: Computer Science, Computer Science (R0)