BOOSTING ITEM FINDABILITY: BRIDGING THE SEMANTIC GAP BETWEEN SEARCH PHRASES AND ITEM INFORMATION

Davulcu, Hasan; Nguyen, Hung V.; Ramachandran, Viswanathan

doi:10.1007/978-1-4020-5347-4_24

Abstract

Most search engines perform text query and retrieval based on keyword phrases. However, publishers cannot anticipate every way in which users will search for the items in their documents. Often there is no direct keyword match between a search phrase and the descriptions of items that are perfect “hits” for the search. We present a highly automated solution to the problem of bridging the semantic gap between item information and search phrases. Our system learns rule-based definitions that can be ascribed to search phrases with dynamic connotations by extracting structured item information from product catalogs and by applying a frequent itemset mining algorithm. We present experimental results for a realistic e-commerce domain. We also compare our rule-mining approach to vector-based relevance feedback retrieval techniques and show that our system yields definitions that are easier to validate and that perform better.
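
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of frequent itemset mining over structured item attributes, not the authors' system. It uses a plain level-wise (Apriori-style) search. The function name `frequent_itemsets`, the attribute strings, the sample items, and the support threshold are all hypothetical and chosen for illustration. Each "transaction" is the attribute set of one item assumed to have been judged relevant to a single search phrase; attribute combinations that recur across those items are candidates for a rule-based definition of the phrase.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for every itemset whose support is at least
    min_support (a fraction of the number of transactions).

    Level-wise Apriori-style search: frequent k-itemsets are built only
    from frequent (k-1)-itemsets.
    """
    min_count = min_support * len(transactions)

    # Level 1: count individual attributes.
    counts = Counter(item for t in transactions for item in t)
    current = {frozenset([i]): c for i, c in counts.items() if c >= min_count}
    result = dict(current)

    k = 2
    while current:
        # Join step: unions of two frequent (k-1)-itemsets that have size k.
        candidates = {a | b for a, b in combinations(list(current), 2)
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current
                             for s in combinations(c, k - 1))}
        # Count step: keep candidates that meet the support threshold.
        current = {}
        for c in candidates:
            count = sum(1 for t in transactions if c <= t)
            if count >= min_count:
                current[c] = count
        result.update(current)
        k += 1
    return result


# Hypothetical attribute sets of items judged relevant to one search phrase.
relevant_items = [
    {"brand=Acme", "material=leather", "heel=high"},
    {"brand=Acme", "material=leather", "heel=low"},
    {"brand=Zeta", "material=leather", "heel=high"},
    {"brand=Acme", "material=suede", "heel=high"},
]

itemsets = frequent_itemsets(relevant_items, min_support=0.5)
for itemset, count in sorted(itemsets.items(), key=lambda kv: -len(kv[0])):
    print(sorted(itemset), count)
```

In this sketch, each frequent attribute combination could be read as a candidate conjunctive rule for the phrase (for example, "brand is Acme and material is leather"), which a person can inspect directly. That readability is the property the abstract contrasts with vector-based relevance feedback.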

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Arizona State University, Tempe, AZ 85287, USA
Hasan Davulcu, Hung V. Nguyen & Viswanathan Ramachandran

Editor information

Editors and Affiliations

Florida International University, Miami, FL, U.S.A.
Chin-Sheng Chen
INSTICC/ EST, Setúbal, Portugal
Joaquim Filipe
Universidade Portucalense, Porto, Portugal
Isabel Seruca
INSTICC/ EST, Setúbal, Portugal
José Cordeiro

Cite this paper

Davulcu, H., Nguyen, H.V., Ramachandran, V. (2007). BOOSTING ITEM FINDABILITY: BRIDGING THE SEMANTIC GAP BETWEEN SEARCH PHRASES AND ITEM INFORMATION. In: Chen, CS., Filipe, J., Seruca, I., Cordeiro, J. (eds) Enterprise Information Systems VII. Springer, Dordrecht. https://doi.org/10.1007/978-1-4020-5347-4_24

DOI: https://doi.org/10.1007/978-1-4020-5347-4_24
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-5323-8
Online ISBN: 978-1-4020-5347-4
eBook Packages: Computer Science, Computer Science (R0)