Querying E-Catalogs Using Content Summaries
With the rapid development of e-services on the Web, an increasing number of e-catalogs are becoming accessible to users. Many of these e-catalogs provide information about similar types of products or services. To simplify users' search effort, data integration systems have been developed to integrate e-catalogs that provide similar types of information, so that users can query those e-catalogs through a mediator with a uniform query interface. The conventional approach to answering a query received by a mediator is to select e-catalogs purely on the basis of their query capabilities, i.e., their query interface specifications. However, an e-catalog that is capable of answering a query does not necessarily hold relevant answers to it. To avoid wasting resources on querying catalogs that return no answers, in this paper we propose to use a catalog content summary as a filter, selecting the e-catalogs to answer a given query based not only on their query capabilities but also on the relevance of their content to the query. A multi-attribute content (MAC) summary is proposed to describe an e-catalog with respect to its content. With the MAC summary, an e-catalog is selected to answer a query only if it is likely to have answers to the query. The MAC summary can be constructed and updated using the answers returned from e-catalogs, so the e-catalogs need not be cooperative. We evaluated the MAC summary on 50 e-catalogs, and the experimental results were promising.
Keywords: Query Processing; Range Query; Content Summary; User Query; Query Plan
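The two-stage selection described in the abstract (capability check first, then content-summary filter) can be illustrated with a minimal sketch. The abstract does not specify how a MAC summary is represented, so everything below is an assumption made for illustration: the `Catalog` class, the per-attribute min/max ranges and value sets, and the function names are hypothetical and are not the authors' design.

```python
from dataclasses import dataclass, field


@dataclass
class Catalog:
    """Hypothetical mediator-side description of one e-catalog."""
    name: str
    queryable: set                                # attributes its query interface accepts
    ranges: dict = field(default_factory=dict)    # numeric attribute -> (min, max) seen so far
    values: dict = field(default_factory=dict)    # other attribute -> set of values seen so far

    def update(self, answers):
        """Fold returned answers (a list of dicts) into the summary.

        Only answers are used, so the catalog itself need not cooperate.
        """
        for row in answers:
            for attr, val in row.items():
                if isinstance(val, (int, float)):
                    lo, hi = self.ranges.get(attr, (val, val))
                    self.ranges[attr] = (min(lo, val), max(hi, val))
                else:
                    self.values.setdefault(attr, set()).add(val)


def capable(cat, query):
    """Capability check: every queried attribute must be accepted by the interface."""
    return set(query) <= cat.queryable


def likely_relevant(cat, query):
    """Content check (assumed heuristic): reject only on evidence of a mismatch.

    A tuple condition is read as a closed range (lo, hi); anything else is an
    equality condition. Attributes the summary knows nothing about never
    cause a rejection. Because the summary is built from past answers, it
    can be incomplete and may wrongly exclude a catalog.
    """
    for attr, cond in query.items():
        if isinstance(cond, tuple):
            if attr in cat.ranges:
                lo, hi = cat.ranges[attr]
                if cond[1] < lo or cond[0] > hi:
                    return False
        elif attr in cat.values and cond not in cat.values[attr]:
            return False
    return True


def select(catalogs, query):
    """Return names of catalogs that are both capable and likely relevant."""
    return [c.name for c in catalogs if capable(c, query) and likely_relevant(c, query)]
```

In this sketch a catalog is skipped either because its interface cannot express the query or because its summary gives evidence that the query would return nothing. Treating unknown attributes as possibly relevant is a conservative choice made here for illustration, not something stated in the abstract.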